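Source: Koog Documentation, handling-failures. This post is based on a Korean translation of the handling-failures page of the official Koog documentation. It keeps the structure of the document and the meaning of its links, with MkDocs-specific UI syntax tidied up so that it reads well on a blog.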

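Handling failures

This page describes how to handle failures in LLM clients and prompt executors using the built-in retry and timeout mechanisms.

Retry functionality

When working with LLM providers, you may run into transient errors such as rate limits or temporary service unavailability. The RetryingLLMClient decorator adds automatic retry logic to an LLM client in both Kotlin and Java.

Basic usage

Wrap an existing client to give it retry capability: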

Kotlin

    // Wrap any client with the retry capability
    val client = OpenAILLMClient(apiKey)
    val resilientClient = RetryingLLMClient(client)

    // Now all operations will automatically retry on transient errors
    val response = resilientClient.execute(prompt, OpenAIModels.Chat.GPT4o)

Java

    OpenAILLMClient client = new OpenAILLMClient(apiKey);
    RetryingLLMClient resilientClient = new RetryingLLMClient(client);

    // Now all operations will automatically retry on transient errors
    List<Message.Response> response = resilientClient.execute(prompt, OpenAIModels.Chat.GPT4o);

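Configuring retry behavior

By default, RetryingLLMClient is configured with a maximum of 3 retry attempts, an initial delay of 1 second, and a maximum delay of 30 seconds. You can specify a different retry configuration by passing a RetryConfig to RetryingLLMClient. For example: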

Kotlin

    // Use the predefined configuration
    val conservativeClient = RetryingLLMClient(
        delegate = client,
        config = RetryConfig.CONSERVATIVE
    )

Java

    OpenAILLMClient client = new OpenAILLMClient(apiKey);
    // Use the predefined configuration
    RetryingLLMClient conservativeClient = new RetryingLLMClient(
        client,
        RetryConfig.Companion.getCONSERVATIVE()
    );

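Koog provides several predefined retry configurations, available through RetryConfig in Kotlin and RetryConfig.Companion in Java.

Configuration (Kotlin)	Max attempts	Initial delay	Max delay	Use case
`RetryConfig.DISABLED`	1 (no retries)	-	-	Development, testing, and debugging.
`RetryConfig.CONSERVATIVE`	3	2s	30s	Background or scheduled jobs where reliability matters more than speed.
`RetryConfig.AGGRESSIVE`	5	500ms	20s	Critical operations where fast recovery from transient errors matters more than reducing API calls.
`RetryConfig.PRODUCTION`	3	1s	20s	General production use.

You can use these directly or create a custom configuration: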

    import kotlin.time.Duration.Companion.seconds  // needed for 1.seconds and 30.seconds

    // Or create a custom configuration
    val customClient = RetryingLLMClient(
        delegate = client,
        config = RetryConfig(
            maxAttempts = 5,
            initialDelay = 1.seconds,
            maxDelay = 30.seconds,
            backoffMultiplier = 2.0,
            jitterFactor = 0.2
        )
    )
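To make the roles of initialDelay, maxDelay, backoffMultiplier, and jitterFactor more concrete, here is a minimal Java sketch of exponential backoff with jitter. It does not call Koog, and the exact formula Koog uses is not stated on this page. The class BackoffSketch, its method names, and the symmetric jitter scheme are assumptions made for illustration only.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper, not part of Koog: illustrates exponential backoff with jitter. */
final class BackoffSketch {
    private BackoffSketch() {}

    /** Delay before retry number `retry` (1-based), without jitter, capped at maxDelayMillis. */
    static long baseDelayMillis(int retry, long initialDelayMillis, long maxDelayMillis, double multiplier) {
        double raw = initialDelayMillis * Math.pow(multiplier, retry - 1);
        return (long) Math.min(raw, (double) maxDelayMillis);
    }

    /** Assumed jitter scheme: shift the base delay by a random fraction in [-jitterFactor, +jitterFactor). */
    static long withJitter(long baseMillis, double jitterFactor) {
        if (jitterFactor <= 0.0) {
            return baseMillis; // no jitter requested
        }
        double offset = ThreadLocalRandom.current().nextDouble(-jitterFactor, jitterFactor);
        return Math.max(0L, Math.round(baseMillis * (1.0 + offset)));
    }

    public static void main(String[] args) {
        // Values mirror the custom configuration above: 1s initial, 30s max, x2 multiplier, 0.2 jitter.
        for (int retry = 1; retry <= 4; retry++) {
            long base = baseDelayMillis(retry, 1_000, 30_000, 2.0);
            System.out.println("retry " + retry + ": base=" + base + "ms, jittered=" + withJitter(base, 0.2) + "ms");
        }
    }
}
```

Under these assumptions the base delay doubles on each retry (1s, 2s, 4s, 8s) until it reaches the cap, and jitter spreads retries from many clients so that they do not all hit the provider at the same instant.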

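Retryable error patterns

By default, RetryingLLMClient recognizes common transient errors. This behavior is controlled by the patterns in RetryConfig.retryablePatterns. Each pattern is represented as a RetryablePattern, which checks the error message of a failed request and decides whether the request should be retried.

Koog provides predefined retry configurations and patterns that work across all supported LLM providers. You can keep the defaults or customize them to your specific needs.

Pattern types

You can use the following pattern types and combine as many of them as you need:

RetryablePattern.Status: matches a specific HTTP status code in the error message (for example, 429, 500, 502).
RetryablePattern.Keyword: matches a keyword in the error message (for example, rate limit or request timeout).
RetryablePattern.Regex: matches a regular expression against the error message.
RetryablePattern.Custom: matches using custom logic supplied as a lambda function.

If any pattern returns true, the error is considered retryable and the LLM client retries the request.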
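The matching rule described above (an error is retryable if any pattern matches its message) can be sketched in plain Java. This is a stand-in for illustration, not Koog's implementation: the class RetryMatcherSketch and its factory methods are hypothetical, and the case-insensitive keyword match and whole-number status match are assumptions.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Hypothetical stand-in for retryable-pattern matching; these names are not Koog's API. */
final class RetryMatcherSketch {
    private RetryMatcherSketch() {}

    /** Matches an HTTP status code that appears as a standalone number in the message. */
    static Predicate<String> status(int code) {
        // Lookarounds keep 429 from matching inside a longer number such as 14290.
        Pattern p = Pattern.compile("(?<!\\d)" + code + "(?!\\d)");
        return msg -> p.matcher(msg).find();
    }

    /** Matches a keyword anywhere in the message, ignoring case (an assumption). */
    static Predicate<String> keyword(String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        return msg -> msg.toLowerCase(Locale.ROOT).contains(needle);
    }

    /** Matches a regular expression anywhere in the message. */
    static Predicate<String> regex(String pattern) {
        Pattern p = Pattern.compile(pattern);
        return msg -> p.matcher(msg).find();
    }

    /** An error is retryable if at least one pattern matches its message. */
    static boolean isRetryable(String errorMessage, List<Predicate<String>> patterns) {
        return patterns.stream().anyMatch(p -> p.test(errorMessage));
    }

    public static void main(String[] args) {
        List<Predicate<String>> patterns = List.of(
            status(429),
            keyword("rate limit"),
            regex("ERR_\\d+"),
            msg -> msg.contains("temporary") && msg.length() > 20 // custom logic
        );
        System.out.println(isRetryable("HTTP 429 Too Many Requests", patterns));
        System.out.println(isRetryable("invalid api key", patterns));
    }
}
```

Because a custom pattern is just a predicate over the error message, the four pattern types combine naturally in a single list.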

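Default patterns

Unless you customize the retry configuration, the following patterns are used by default.

HTTP status codes:
  429: Rate limit
  500: Internal server error
  502: Bad gateway
  503: Service unavailable
  504: Gateway timeout
  529: Anthropic overloaded
Error keywords:
  rate limit
  too many requests
  request timeout
  connection timeout
  read timeout
  write timeout
  connection reset by peer
  connection refused
  temporarily unavailable
  service unavailable

These default patterns are defined in Koog as RetryConfig.DEFAULT_PATTERNS.

Custom patterns

You can define custom patterns for your specific needs: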

    val config = RetryConfig(
        retryablePatterns = listOf(
            RetryablePattern.Status(429),   // Specific status code
            RetryablePattern.Keyword("quota"),  // Keyword in error message
            RetryablePattern.Regex(Regex("ERR_\\d+")),  // Custom regex pattern
            RetryablePattern.Custom { error ->  // Custom logic
                error.contains("temporary") && error.length > 20
            }
        )
    )

You can also append custom patterns to the default RetryConfig.DEFAULT_PATTERNS:

    val config = RetryConfig(
        retryablePatterns = RetryConfig.DEFAULT_PATTERNS + listOf(
            RetryablePattern.Keyword("custom_error")
        )
    )

Streaming with retry

Streaming operations can optionally be retried. This feature is disabled by default.

    val config = RetryConfig(
        maxAttempts = 3
    )

    val client = RetryingLLMClient(baseClient, config)
    val stream = client.executeStreaming(prompt, OpenAIModels.Chat.GPT4o)

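Note: Streaming retries apply only to connection failures that occur before the first token is received. Once streaming has started, the retry logic is disabled. If an error occurs during streaming, the operation is terminated.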
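The rule in the note can be sketched in plain Java: retry while nothing has been received, then hand the stream over and let later errors propagate. This is a simplified model using Iterator rather than Koog's streaming types. StreamRetrySketch and openWithRetry are hypothetical names, and the sketch leaves out backoff delays to stay short.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical model of "retry only before the first token"; not Koog's implementation. */
final class StreamRetrySketch {
    private StreamRetrySketch() {}

    /** Opens a stream via `open`, retrying only failures that happen before the first element arrives. */
    static <T> Iterator<T> openWithRetry(Supplier<Iterator<T>> open, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Iterator<T> source = open.get();
                if (!source.hasNext()) {
                    return source; // the stream opened and ended without any element
                }
                T first = source.next(); // a failure here still counts as "before the first token"
                return prepend(first, source);
            } catch (RuntimeException e) {
                lastFailure = e; // nothing was delivered yet, so trying again is safe
            }
        }
        throw lastFailure;
    }

    /** Re-attaches the already-read first element; later failures reach the caller unretried. */
    private static <T> Iterator<T> prepend(T first, Iterator<T> rest) {
        return new Iterator<T>() {
            private boolean firstPending = true;

            @Override public boolean hasNext() {
                return firstPending || rest.hasNext();
            }

            @Override public T next() {
                if (firstPending) {
                    firstPending = false;
                    return first;
                }
                return rest.next();
            }
        };
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Iterator<String> tokens = openWithRetry(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("connection reset by peer");
            return List.of("Hello", " world").iterator();
        }, 3);
        tokens.forEachRemaining(System.out::print);
        System.out.println();
    }
}
```

The reason for stopping retries after the first element is that the consumer may already have acted on partial output, so restarting the stream from the beginning could duplicate tokens.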

Retry with prompt executors

When working with prompt executors, you can wrap the underlying LLM client with the retry mechanism before creating the executor, in both Kotlin and Java. To learn more about prompt executors, see Prompt executors.

Kotlin

    // Single provider executor with retry
    val resilientClient = RetryingLLMClient(
        OpenAILLMClient(System.getenv("OPENAI_API_KEY")),
        RetryConfig.PRODUCTION
    )
    val executor = MultiLLMPromptExecutor(resilientClient)

    // Multi-provider executor with flexible client configuration
    val multiExecutor = MultiLLMPromptExecutor(
        LLMProvider.OpenAI to RetryingLLMClient(
            OpenAILLMClient(System.getenv("OPENAI_API_KEY")),
            RetryConfig.CONSERVATIVE
        ),
        LLMProvider.Anthropic to RetryingLLMClient(
            AnthropicLLMClient(System.getenv("ANTHROPIC_API_KEY")),
            RetryConfig.AGGRESSIVE
        ),
        // The Bedrock client already has a built-in AWS SDK retry
        LLMProvider.Bedrock to BedrockLLMClient(
            identityProvider = StaticCredentialsProvider {
                accessKeyId = System.getenv("AWS_ACCESS_KEY_ID")
                secretAccessKey = System.getenv("AWS_SECRET_ACCESS_KEY")
                sessionToken = System.getenv("AWS_SESSION_TOKEN")
            },
        ),
    )

Java

    // Single provider executor with retry (Java)
    RetryingLLMClient resilientClient = new RetryingLLMClient(
        new OpenAILLMClient(System.getenv("OPENAI_API_KEY")),
        RetryConfig.Companion.getPRODUCTION()
    );

    MultiLLMPromptExecutor executor = new MultiLLMPromptExecutor(resilientClient);

    // Multi-provider executor with flexible client configuration (Java)
    LLMClient openai = new RetryingLLMClient(
        new OpenAILLMClient(System.getenv("OPENAI_API_KEY")),
        RetryConfig.Companion.getCONSERVATIVE()
    );

    LLMClient anthropic = new RetryingLLMClient(
        new AnthropicLLMClient(System.getenv("ANTHROPIC_API_KEY")),
        RetryConfig.Companion.getAGGRESSIVE()
    );

    Map<LLMProvider, LLMClient> clients = Map.of(
        LLMProvider.OpenAI, openai,
        LLMProvider.Anthropic, anthropic
    );

    MultiLLMPromptExecutor multiExecutor = new MultiLLMPromptExecutor(clients);

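Timeout configuration

All LLM clients support timeout configuration in both Kotlin and Java to prevent requests from hanging. You can specify timeout values for network connections when creating a client by using the ConnectionTimeoutConfig class.

ConnectionTimeoutConfig has the following properties:

Property	Default value	Description
`connectTimeoutMillis`	60 seconds (60,000)	Maximum time to establish a connection to the server.
`requestTimeoutMillis`	15 minutes (900,000)	Maximum time for the entire request to complete.
`socketTimeoutMillis`	15 minutes (900,000)	Maximum time to wait for data over an established connection.

You can customize these values for your specific needs. For example: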

Kotlin

    val client = OpenAILLMClient(
        apiKey = apiKey,
        settings = OpenAIClientSettings(
            timeoutConfig = ConnectionTimeoutConfig(
                connectTimeoutMillis = 5000,    // 5 seconds to establish connection
                requestTimeoutMillis = 60000,   // 60 seconds for the entire request
                socketTimeoutMillis = 120000    // 120 seconds for data on the socket
            )
        )
    )

Java

    String apiKey = System.getenv("OPENAI_API_KEY");
    ConnectionTimeoutConfig timeouts = new ConnectionTimeoutConfig(
        5000L,   // connectTimeoutMillis
        60000L,  // requestTimeoutMillis
        120000L  // socketTimeoutMillis
    );
    OpenAIClientSettings settings = new OpenAIClientSettings(
        "https://api.openai.com", // baseUrl
        timeouts,
        "v1/chat/completions",    // chatCompletionsPath
        "v1/responses",           // responsesAPIPath
        "v1/embeddings",          // embeddingsPath
        "v1/moderations",         // moderationsPath
        "v1/models"               // modelsPath
    );
    OpenAILLMClient client = new OpenAILLMClient(apiKey, settings);

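Note: For long-running or streaming calls, set higher values for requestTimeoutMillis and socketTimeoutMillis.

Error handling

When working with LLMs in production, you need to implement error handling, including:

Try-catch blocks to handle unexpected errors.
Error logging with context for debugging.
Fallbacks for critical operations.
Monitoring of retry patterns to identify recurring issues.

The following are error handling examples in Kotlin and Java: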

Kotlin

    val logger = LoggerFactory.getLogger("Example")
    val resilientClient = RetryingLLMClient(
        OpenAILLMClient(System.getenv("OPENAI_API_KEY")),
        RetryConfig.PRODUCTION
    )
    val prompt = prompt("test") { user("Hello") }
    val model = OpenAIModels.Chat.GPT4o

    fun processResponse(response: Any) { /* implementation */ }
    fun scheduleRetryLater() { /* implementation */ }
    fun notifyAdministrator() { /* implementation */ }
    fun useDefaultResponse() { /* implementation */ }

    try {
        val response = resilientClient.execute(prompt, model)
        processResponse(response)
    } catch (e: Exception) {
        logger.error("LLM operation failed", e)

        when {
            e.message?.contains("rate limit") == true -> {
                // Handle rate limiting specifically
                scheduleRetryLater()
            }
            e.message?.contains("invalid api key") == true -> {
                // Handle authentication errors
                notifyAdministrator()
            }
            else -> {
                // Fall back to an alternative solution
                useDefaultResponse()
            }
        }
    }

Java

    Logger logger = LoggerFactory.getLogger("Example");
    RetryingLLMClient resilientClient = new RetryingLLMClient(
            new OpenAILLMClient(System.getenv("OPENAI_API_KEY")),
            RetryConfig.Companion.getPRODUCTION()  // Java access via Companion, as in the earlier examples
    );
    Prompt prompt = Prompt.builder("test")
            .user("Hello")
            .build();
    MultiLLMPromptExecutor promptExecutor = new MultiLLMPromptExecutor(resilientClient);

    Consumer<List<Message.Response>> processResponse = (resp) -> { /* implementation */ };
    Runnable scheduleRetryLater = () -> { /* implementation */ };
    Runnable notifyAdministrator = () -> { /* implementation */ };
    Runnable useDefaultResponse = () -> { /* implementation */ };

    try {
        List<Message.Response> response = promptExecutor.execute(prompt, OpenAIModels.Chat.GPT4o);
        processResponse.accept(response);
    } catch (Exception e) {
        logger.error("LLM operation failed", e);
        // Lowercase the message so the checks below are case-insensitive
        String msg = e.getMessage() == null ? "" : e.getMessage().toLowerCase();
        if (msg.contains("rate limit")) {
            scheduleRetryLater.run();
        } else if (msg.contains("invalid api key")) {
            notifyAdministrator.run();
        } else {
            useDefaultResponse.run();
        }
    }
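The monitoring item in the list above has no example of its own, so here is one possible sketch in plain Java: count failures per category so that recurring problems become visible. FailureStatsSketch and its category names are hypothetical and not part of Koog. The message checks reuse the strings from the examples above, plus an assumed "timeout" category.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical failure counter for spotting recurring issues; not part of Koog. */
final class FailureStatsSketch {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Maps an error to a coarse category based on its message (categories are assumptions). */
    static String categorize(Throwable error) {
        String msg = error.getMessage() == null ? "" : error.getMessage().toLowerCase(Locale.ROOT);
        if (msg.contains("rate limit")) return "rate_limit";
        if (msg.contains("invalid api key")) return "auth";
        if (msg.contains("timeout")) return "timeout";
        return "other";
    }

    /** Records one failure; safe to call from several threads at once. */
    void record(Throwable error) {
        counts.computeIfAbsent(categorize(error), key -> new LongAdder()).increment();
    }

    /** Number of failures recorded so far for the given category. */
    long count(String category) {
        LongAdder adder = counts.get(category);
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        FailureStatsSketch stats = new FailureStatsSketch();
        stats.record(new RuntimeException("Rate limit exceeded"));
        stats.record(new RuntimeException("request timeout"));
        stats.record(new RuntimeException("Rate limit exceeded"));
        System.out.println("rate_limit failures: " + stats.count("rate_limit"));
    }
}
```

Calling record from the catch block of the examples above would give a running tally per category. In a real service these counts would more likely feed a metrics system than an in-memory map.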