Add HTTP retry logic and cancellation support for OpenAI services #58

lemillermicrosoft · 2023-03-10T22:35:09Z

Motivation and Context

This set of code changes is required to improve the performance and usability of the OpenAI completion backend by refactoring the kernel retry mechanism and adding a new HTTP retry delegating handler. Specifically, it addresses the following problems:

1. The existing retry mechanism for kernel requests was using a custom interface that made it difficult to customize and maintain the retry logic. By refactoring it to use a delegating handler factory, we can simplify the configuration of semantic skills and improve the consistency and flexibility of the retry logic.
2. The existing retry handler did not take advantage of the Retry-After value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests. By adding a new handler that does this, we can reduce the number of unnecessary retries and improve the response times for the requests.
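
To illustrate the second problem above, here is a minimal sketch of turning a Retry-After value into a backoff duration. It is written in Python for brevity and is not the PR's C# code; the function name `retry_after_delay` is hypothetical. It assumes the header carries either a number of seconds or an HTTP date, which are the two forms the HTTP specification allows.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_delay(header_value: Optional[str], now: datetime) -> Optional[timedelta]:
    """Return the wait requested by a Retry-After header, or None if it is absent or unparseable.

    `now` must be a timezone-aware datetime; it is passed in so the function is easy to test.
    """
    if not header_value:
        return None
    value = header_value.strip()
    # Form 1: a non-negative integer number of seconds.
    if value.isdecimal():
        return timedelta(seconds=int(value))
    # Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(when - now, timedelta(0))
```

A caller would typically fall back to its own backoff schedule when the function returns `None`.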

Description

Kernel

Added a new abstraction for creating HTTP retry handlers, the IDelegatingHandlerFactory interface, which allows retry handlers to be plugged into the built-in HttpClient handler pipeline and custom retry policies to be injected.
Added a new class DefaultHttpRetryHandlerFactory that implements a configurable retry policy based on the KernelConfig.HttpRetryConfig settings. This factory is used by default when creating a kernel, unless a different factory is specified.
Added a new class NullHttpRetryHandlerFactory that creates a no-op retry handler, useful for testing or disabling retries.
Added support for cancellation tokens and retry handlers in the OpenAI services classes, such as AzureTextCompletion, AzureTextEmbeddings, and OpenAITextCompletion. This allows the callers to cancel the requests and handle transient failures more gracefully.
Added unit tests for the DefaultHttpRetryHandler class, which is responsible for handling HTTP retries. The tests cover various scenarios such as retrying on different status codes, retrying on different exception types, and respecting the retry configuration parameters.
Removed the unused IRetryMechanism and PassThroughWithoutRetry classes, and the Example08_RetryMechanism.cs file.
Refactored the KernelConfig class to expose the HttpRetryConfig and the HttpHandlerFactory properties.
Refactored the Kernel class to use the HttpHandlerFactory instead of the RetryMechanism for invoking the pipeline functions.
Updated the KernelBuilder sample to show how to use a custom retry handler factory.
Updated the documentation and the code style accordingly.
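
The factory idea in the list above can be sketched as follows. This is a language-neutral illustration in Python, not the PR's C# types: `HandlerFactory`, `NullRetryHandlerFactory`, `SimpleRetryHandlerFactory`, and `make_flaky_transport` are hypothetical names, and the "transport" is just a function that returns a status code. The point is that the caller picks a factory and the client code wraps its transport the same way either way.

```python
from typing import Callable, FrozenSet, List, Protocol, Tuple

# A "send" function takes a URL and returns an HTTP status code (a stand-in for a real client).
Send = Callable[[str], int]


class HandlerFactory(Protocol):
    """Anything that can wrap an inner send function with extra behavior."""

    def create(self, inner: Send) -> Send:
        ...


class NullRetryHandlerFactory:
    """No-op factory: returns the inner transport unchanged, so nothing is retried."""

    def create(self, inner: Send) -> Send:
        return inner


class SimpleRetryHandlerFactory:
    """Wraps the transport so that selected status codes are retried a fixed number of times."""

    def __init__(self, max_retries: int = 3,
                 retry_statuses: FrozenSet[int] = frozenset({429, 503})) -> None:
        self.max_retries = max_retries
        self.retry_statuses = retry_statuses

    def create(self, inner: Send) -> Send:
        def send(url: str) -> int:
            status = inner(url)
            retries = 0
            while status in self.retry_statuses and retries < self.max_retries:
                retries += 1
                status = inner(url)
            return status

        return send


def make_flaky_transport(statuses: List[int]) -> Tuple[Send, List[str]]:
    """Test helper: returns the given statuses in order, repeating the last one."""
    remaining = list(statuses)
    calls: List[str] = []

    def transport(url: str) -> int:
        calls.append(url)
        return remaining.pop(0) if len(remaining) > 1 else remaining[0]

    return transport, calls
```

Swapping `SimpleRetryHandlerFactory` for `NullRetryHandlerFactory` disables retries without touching the calling code, which is the property the description attributes to the no-op factory.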

Kernel-Syntax-Examples

This pull request adds two new classes to the Reliability namespace that implement different retry policies for HTTP requests. The first class, RetryThreeTimesWithBackoff, retries a request three times with an exponential backoff if it encounters a 429 (Too Many Requests) or 401 (Unauthorized) status code. The second class, RetryThreeTimesWithRetryAfterBackoff, also retries three times, but uses the value of the Retry-After header to determine the backoff duration. Both classes use Polly to implement the retry logic and log the outcome of each attempt using the ILogger interface.
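
As a hedged sketch of the first policy described above (Python, not the Polly-based C# sample): retry up to three times on 429 or 401, waiting longer after each failure. The function name `send_with_backoff` and the exact schedule (base delay raised to the attempt number) are assumptions for illustration; the sample's actual delays are not stated in this description. The sleep function is injected so the behavior can be checked without waiting.

```python
from typing import Callable, Iterable


def send_with_backoff(send: Callable[[], int],
                      sleep: Callable[[float], None],
                      max_retries: int = 3,
                      base_delay: float = 2.0,
                      retry_statuses: Iterable[int] = (429, 401)) -> int:
    """Call `send`, retrying up to `max_retries` times with exponentially growing delays."""
    retryable = set(retry_statuses)
    status = send()
    for attempt in range(1, max_retries + 1):
        if status not in retryable:
            break
        # Assumed schedule: 2s, 4s, 8s for attempts 1, 2, 3 with the default base.
        sleep(base_delay ** attempt)
        status = send()
    return status
```

The Retry-After variant would differ only in where the delay comes from: the response header instead of the computed schedule.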

The pull request also modifies the ConsoleLogger class to filter out log messages from the System namespace with a level lower than Warning. This is to reduce the noise from the HttpClient and DelegatingHandler classes.
This pull request introduces several improvements and features related to the HTTP retry logic and the OpenAI services. The main changes are:

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows SK Contribution Guidelines (https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
The code follows the .NET coding conventions (https://learn.microsoft.com/dotnet/csharp/fundamentals/coding-style/coding-conventions) verified with dotnet format
All unit tests pass, and I have added new tests where possible
I didn't break anyone 😄

dluc

There's a lot of code and tests, the code needs some more comments. Some minor improvements before merging.
FYI there's a plan to adopt Azure OpenAI SDK, and I hope we can still leverage this work, maybe we should hand it over.
Overall this is pretty good, and a lot of code, I think this is as far as we should go, otherwise we should just take a dependency on something off the shelf.

jansenbe · 2023-03-13T19:13:43Z

@lemillermicrosoft : good to see some of my original ideas and code pieces being reused :-) Your implementation, however, is much cleaner and more flexible, looks really good!!

dluc · 2023-03-14T01:36:20Z

About response draining and the special code to clone requests: looking at https://learn.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/implement-http-call-retries-exponential-backoff-polly I couldn't find anything about the need to drain responses or to clone requests. Checked also https://github.com/App-vNext/Polly/blob/7903c9578f6dc14241c747818cf4c9e1c62aa8d1/src/Polly and I could not find anything similar.

I totally appreciate that the PR is a step forward, however I don't think I have all the info to sign off, e.g. I don't understand why it works, if it works, and why for example a popular library like Polly doesn't do all of this (or does it?)

jansenbe · 2023-03-14T06:59:11Z

@dluc : I've copied the drain code from a previous implementation and we've been using it in open source projects for years and no issues with it. Think this has to do with code that uses Streams (e.g. to upload a file), based on below thread this is also a concern when using Polly: https://stackoverflow.com/questions/74373069/polly-retry-request-with-streamcontent-and-memorystream-no-content-on-second
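
The stream concern raised above can be shown with a small analogy in Python. This is not .NET code and makes no claim about how HttpClient or Polly behave internally; `fake_send` and `demo_stream_replay` are hypothetical names. It only shows the general issue: a stream-backed request body is consumed by the first attempt, so a naive retry sends nothing unless the stream is rewound or the request is rebuilt.

```python
import io
from typing import Tuple


def fake_send(body: io.BytesIO) -> bytes:
    """Stand-in for a transport: reads whatever is left in the body stream."""
    return body.read()


def demo_stream_replay() -> Tuple[bytes, bytes, bytes]:
    body = io.BytesIO(b"payload")
    first = fake_send(body)    # the first attempt consumes the stream
    second = fake_send(body)   # a naive retry finds the stream already at its end
    body.seek(0)               # rewinding (or cloning the request) restores the content
    third = fake_send(body)
    return first, second, third
```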

This commit introduces several changes to improve the reliability and usability of the HTTP requests in the SemanticKernel. It adds a CancellationToken parameter to the CompleteAsync method of ITextCompletionClient and its implementations, allowing callers to cancel the completion request. It also adds an optional IDelegatingHandlerFactory parameter to the constructors of OpenAI client classes and memory configuration classes, enabling the injection of custom retry logic for HTTP requests. The default retry logic is provided by the DefaultHttpRetryHandlerFactory class, which uses a Polly policy to handle transient errors and respects the Retry-After header when present. The commit also adds a new option to the kernel configuration and the kernel builder to specify a custom retry handler factory for the backends that use HTTP requests. The commit removes the unused IRetryMechanism interface and the PassThroughWithoutRetry class, and replaces them with the NullHttpRetryHandler and NullHttpRetryHandlerFactory, which do not retry on failure. The commit also makes some minor code style improvements and fixes a typo in a comment. The commit updates the documentation comments to reflect the new parameters and features.
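
A rough sketch of how cancellation and retries can interact, in Python rather than the PR's C#: here a `threading.Event` plays the role that a CancellationToken plays in .NET, and the names `send_with_cancellation` and `OperationCancelled` are hypothetical. The idea is that the retry loop checks for cancellation before each attempt and that the backoff wait ends early if the caller cancels.

```python
import threading
from typing import Callable, FrozenSet


class OperationCancelled(Exception):
    """Raised when the caller cancels before a final response is obtained."""


def send_with_cancellation(send: Callable[[], int],
                           cancel: threading.Event,
                           max_retries: int = 3,
                           delay: float = 0.01,
                           retry_statuses: FrozenSet[int] = frozenset({429, 503})) -> int:
    for attempt in range(max_retries + 1):
        # Do not start a new attempt once cancellation was requested.
        if cancel.is_set():
            raise OperationCancelled()
        status = send()
        if status not in retry_statuses or attempt == max_retries:
            return status
        # Event.wait returns True as soon as the event is set, so the delay is cut short.
        if cancel.wait(timeout=delay):
            raise OperationCancelled()
    raise AssertionError("unreachable: the loop always returns or raises")
```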

This commit adds and improves unit tests for the classes that handle HTTP retry logic for OpenAI requests. It also renames and removes some deprecated classes and adds a dependency on Moq, a testing framework that allows for mocking interfaces. The main changes are:
- Rename PassThroughWithoutRetry to NullHttpRetryHandler and remove PassThroughWithoutRetryTests
- Add unit tests for NullHttpRetryHandler and DefaultHttpRetryHandler, which implement different retry policies based on status codes, exceptions, and backoff strategies
- Add the DynamicProxyGenAssembly2 assembly attribute to SemanticKernel.csproj, which is required for using Moq
- Add a new feature to the kernel configuration that allows setting a custom HTTP retry policy for OpenAI requests
- Add an integration test that verifies the retry behavior when using an invalid OpenAI API key
- Refactor the existing kernel configuration tests and the RedirectOutput class to improve readability and test coverage

…levels Summary: This commit introduces several changes to improve the kernel's performance and usability when using the OpenAI completion backend:
- It refactors the retry mechanism for kernel requests by using a delegating handler factory instead of a custom interface. This allows for more flexibility and consistency in the retry logic, and simplifies the configuration of semantic skills.
- It adds a new retry handler that uses the Retry-After value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests.
- It adds a new file Example08_RetryHandler.cs that demonstrates how to use different retry policies for semantic skills, and removes some unused or redundant code from other examples.
- It changes the logging levels for the RepoUtils project, which provides utilities for working with GitHub repositories. It comments out the line that sets the minimum logging level to Warning, and adds a filter to only log Warning messages from the System namespace, to reduce noise from irrelevant messages.

Summary: This commit relocates the HttpRetryConfig class, which defines the retry policy for HTTP requests, from the Configuration namespace to the Reliability namespace. This change improves the code organization and cohesion, as the class is more logically related to the DefaultHttpRetryHandler and its factory, which implement the retry logic. The commit also removes some unused code and simplifies some exception messages. The references and tests for the class are updated accordingly.

Summary: This commit makes several improvements to the DefaultHttpRetryHandler class, which handles retrying HTTP requests in case of failures. The main changes are:
- Add more comments to explain the logic and parameters of the retry methods.
- Refactor the catch block to avoid re-throwing the exception if the max retry count is reached or there is no time left for a retry.
- Use the current time from the ITimeProvider interface instead of DateTimeOffset.Now, to enable unit testing.
- Make the CloneAsync method private, as it is only used internally by the class.
- Add null check for the exception parameter in the ShouldRetry method.

Summary: This commit adds a new unit test for the DefaultHttpRetryHandler class, which verifies that the retry logic respects the max total delay configuration when encountering exceptions. The test uses mock time and delay providers to simulate different scenarios and assert the expected behavior. The commit also removes some redundant comments from other tests.
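
The testing approach described in these commits (injecting the clock and the delay so that a total-delay limit can be verified without real waiting) can be sketched as follows. This is a Python illustration under stated assumptions, not the PR's test code: `FakeClock` and `retry_within_budget` are hypothetical names, and the rule "stop when the next delay would exceed the budget" is one plausible reading of "max total delay".

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class FakeClock:
    """Deterministic clock: sleeping just advances the current time."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


def retry_within_budget(operation: Callable[[], T], clock: FakeClock,
                        delay: float, max_total: float) -> T:
    """Retry on ConnectionError until the next delay would exceed `max_total` seconds."""
    start = clock.time()
    while True:
        try:
            return operation()
        except ConnectionError:
            elapsed = clock.time() - start
            if elapsed + delay > max_total:
                # No time left for another retry: surface the last failure.
                raise
            clock.sleep(delay)
```

With a 2-second delay and a 5-second budget, an operation that always fails is attempted three times (at 0s, 2s, and 4s) before the error is surfaced.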

Summary: This commit fixes a bug where the HTTP handler factory set in the KernelConfig was not used by the KernelBuilder, and instead a default one was always created. It also makes the DefaultHttpRetryHandlerFactory constructor public, and adds some documentation for the KernelConfig properties.

Summary: This commit adds the time spent on the request to the error logging messages in DefaultHttpRetryHandler, when the maximum total retry time is reached. This helps to diagnose the cause of the error and the performance of the retry policy. The commit also updates the unit tests to verify the expected number of calls to the time provider mock.

Summary: This commit renames the SemanticKernel.Test project to SemanticKernel.UnitTests, to follow the naming convention of other test projects in the solution. It also updates the references to the project in the .vscode/tasks.json file, to ensure that the test tasks work correctly. This change does not affect the functionality or behavior of the tests, only the project name.

lemillermicrosoft changed the title ~~U/lemiller/http retry handler~~ Add HTTP retry logic and cancellation support for OpenAI services Mar 10, 2023

lemillermicrosoft requested review from dluc and shawncal March 10, 2023 22:35

lemillermicrosoft mentioned this pull request Mar 10, 2023

Integrated retry-handler to not having to rely on Polly for throttling handling as everyone will need it #17

Closed

lemillermicrosoft added the "PR: ready for review" label Mar 10, 2023

lemillermicrosoft commented Mar 10, 2023

lemillermicrosoft mentioned this pull request Mar 11, 2023

DelegateHandler instead of Retry Interface lemillermicrosoft/semantic-kernel#2

Closed

lemillermicrosoft force-pushed the u/lemiller/http_retry_handler branch from 213a4c1 to ddbe32b Compare March 11, 2023 00:43

awharrison-28 reviewed Mar 11, 2023

dluc force-pushed the u/lemiller/http_retry_handler branch from 141fd90 to e4aaaab Compare March 11, 2023 21:50

dluc requested changes Mar 11, 2023

dluc added the "PR: feedback to address" label and removed the "PR: ready for review" label Mar 11, 2023

lemillermicrosoft force-pushed the u/lemiller/http_retry_handler branch from e4aaaab to 5a58c4a Compare March 13, 2023 16:33

lemillermicrosoft added the "PR: ready for review" label and removed the "PR: feedback to address" label Mar 13, 2023

lemillermicrosoft requested review from dluc and awharrison-28 March 13, 2023 18:33

lemillermicrosoft force-pushed the u/lemiller/http_retry_handler branch 2 times, most recently from 13020d4 to f797536 Compare March 13, 2023 21:50

shawncal previously approved these changes Mar 13, 2023

shawncal assigned dluc Mar 13, 2023

dluc added the "PR: feedback to address" label Mar 14, 2023

lemillermicrosoft dismissed shawncal’s stale review via 8f837fa March 14, 2023 02:29

lemillermicrosoft force-pushed the u/lemiller/http_retry_handler branch from f21b072 to 8f837fa Compare March 14, 2023 02:29

lemillermicrosoft removed the "PR: feedback to address" label Mar 14, 2023

lemillermicrosoft commented Mar 14, 2023

lemillermicrosoft added 22 commits March 14, 2023 14:25

Make DefaultHttpRetryHandler and DefaultHttpRetryHandlerFactory public

dae56cb

PR feedback on tests and samples

3d1ccf0

kernel syntax examples

3534768

r# cleanup first, competes with format

38a55d3

only drain response if we aren't returning the response as-is

d1aba65

Remove request cloning and content draining in DefaultHttpRetryHandler

eafebed

Fix rebase build error

7a58ae2

Refactor OpenAIClientAbstract constructor to accept handler factory

23c5b39

format method signature

5c74caa

xmldocs

cb48216

wrap signatures

989c786

formatting

0547680

fix time formatting in logs

cfb83b3

lemillermicrosoft force-pushed the u/lemiller/http_retry_handler branch from a37f188 to cfb83b3 Compare March 14, 2023 21:25

dluc approved these changes Mar 14, 2023

dluc merged commit b766b75 into microsoft:main Mar 14, 2023

Bryan-Roe added a commit to BMR-Cloud-Dev/semantic that referenced this pull request Jun 28, 2024

Merge pull request microsoft#58 from Bryan-Roe/dependabot/pip/python/…

a5907ec

…openapi-core-0.19.1 Bump openapi-core from 0.18.2 to 0.19.1 in /python
