-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HTTP retry logic and cancellation support for OpenAI services #58
Add HTTP retry logic and cancellation support for OpenAI services #58
Conversation
dotnet/src/SemanticKernel/Reliability/DefaultHttpRetryHandler.cs
Outdated
Show resolved
Hide resolved
213a4c1
to
ddbe32b
Compare
141fd90
to
e4aaaab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of code and tests, the code needs some more comments. Some minor improvements before merging.
FYI there's a plan to adopt Azure OpenAI SDK, and I hope we can still leverage this work, maybe we should hand it over.
Overall this is pretty good, and a lot of code, I think this is as far as we should go, otherwise we should just take a dependency on something off the shelf.
dotnet/src/SemanticKernel/Reliability/DefaultHttpRetryHandler.cs
Outdated
Show resolved
Hide resolved
dotnet/src/SemanticKernel/Reliability/DefaultHttpRetryHandler.cs
Outdated
Show resolved
Hide resolved
dotnet/src/SemanticKernel/Reliability/DefaultHttpRetryHandler.cs
Outdated
Show resolved
Hide resolved
dotnet/src/SemanticKernel/Reliability/DefaultHttpRetryHandler.cs
Outdated
Show resolved
Hide resolved
dotnet/src/SemanticKernel/Reliability/DefaultHttpRetryHandler.cs
Outdated
Show resolved
Hide resolved
e4aaaab
to
5a58c4a
Compare
@lemillermicrosoft : good to see some my original ideas and code pieces being reused :-) Your implementation however is much more cleaner and flexible, looks really good!! |
13020d4
to
f797536
Compare
About response draining and the special code to clone requests: looking at https://learn.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/implement-http-call-retries-exponential-backoff-polly I couldn't find anything about the need to drain responses or to clone requests. Checked also https:/App-vNext/Polly/blob/7903c9578f6dc14241c747818cf4c9e1c62aa8d1/src/Polly and I could not find anything similar. I totally appreciate that the PR is a step forward, however I don't think I have all the info to sign off, e.g. I don't understand why it works, if it works, and why for example a popular library like Polly doesn't do all of this (or does it?) |
f21b072
to
8f837fa
Compare
@dluc : I've copied the drain code from a previous implementation and we've been using it in open source projects for years and no issues with it. Think this has to do with code that uses Streams (e.g. to upload a file), based on below thread this is also a concern when using Polly: https://stackoverflow.com/questions/74373069/polly-retry-request-with-streamcontent-and-memorystream-no-content-on-second |
This commit introduces several changes to improve the reliability and usability of the HTTP requests in the SemanticKernel. It adds a CancellationToken parameter to the CompleteAsync method of ITextCompletionClient and its implementations, allowing callers to cancel the completion request. It also adds an optional IDelegatingHandlerFactory parameter to the constructors of OpenAI client classes and memory configuration classes, enabling the injection of custom retry logic for HTTP requests. The default retry logic is provided by the DefaultHttpRetryHandlerFactory class, which uses a Polly policy to handle transient errors and respects the RetryAfter header when present. The commit also adds a new option to the kernel configuration and the kernel builder to specify a custom retry handler factory for the backends that use HTTP requests. The commit removes the unused IRetryMechanism interface and the PassThroughWithoutRetry class, and replaces them with the NullHttpRetryHandler and NullHttpRetryHandlerFactory, which do not retry on failure. The commit also makes some minor code style improvements and fixes a typo in a comment. The commit updates the documentation comments to reflect the new parameters and features.
This commit adds and improves unit tests for the classes that handle HTTP retry logic for OpenAI requests. It also renames and removes some deprecated classes and adds a dependency on Moq, a testing framework that allows for mocking interfaces. The main changes are: - Rename PassThroughWithoutRetry to NullHttpRetryHandler and remove PassThroughWithoutRetryTests - Add unit tests for NullHttpRetryHandler and DefaultHttpRetryHandler, which implement different retry policies based on status codes, exceptions, and backoff strategies - Add the DynamicProxyGenAssembly2 assembly attribute to SemanticKernel.csproj, which is required for using Moq - Add a new feature to the kernel configuration that allows setting a custom HTTP retry policy for OpenAI requests - Add an integration test that verifies the retry behavior when using an invalid OpenAI API key - Refactor the existing kernel configuration tests and the RedirectOutput class to improve readability and test coverag
…levels Summary: This commit introduces several changes to improve the kernel's performance and usability when using the OpenAI completion backend: - It refactors the retry mechanism for kernel requests by using a delegating handler factory instead of a custom interface. This allows for more flexibility and consistency in the retry logic, and simplifies the configuration of semantic skills. - It adds a new retry handler that uses the RetryAfter value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests. - It adds a new file Example08_RetryHandler.cs that demonstrates how to use different retry policies for semantic skills, and removes some unused or redundant code from other examples. - It changes the logging levels for the RepoUtils project, which provides utilities for working with GitHub repositories. It comments out the line that sets the minimum logging level to Warning, and adds a filter to only log Warning messages from the System namespace, to reduce noise from irrelevant messages.
Summary: This commit relocates the HttpRetryConfig class, which defines the retry policy for HTTP requests, from the Configuration namespace to the Reliability namespace. This change improves the code organization and cohesion, as the class is more logically related to the DefaultHttpRetryHandler and its factory, which implement the retry logic. The commit also removes some unused code and simplifies some exception messages. The references and tests for the class are updated accordingly.
Summary: This commit makes several improvements to the DefaultHttpRetryHandler class, which handles retrying HTTP requests in case of failures. The main changes are: - Add more comments to explain the logic and parameters of the retry methods. - Refactor the catch block to avoid re-throwing the exception if the max retry count is reached or there is no time left for a retry. - Use the current time from the ITimeProvider interface instead of DateTimeOffset.Now, to enable unit testing. - Make the CloneAsync method private, as it is only used internally by the class. - Add null check for the exception parameter in the ShouldRetry method.
Summary: This commit adds a new unit test for the DefaultHttpRetryHandler class, which verifies that the retry logic respects the max total delay configuration when encountering exceptions. The test uses mock time and delay providers to simulate different scenarios and assert the expected behavior. The commit also removes some redundant comments from other tests.
Summary: This commit fixes a bug where the HTTP handler factory set in the KernelConfig was not used by the KernelBuilder, and instead a default one was always created. It also makes the DefaultHttpRetryHandlerFactory constructor public, and adds some documentation for the KernelConfig properties.
Summary: This commit adds the time spent on the request to the error logging messages in DefaultHttpRetryHandler, when the maximum total retry time is reached. This helps to diagnose the cause of the error and the performance of the retry policy. The commit also updates the unit tests to verify the expected number of calls to the time provider mock.
Summary: This commit renames the SemanticKernel.Test project to SemanticKernel.UnitTests, to follow the naming convention of other test projects in the solution. It also updates the references to the project in the .vscode/tasks.json file, to ensure that the test tasks work correctly. This change does not affect the functionality or behavior of the tests, only the project name.
a37f188
to
cfb83b3
Compare
…crosoft#58) ### Motivation and Context This set of code changes is required to improve the performance and usability of the OpenAI completion backend by refactoring the kernel retry mechanism and adding a new http retry delegating handler. Specifically, it addresses the following problems: 1. The existing retry mechanism for kernel requests was using a custom interface that made it difficult to customize and maintain the retry logic. By refactoring it to use a delegating handler factory, we can simplify the configuration of semantic skills and improve the consistency and flexibility of the retry logic. 2. The existing retry handler did not take advantage of the Retry-After value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests. By adding a new handler that does this, we can reduce the number of unnecessary retries and improve the response times for the requests. ### Description #### Kernel - Added a new abstraction for creating HTTP retry handlers, the `IDelegatingHandlerFactory` interface, which allows using the built-in `HttpClient` retry logic and injecting custom retry policies. - Added a new class `DefaultHttpRetryHandlerFactory` that implements a configurable retry policy based on the `KernelConfig.HttpRetryConfig` settings. This factory is used by default when creating a kernel, unless a different factory is specified. - Added a new class `NullHttpRetryHandlerFactory` that creates a no-op retry handler, useful for testing or disabling retries. - Added support for cancellation tokens and retry handlers in the OpenAI services classes, such as `AzureTextCompletion`, `AzureTextEmbeddings`, and `OpenAITextCompletion`. This allows the callers to cancel the requests and handle transient failures more gracefully. - Added unit tests for the `DefaultHttpRetryHandler` class, which is responsible for handling HTTP retries. The tests cover various scenarios such as retrying on different status codes, retrying on different exception types, and respecting the retry configuration parameters. - Removed the unused `IRetryMechanism` and `PassThroughWithoutRetry` classes, and the `Example08_RetryMechanism.cs` file. - Refactored the `KernelConfig` class to expose the `HttpRetryConfig` and the `HttpHandlerFactory` properties. - Refactored the `Kernel` class to use the `HttpHandlerFactory` instead of the `RetryMechanism` for invoking the pipeline functions. - Updated the `KernelBuilder` sample to show how to use a custom retry handler factory. - Updated the documentation and the code style accordingly. #### Kernel-Syntax-Examples This pull request adds two new classes to the Reliability namespace that implement different retry policies for HTTP requests. The first class, `RetryThreeTimesWithBackoff`, retries a request three times with an exponential backoff if it encounters a 429 (Too Many Requests) or 401 (Unauthorized) status code. The second class, `RetryThreeTimesWithRetryAfterBackoff`, also retries three times, but uses the value of the Retry-After header to determine the backoff duration. Both classes use Polly to implement the retry logic and log the outcome of each attempt using the ILogger interface. The pull request also modifies the `ConsoleLogger` class to filter out log messages from the System namespace with a level lower than Warning. This is to reduce the noise from the HttpClient and DelegatingHandler classes. This pull request introduces several improvements and features related to the HTTP retry logic and the OpenAI services. The main changes are:
### Motivation and Context <!-- Thank you for your contribution to the copilot-chat repo! Please help reviewers and future users, providing the following information: 1. Why is this change required? 2. What problem does it solve? 3. What scenario does it contribute to? 4. If it fixes an open issue, please link to the issue here. --> removes unused packages/commands from our `package.json`. ### Description <!-- Describe your changes, the overall approach, the underlying design. These notes will help understanding how your code works. Thanks! --> - removes packages not imported into the app - removes the `depcheck` command as this is a tool that should be run globally and not included in the project. we may want to look into having a job run this command periodically to clean up the packages, but we shouldn't be adding packages that aren't used anyway. - removes the `packaage-lock.json` accidentally added in microsoft#55 ### Contribution Checklist <!-- Before submitting this PR, please make sure: --> - [x] The code builds clean without any errors or warnings - [x] The PR follows the [Contribution Guidelines](https:/microsoft/copilot-chat/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https:/microsoft/copilot-chat/blob/main/CONTRIBUTING.md#development-scripts) raises no violations - [x] All unit tests pass, and I have added new tests where possible - [x] I didn't break anyone 😄
) ### Motivation and Context <!-- Thank you for your contribution to the semantic-kernel repo! Please help reviewers and future users, providing the following information: 1. Why is this change required? 2. What problem does it solve? 3. What scenario does it contribute to? 4. If it fixes an open issue, please link to the issue here. --> #6030 ### Description ~In this implementation, the `Ġ` will be reserved in liquid template which is used to replace `:` in all input variables when unsafe content is not allowed.~ ~The encoding process for input variables when unsafe content is not allowed is~ ~- replace all `:` to `Ġ` // this is the extra step comparing with HandlerBar Template~ ~- Encode xml using `HttpUtility.HtmlEncode`~ ~The decoding process is~ ~- replace all `Ġ` to `:`~ This PR introduces a new process to mitigate potential prompt injection attacks from input variables when using liquid templates. Here's a breakdown of the steps: Before rendering, each input variable undergoes a transformation: all occurrences of `:`are replaced with `:`. This ensures that message tags like `system:`, `user:`, or `assistant:` are not present if `AllowUnsafeContent` is set to `false`. No replacement occurs if `AllowUnsafeContent` is `true`. After rendering, each message content is processed based on the `AllowUnsafeContent` setting. If it's `false`, all `:` instances are reverted back to `:`, followed by calling `html_encode` on each message content. If `AllowUnsafeContent` is `true`, only `html_encode` is called. This additional encoding step is necessary because `ChatPromptParser` always decodes XML message content, requiring the liquid template to undergo an extra encoding step to ensure the rendered content matches the original before rendering. <!-- Describe your changes, the overall approach, the underlying design. These notes will help understanding how your code works. Thanks! --> ### Contribution Checklist <!-- Before submitting this PR, please make sure: --> - [x] The code builds clean without any errors or warnings - [x] The PR follows the [SK Contribution Guidelines](https:/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https:/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations - [x] All unit tests pass, and I have added new tests where possible - [x] I didn't break anyone 😄
…crosoft#58) ### Motivation and Context This set of code changes is required to improve the performance and usability of the OpenAI completion backend by refactoring the kernel retry mechanism and adding a new http retry delegating handler. Specifically, it addresses the following problems: 1. The existing retry mechanism for kernel requests was using a custom interface that made it difficult to customize and maintain the retry logic. By refactoring it to use a delegating handler factory, we can simplify the configuration of semantic skills and improve the consistency and flexibility of the retry logic. 2. The existing retry handler did not take advantage of the Retry-After value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests. By adding a new handler that does this, we can reduce the number of unnecessary retries and improve the response times for the requests. ### Description #### Kernel - Added a new abstraction for creating HTTP retry handlers, the `IDelegatingHandlerFactory` interface, which allows using the built-in `HttpClient` retry logic and injecting custom retry policies. - Added a new class `DefaultHttpRetryHandlerFactory` that implements a configurable retry policy based on the `KernelConfig.HttpRetryConfig` settings. This factory is used by default when creating a kernel, unless a different factory is specified. - Added a new class `NullHttpRetryHandlerFactory` that creates a no-op retry handler, useful for testing or disabling retries. - Added support for cancellation tokens and retry handlers in the OpenAI services classes, such as `AzureTextCompletion`, `AzureTextEmbeddings`, and `OpenAITextCompletion`. This allows the callers to cancel the requests and handle transient failures more gracefully. - Added unit tests for the `DefaultHttpRetryHandler` class, which is responsible for handling HTTP retries. The tests cover various scenarios such as retrying on different status codes, retrying on different exception types, and respecting the retry configuration parameters. - Removed the unused `IRetryMechanism` and `PassThroughWithoutRetry` classes, and the `Example08_RetryMechanism.cs` file. - Refactored the `KernelConfig` class to expose the `HttpRetryConfig` and the `HttpHandlerFactory` properties. - Refactored the `Kernel` class to use the `HttpHandlerFactory` instead of the `RetryMechanism` for invoking the pipeline functions. - Updated the `KernelBuilder` sample to show how to use a custom retry handler factory. - Updated the documentation and the code style accordingly. #### Kernel-Syntax-Examples This pull request adds two new classes to the Reliability namespace that implement different retry policies for HTTP requests. The first class, `RetryThreeTimesWithBackoff`, retries a request three times with an exponential backoff if it encounters a 429 (Too Many Requests) or 401 (Unauthorized) status code. The second class, `RetryThreeTimesWithRetryAfterBackoff`, also retries three times, but uses the value of the Retry-After header to determine the backoff duration. Both classes use Polly to implement the retry logic and log the outcome of each attempt using the ILogger interface. The pull request also modifies the `ConsoleLogger` class to filter out log messages from the System namespace with a level lower than Warning. This is to reduce the noise from the HttpClient and DelegatingHandler classes. This pull request introduces several improvements and features related to the HTTP retry logic and the OpenAI services. The main changes are:
…openapi-core-0.19.1 Bump openapi-core from 0.18.2 to 0.19.1 in /python
Motivation and Context
This set of code changes is required to improve the performance and usability of the OpenAI completion backend by refactoring the kernel retry mechanism and adding a new http retry delegating handler. Specifically, it addresses the following problems:
The existing retry mechanism for kernel requests was using a custom interface that made it difficult to customize and maintain the retry logic. By refactoring it to use a delegating handler factory, we can simplify the configuration of semantic skills and improve the consistency and flexibility of the retry logic.
The existing retry handler did not take advantage of the Retry-After value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests. By adding a new handler that does this, we can reduce the number of unnecessary retries and improve the response times for the requests.
Description
Kernel
IDelegatingHandlerFactory
interface, which allows using the built-inHttpClient
retry logic and injecting custom retry policies.DefaultHttpRetryHandlerFactory
that implements a configurable retry policy based on theKernelConfig.HttpRetryConfig
settings. This factory is used by default when creating a kernel, unless a different factory is specified.NullHttpRetryHandlerFactory
that creates a no-op retry handler, useful for testing or disabling retries.AzureTextCompletion
,AzureTextEmbeddings
, andOpenAITextCompletion
. This allows the callers to cancel the requests and handle transient failures more gracefully.DefaultHttpRetryHandler
class, which is responsible for handling HTTP retries. The tests cover various scenarios such as retrying on different status codes, retrying on different exception types, and respecting the retry configuration parameters.IRetryMechanism
andPassThroughWithoutRetry
classes, and theExample08_RetryMechanism.cs
file.KernelConfig
class to expose theHttpRetryConfig
and theHttpHandlerFactory
properties.Kernel
class to use theHttpHandlerFactory
instead of theRetryMechanism
for invoking the pipeline functions.KernelBuilder
sample to show how to use a custom retry handler factory.Kernel-Syntax-Examples
This pull request adds two new classes to the Reliability namespace that implement different retry policies for HTTP requests. The first class,
RetryThreeTimesWithBackoff
, retries a request three times with an exponential backoff if it encounters a 429 (Too Many Requests) or 401 (Unauthorized) status code. The second class,RetryThreeTimesWithRetryAfterBackoff
, also retries three times, but uses the value of the Retry-After header to determine the backoff duration. Both classes use Polly to implement the retry logic and log the outcome of each attempt using the ILogger interface.The pull request also modifies the
ConsoleLogger
class to filter out log messages from the System namespace with a level lower than Warning. This is to reduce the noise from the HttpClient and DelegatingHandler classes.This pull request introduces several improvements and features related to the HTTP retry logic and the OpenAI services. The main changes are:
Contribution Checklist
dotnet format