Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HTTP retry logic and cancellation support for OpenAI services #58

Merged
merged 22 commits into from
Mar 14, 2023

Conversation

lemillermicrosoft
Copy link
Member

Motivation and Context

This set of code changes is required to improve the performance and usability of the OpenAI completion backend by refactoring the kernel retry mechanism and adding a new http retry delegating handler. Specifically, it addresses the following problems:

  1. The existing retry mechanism for kernel requests was using a custom interface that made it difficult to customize and maintain the retry logic. By refactoring it to use a delegating handler factory, we can simplify the configuration of semantic skills and improve the consistency and flexibility of the retry logic.

  2. The existing retry handler did not take advantage of the Retry-After value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests. By adding a new handler that does this, we can reduce the number of unnecessary retries and improve the response times for the requests.

Description

Kernel

  • Added a new abstraction for creating HTTP retry handlers, the IDelegatingHandlerFactory interface, which allows using the built-in HttpClient retry logic and injecting custom retry policies.
  • Added a new class DefaultHttpRetryHandlerFactory that implements a configurable retry policy based on the KernelConfig.HttpRetryConfig settings. This factory is used by default when creating a kernel, unless a different factory is specified.
  • Added a new class NullHttpRetryHandlerFactory that creates a no-op retry handler, useful for testing or disabling retries.
  • Added support for cancellation tokens and retry handlers in the OpenAI services classes, such as AzureTextCompletion, AzureTextEmbeddings, and OpenAITextCompletion. This allows the callers to cancel the requests and handle transient failures more gracefully.
  • Added unit tests for the DefaultHttpRetryHandler class, which is responsible for handling HTTP retries. The tests cover various scenarios such as retrying on different status codes, retrying on different exception types, and respecting the retry configuration parameters.
  • Removed the unused IRetryMechanism and PassThroughWithoutRetry classes, and the Example08_RetryMechanism.cs file.
  • Refactored the KernelConfig class to expose the HttpRetryConfig and the HttpHandlerFactory properties.
  • Refactored the Kernel class to use the HttpHandlerFactory instead of the RetryMechanism for invoking the pipeline functions.
  • Updated the KernelBuilder sample to show how to use a custom retry handler factory.
  • Updated the documentation and the code style accordingly.

Kernel-Syntax-Examples

This pull request adds two new classes to the Reliability namespace that implement different retry policies for HTTP requests. The first class, RetryThreeTimesWithBackoff, retries a request three times with an exponential backoff if it encounters a 429 (Too Many Requests) or 401 (Unauthorized) status code. The second class, RetryThreeTimesWithRetryAfterBackoff, also retries three times, but uses the value of the Retry-After header to determine the backoff duration. Both classes use Polly to implement the retry logic and log the outcome of each attempt using the ILogger interface.

The pull request also modifies the ConsoleLogger class to filter out log messages from the System namespace with a level lower than Warning. This is to reduce the noise from the HttpClient and DelegatingHandler classes.
This pull request introduces several improvements and features related to the HTTP retry logic and the OpenAI services. The main changes are:

Contribution Checklist

@lemillermicrosoft lemillermicrosoft changed the title U/lemiller/http retry handler Add HTTP retry logic and cancellation support for OpenAI services Mar 10, 2023
@lemillermicrosoft lemillermicrosoft added the PR: ready for review All feedback addressed, ready for reviews label Mar 10, 2023
@dluc dluc force-pushed the u/lemiller/http_retry_handler branch from 141fd90 to e4aaaab Compare March 11, 2023 21:50
Copy link
Collaborator

@dluc dluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of code and tests, the code needs some more comments. Some minor improvements before merging.
FYI there's a plan to adopt Azure OpenAI SDK, and I hope we can still leverage this work, maybe we should hand it over.
Overall this is pretty good, and a lot of code, I think this is as far as we should go, otherwise we should just take a dependency on something off the shelf.

@dluc dluc added PR: feedback to address Waiting for PR owner to address comments/questions and removed PR: ready for review All feedback addressed, ready for reviews labels Mar 11, 2023
@lemillermicrosoft lemillermicrosoft added PR: ready for review All feedback addressed, ready for reviews and removed PR: feedback to address Waiting for PR owner to address comments/questions labels Mar 13, 2023
@jansenbe
Copy link
Member

@lemillermicrosoft : good to see some my original ideas and code pieces being reused :-) Your implementation however is much more cleaner and flexible, looks really good!!

@lemillermicrosoft lemillermicrosoft force-pushed the u/lemiller/http_retry_handler branch 2 times, most recently from 13020d4 to f797536 Compare March 13, 2023 21:50
shawncal
shawncal previously approved these changes Mar 13, 2023
@dluc
Copy link
Collaborator

dluc commented Mar 14, 2023

About response draining and the special code to clone requests: looking at https://learn.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/implement-http-call-retries-exponential-backoff-polly I couldn't find anything about the need to drain responses or to clone requests. Checked also https:/App-vNext/Polly/blob/7903c9578f6dc14241c747818cf4c9e1c62aa8d1/src/Polly and I could not find anything similar.

I totally appreciate that the PR is a step forward, however I don't think I have all the info to sign off, e.g. I don't understand why it works, if it works, and why for example a popular library like Polly doesn't do all of this (or does it?)

@dluc dluc added the PR: feedback to address Waiting for PR owner to address comments/questions label Mar 14, 2023
@lemillermicrosoft lemillermicrosoft removed the PR: feedback to address Waiting for PR owner to address comments/questions label Mar 14, 2023
@jansenbe
Copy link
Member

@dluc : I've copied the drain code from a previous implementation and we've been using it in open source projects for years and no issues with it. Think this has to do with code that uses Streams (e.g. to upload a file), based on below thread this is also a concern when using Polly: https://stackoverflow.com/questions/74373069/polly-retry-request-with-streamcontent-and-memorystream-no-content-on-second

This commit introduces several changes to improve the reliability and usability of the HTTP requests in the SemanticKernel. It adds a CancellationToken parameter to the CompleteAsync method of ITextCompletionClient and its implementations, allowing callers to cancel the completion request. It also adds an optional IDelegatingHandlerFactory parameter to the constructors of OpenAI client classes and memory configuration classes, enabling the injection of custom retry logic for HTTP requests. The default retry logic is provided by the DefaultHttpRetryHandlerFactory class, which uses a Polly policy to handle transient errors and respects the RetryAfter header when present. The commit also adds a new option to the kernel configuration and the kernel builder to specify a custom retry handler factory for the backends that use HTTP requests. The commit removes the unused IRetryMechanism interface and the PassThroughWithoutRetry class, and replaces them with the NullHttpRetryHandler and NullHttpRetryHandlerFactory, which do not retry on failure. The commit also makes some minor code style improvements and fixes a typo in a comment. The commit updates the documentation comments to reflect the new parameters and features.
This commit adds and improves unit tests for the classes that handle HTTP retry logic for OpenAI requests. It also renames and removes some deprecated classes and adds a dependency on Moq, a testing framework that allows for mocking interfaces. The main changes are:

- Rename PassThroughWithoutRetry to NullHttpRetryHandler and remove PassThroughWithoutRetryTests
- Add unit tests for NullHttpRetryHandler and DefaultHttpRetryHandler, which implement different retry policies based on status codes, exceptions, and backoff strategies
- Add the DynamicProxyGenAssembly2 assembly attribute to SemanticKernel.csproj, which is required for using Moq
- Add a new feature to the kernel configuration that allows setting a custom HTTP retry policy for OpenAI requests
- Add an integration test that verifies the retry behavior when using an invalid OpenAI API key
- Refactor the existing kernel configuration tests and the RedirectOutput class to improve readability and test coverag
…levels

Summary: This commit introduces several changes to improve the kernel's performance and usability when using the OpenAI completion backend:

- It refactors the retry mechanism for kernel requests by using a delegating handler factory instead of a custom interface. This allows for more flexibility and consistency in the retry logic, and simplifies the configuration of semantic skills.
- It adds a new retry handler that uses the RetryAfter value from the response headers to determine the backoff duration, which can improve the efficiency and reliability of the requests.
- It adds a new file Example08_RetryHandler.cs that demonstrates how to use different retry policies for semantic skills, and removes some unused or redundant code from other examples.
- It changes the logging levels for the RepoUtils project, which provides utilities for working with GitHub repositories. It comments out the line that sets the minimum logging level to Warning, and adds a filter to only log Warning messages from the System namespace, to reduce noise from irrelevant messages.
Summary: This commit relocates the HttpRetryConfig class, which defines the retry policy for HTTP requests, from the Configuration namespace to the Reliability namespace. This change improves the code organization and cohesion, as the class is more logically related to the DefaultHttpRetryHandler and its factory, which implement the retry logic. The commit also removes some unused code and simplifies some exception messages. The references and tests for the class are updated accordingly.
Summary: This commit makes several improvements to the DefaultHttpRetryHandler class, which handles retrying HTTP requests in case of failures. The main changes are:

- Add more comments to explain the logic and parameters of the retry methods.
- Refactor the catch block to avoid re-throwing the exception if the max retry count is reached or there is no time left for a retry.
- Use the current time from the ITimeProvider interface instead of DateTimeOffset.Now, to enable unit testing.
- Make the CloneAsync method private, as it is only used internally by the class.
- Add null check for the exception parameter in the ShouldRetry method.
Summary: This commit adds a new unit test for the DefaultHttpRetryHandler class, which verifies that the retry logic respects the max total delay configuration when encountering exceptions. The test uses mock time and delay providers to simulate different scenarios and assert the expected behavior. The commit also removes some redundant comments from other tests.
      Summary: This commit fixes a bug where the HTTP handler factory set in the KernelConfig was not used by the KernelBuilder, and instead a default one was always created. It also makes the DefaultHttpRetryHandlerFactory constructor public, and adds some documentation for the KernelConfig properties.
      Summary: This commit adds the time spent on the request to the error logging messages in DefaultHttpRetryHandler, when the maximum total retry time is reached. This helps to diagnose the cause of the error and the performance of the retry policy. The commit also updates the unit tests to verify the expected number of calls to the time provider mock.
      Summary: This commit renames the SemanticKernel.Test project to SemanticKernel.UnitTests, to follow the naming convention of other test projects in the solution. It also updates the references to the project in the .vscode/tasks.json file, to ensure that the test tasks work correctly. This change does not affect the functionality or behavior of the tests, only the project name.
@dluc dluc merged commit b766b75 into microsoft:main Mar 14, 2023
dehoward pushed a commit to lemillermicrosoft/semantic-kernel that referenced this pull request Jun 1, 2023
…crosoft#58)

### Motivation and Context

This set of code changes is required to improve the performance and
usability of the OpenAI completion backend by refactoring the kernel
retry mechanism and adding a new http retry delegating handler.
Specifically, it addresses the following problems:

1. The existing retry mechanism for kernel requests was using a custom
interface that made it difficult to customize and maintain the retry
logic. By refactoring it to use a delegating handler factory, we can
simplify the configuration of semantic skills and improve the
consistency and flexibility of the retry logic.

2. The existing retry handler did not take advantage of the Retry-After
value from the response headers to determine the backoff duration, which
can improve the efficiency and reliability of the requests. By adding a
new handler that does this, we can reduce the number of unnecessary
retries and improve the response times for the requests.

### Description

#### Kernel 
- Added a new abstraction for creating HTTP retry handlers, the
`IDelegatingHandlerFactory` interface, which allows using the built-in
`HttpClient` retry logic and injecting custom retry policies.
- Added a new class `DefaultHttpRetryHandlerFactory` that implements a
configurable retry policy based on the `KernelConfig.HttpRetryConfig`
settings. This factory is used by default when creating a kernel, unless
a different factory is specified.
- Added a new class `NullHttpRetryHandlerFactory` that creates a no-op
retry handler, useful for testing or disabling retries.
- Added support for cancellation tokens and retry handlers in the OpenAI
services classes, such as `AzureTextCompletion`, `AzureTextEmbeddings`,
and `OpenAITextCompletion`. This allows the callers to cancel the
requests and handle transient failures more gracefully.
- Added unit tests for the `DefaultHttpRetryHandler` class, which is
responsible for handling HTTP retries. The tests cover various scenarios
such as retrying on different status codes, retrying on different
exception types, and respecting the retry configuration parameters.
- Removed the unused `IRetryMechanism` and `PassThroughWithoutRetry`
classes, and the `Example08_RetryMechanism.cs` file.
- Refactored the `KernelConfig` class to expose the `HttpRetryConfig`
and the `HttpHandlerFactory` properties.
- Refactored the `Kernel` class to use the `HttpHandlerFactory` instead
of the `RetryMechanism` for invoking the pipeline functions.
- Updated the `KernelBuilder` sample to show how to use a custom retry
handler factory.
- Updated the documentation and the code style accordingly.

#### Kernel-Syntax-Examples
This pull request adds two new classes to the Reliability namespace that
implement different retry policies for HTTP requests. The first class,
`RetryThreeTimesWithBackoff`, retries a request three times with an
exponential backoff if it encounters a 429 (Too Many Requests) or 401
(Unauthorized) status code. The second class,
`RetryThreeTimesWithRetryAfterBackoff`, also retries three times, but
uses the value of the Retry-After header to determine the backoff
duration. Both classes use Polly to implement the retry logic and log
the outcome of each attempt using the ILogger interface.

The pull request also modifies the `ConsoleLogger` class to filter out
log messages from the System namespace with a level lower than Warning.
This is to reduce the noise from the HttpClient and DelegatingHandler
classes.
This pull request introduces several improvements and features related
to the HTTP retry logic and the OpenAI services. The main changes are:
golden-aries pushed a commit to golden-aries/semantic-kernel that referenced this pull request Oct 10, 2023
### Motivation and Context

<!-- Thank you for your contribution to the copilot-chat repo!
Please help reviewers and future users, providing the following
information:
  1. Why is this change required?
  2. What problem does it solve?
  3. What scenario does it contribute to?
  4. If it fixes an open issue, please link to the issue here.
-->
removes unused packages/commands from our `package.json`. 

### Description

<!-- Describe your changes, the overall approach, the underlying design.
These notes will help understanding how your code works. Thanks! -->
- removes packages not imported into the app
- removes the `depcheck` command as this is a tool that should be run
globally and not included in the project.
we may want to look into having a job run this command periodically to
clean up the packages, but we shouldn't be adding packages that aren't
used anyway.
- removes the `packaage-lock.json` accidentally added in microsoft#55

### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [Contribution
Guidelines](https:/microsoft/copilot-chat/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https:/microsoft/copilot-chat/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
dmytrostruk pushed a commit that referenced this pull request May 7, 2024
)

### Motivation and Context

<!-- Thank you for your contribution to the semantic-kernel repo!
Please help reviewers and future users, providing the following
information:
  1. Why is this change required?
  2. What problem does it solve?
  3. What scenario does it contribute to?
  4. If it fixes an open issue, please link to the issue here.
-->
#6030 
### Description
~In this implementation, the `Ġ` will be reserved in liquid template
which is used to replace `:` in all input variables when unsafe content
is not allowed.~

~The encoding process for input variables when unsafe content is not
allowed is~
~- replace all `:` to `Ġ` // this is the extra step comparing with
HandlerBar Template~
~- Encode xml using `HttpUtility.HtmlEncode`~

~The decoding process is~
~- replace all `Ġ` to `:`~


This PR introduces a new process to mitigate potential prompt injection
attacks from input variables when using liquid templates. Here's a
breakdown of the steps:

Before rendering, each input variable undergoes a transformation: all
occurrences of `:`are replaced with `&#58;`. This ensures that message
tags like `system:`, `user:`, or `assistant:` are not present if
`AllowUnsafeContent` is set to `false`. No replacement occurs if
`AllowUnsafeContent` is `true`.
After rendering, each message content is processed based on the
`AllowUnsafeContent` setting. If it's `false`, all `&#58;` instances are
reverted back to `:`, followed by calling `html_encode` on each message
content. If `AllowUnsafeContent` is `true`, only `html_encode` is
called. This additional encoding step is necessary because
`ChatPromptParser` always decodes XML message content, requiring the
liquid template to undergo an extra encoding step to ensure the rendered
content matches the original before rendering.

<!-- Describe your changes, the overall approach, the underlying design.
These notes will help understanding how your code works. Thanks! -->

### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https:/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https:/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
johnoliver pushed a commit to johnoliver/semantic-kernel that referenced this pull request Jun 5, 2024
…crosoft#58)

### Motivation and Context

This set of code changes is required to improve the performance and
usability of the OpenAI completion backend by refactoring the kernel
retry mechanism and adding a new http retry delegating handler.
Specifically, it addresses the following problems:

1. The existing retry mechanism for kernel requests was using a custom
interface that made it difficult to customize and maintain the retry
logic. By refactoring it to use a delegating handler factory, we can
simplify the configuration of semantic skills and improve the
consistency and flexibility of the retry logic.

2. The existing retry handler did not take advantage of the Retry-After
value from the response headers to determine the backoff duration, which
can improve the efficiency and reliability of the requests. By adding a
new handler that does this, we can reduce the number of unnecessary
retries and improve the response times for the requests.

### Description

#### Kernel 
- Added a new abstraction for creating HTTP retry handlers, the
`IDelegatingHandlerFactory` interface, which allows using the built-in
`HttpClient` retry logic and injecting custom retry policies.
- Added a new class `DefaultHttpRetryHandlerFactory` that implements a
configurable retry policy based on the `KernelConfig.HttpRetryConfig`
settings. This factory is used by default when creating a kernel, unless
a different factory is specified.
- Added a new class `NullHttpRetryHandlerFactory` that creates a no-op
retry handler, useful for testing or disabling retries.
- Added support for cancellation tokens and retry handlers in the OpenAI
services classes, such as `AzureTextCompletion`, `AzureTextEmbeddings`,
and `OpenAITextCompletion`. This allows the callers to cancel the
requests and handle transient failures more gracefully.
- Added unit tests for the `DefaultHttpRetryHandler` class, which is
responsible for handling HTTP retries. The tests cover various scenarios
such as retrying on different status codes, retrying on different
exception types, and respecting the retry configuration parameters.
- Removed the unused `IRetryMechanism` and `PassThroughWithoutRetry`
classes, and the `Example08_RetryMechanism.cs` file.
- Refactored the `KernelConfig` class to expose the `HttpRetryConfig`
and the `HttpHandlerFactory` properties.
- Refactored the `Kernel` class to use the `HttpHandlerFactory` instead
of the `RetryMechanism` for invoking the pipeline functions.
- Updated the `KernelBuilder` sample to show how to use a custom retry
handler factory.
- Updated the documentation and the code style accordingly.

#### Kernel-Syntax-Examples
This pull request adds two new classes to the Reliability namespace that
implement different retry policies for HTTP requests. The first class,
`RetryThreeTimesWithBackoff`, retries a request three times with an
exponential backoff if it encounters a 429 (Too Many Requests) or 401
(Unauthorized) status code. The second class,
`RetryThreeTimesWithRetryAfterBackoff`, also retries three times, but
uses the value of the Retry-After header to determine the backoff
duration. Both classes use Polly to implement the retry logic and log
the outcome of each attempt using the ILogger interface.

The pull request also modifies the `ConsoleLogger` class to filter out
log messages from the System namespace with a level lower than Warning.
This is to reduce the noise from the HttpClient and DelegatingHandler
classes.
This pull request introduces several improvements and features related
to the HTTP retry logic and the OpenAI services. The main changes are:
Bryan-Roe added a commit to BMR-Cloud-Dev/semantic that referenced this pull request Jun 28, 2024
…openapi-core-0.19.1

Bump openapi-core from 0.18.2 to 0.19.1 in /python
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: ready for review All feedback addressed, ready for reviews PR: ready to merge PR has been approved by all reviewers, and is ready to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants