Add support for Azure Cognitive Search's Vector Search #12

Merged
merged 1 commit into from
Jul 22, 2023
13 changes: 7 additions & 6 deletions webapi/CopilotChatWebApi.csproj
@@ -11,13 +11,14 @@

<ItemGroup>
<PackageReference Include="Microsoft.Azure.Cosmos" Version="3.35.2" />
<PackageReference Include="Microsoft.SemanticKernel" Version="0.17.230711.7-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.AI.OpenAI" Version="0.17.230711.7-preview" />
<PackageReference Include="Microsoft.SemanticKernel" Version="0.17.230718.1-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.AI.OpenAI" Version="0.17.230718.1-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.Memory.AzureCognitiveSearch" Version="0.17.230711.7-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.Memory.Qdrant" Version="0.17.230711.7-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Skills.MsGraph" Version="0.17.230711.7-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Skills.OpenAPI" Version="0.17.230711.7-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Skills.Web" Version="0.17.230711.7-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.Memory.AzureSearch" Version="0.17.230718.1-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.Memory.Qdrant" Version="0.17.230718.1-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Skills.MsGraph" Version="0.17.230718.1-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Skills.OpenAPI" Version="0.17.230718.1-preview" />
<PackageReference Include="Microsoft.SemanticKernel.Skills.Web" Version="0.17.230718.1-preview" />
<PackageReference Include="Azure.Extensions.AspNetCore.Configuration.Secrets" Version="1.2.2" />
<PackageReference Include="Microsoft.ApplicationInsights.AspNetCore" Version="2.21.0" />
<PackageReference Include="Microsoft.Identity.Web" Version="2.13.0" />
6 changes: 6 additions & 0 deletions webapi/Options/AzureCognitiveSearchOptions.cs
@@ -20,4 +20,10 @@ public class AzureCognitiveSearchOptions
/// </summary>
[Required, NotEmptyOrWhitespace]
public string Key { get; set; } = string.Empty;

/// <summary>
/// Use ACS's vector search feature when set to true. (See https://learn.microsoft.com/en-us/azure/search/vector-search-overview)
/// Otherwise, use semantic search. (See https://learn.microsoft.com/en-us/azure/search/semantic-search-overview)
/// </summary>
public bool UseVectorSearch { get; set; } = false;
}
12 changes: 12 additions & 0 deletions webapi/README.md
@@ -114,6 +114,18 @@ Before you get started, make sure you have the following additional requirements
```
> To stop the container, in another terminal window run `docker container stop copilotchat; docker container rm copilotchat;`.

# (Optional) Enabling the Azure Cognitive Search Memory Store

Azure Cognitive Search can be used as a persistent memory store for Copilot Chat.
The service can be used with either its [semantic search](https://learn.microsoft.com/en-us/azure/search/semantic-search-overview)
or its [vector search](https://learn.microsoft.com/en-us/azure/search/vector-search-overview).

With semantic search, the service provides a high-level ingestion mechanism that hides the details of how data is ingested.
To use semantic search, make sure the `Semantic Search` feature is enabled on your Azure Cognitive Search service.

With vector search, you have more control over ingestion, but you must supply the embeddings that are stored in the database yourself.
Unlike semantic search, vector search uses the embedding configuration you set in `appsettings.json`.
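
As a minimal sketch, the relevant section of `appsettings.json` with vector search turned on might look like the fragment below. It assumes the memory store settings live under a `MemoriesStore` section and that `UseVectorSearch` is the flag that selects vector search. The endpoint value is a placeholder, and the key is assumed to be supplied separately (for example through user secrets) rather than in this file.

```json
{
  "MemoriesStore": {
    "Type": "azurecognitivesearch",
    "AzureCognitiveSearch": {
      "UseVectorSearch": true,
      "Endpoint": "https://<your-search-service>.search.windows.net"
    }
  }
}
```

Setting `UseVectorSearch` to `false` in the same fragment would select semantic search instead, in which case the embedding configuration is not used.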

# (Optional) Enable Application Insights telemetry

Enabling telemetry on CopilotChatApi lets you capture data about requests to and from the API, so you can monitor the deployment and see how the application is being used.
21 changes: 19 additions & 2 deletions webapi/SemanticKernelExtensions.cs
@@ -11,6 +11,7 @@
using Microsoft.SemanticKernel.AI.Embeddings;
using Microsoft.SemanticKernel.Connectors.AI.OpenAI.TextEmbedding;
using Microsoft.SemanticKernel.Connectors.Memory.AzureCognitiveSearch;
using Microsoft.SemanticKernel.Connectors.Memory.AzureSearch;
using Microsoft.SemanticKernel.Connectors.Memory.Qdrant;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Skills.Core;
@@ -142,14 +143,30 @@ private static void AddSemanticTextMemory(this IServiceCollection services)
throw new InvalidOperationException("MemoriesStore type is AzureCognitiveSearch and AzureCognitiveSearch configuration is null.");
}

services.AddSingleton<ISemanticTextMemory>(sp => new AzureCognitiveSearchMemory(config.AzureCognitiveSearch.Endpoint, config.AzureCognitiveSearch.Key));
// ACS's vector search where users provide their embeddings
if (config.AzureCognitiveSearch.UseVectorSearch)
{
services.AddSingleton<IMemoryStore>(sp =>
{
return new AzureSearchMemoryStore(config.AzureCognitiveSearch.Endpoint, config.AzureCognitiveSearch.Key);
});
services.AddScoped<ISemanticTextMemory>(sp => new SemanticTextMemory(
sp.GetRequiredService<IMemoryStore>(),
sp.GetRequiredService<IOptions<AIServiceOptions>>().Value
.ToTextEmbeddingsService(logger: sp.GetRequiredService<ILogger<AIServiceOptions>>())));
}
// ACS's semantic search where ACS calculates the embeddings
else
{
services.AddSingleton<ISemanticTextMemory>(sp => new AzureCognitiveSearchMemory(config.AzureCognitiveSearch.Endpoint, config.AzureCognitiveSearch.Key));
}
break;

default:
throw new InvalidOperationException($"Invalid 'MemoriesStore' type '{config.Type}'.");
}

// High level semantic memory implementations, such as Azure Cognitive Search, do not allow for providing embeddings when storing memories.
// High level semantic memory implementations, such as Azure Cognitive Search's Semantic Search, do not allow for providing embeddings when storing memories.
// We wrap the memory store in an optional memory store to allow controllers to pass dependency injection validation and potentially optimize
// for a lower-level memory implementation (e.g. Qdrant). Lower level memory implementations (i.e., IMemoryStore) allow for reusing embeddings,
// whereas high level memory implementation (i.e., ISemanticTextMemory) assume embeddings get recalculated on every write.
2 changes: 0 additions & 2 deletions webapi/ServiceExtensions.cs
@@ -35,8 +35,6 @@ internal static IServiceCollection AddOptions(this IServiceCollection services,
.ValidateOnStart()
.PostConfigure(TrimStringProperties);

var foo = services.BuildServiceProvider().GetService<IOptions<AIServiceOptions>>();

// Authorization configuration
services.AddOptions<AuthorizationOptions>()
.Bind(configuration.GetSection(AuthorizationOptions.PropertyName))
15 changes: 10 additions & 5 deletions webapi/appsettings.json
@@ -103,11 +103,15 @@
}
},
//
// Memories stores are used for storing new memories and retrieving semantically similar memories.
// Memory stores are used for storing new memories and retrieving semantically similar memories.
// - Supported Types are "volatile", "qdrant", or "azurecognitivesearch".
// - When using Qdrant or Azure Cognitive Search, see ./README.md for deployment instructions.
// - The "Semantic Search" feature must be enabled on Azure Cognitive Search.
// - The Embedding configuration above will not be used when Azure Cognitive Search is selected.
// - If Azure Cognitive Search is selected AND UseVectorSearch is set to false, you will be
// using ACS's semantic search (https://learn.microsoft.com/en-us/azure/search/semantic-search-overview)
// instead of its vector search feature (https://learn.microsoft.com/en-us/azure/search/vector-search-overview)
// This means that:
// * The "Semantic Search" feature must be enabled on Azure Cognitive Search.
// * The Embedding configuration above will not be used.
// - Set "MemoriesStore:AzureCognitiveSearch:Key" using dotnet's user secrets (see above)
// (i.e. dotnet user-secrets set "MemoriesStore:AzureCognitiveSearch:Key" "MY_AZCOGSRCH_KEY")
// - Set "MemoriesStore:Qdrant:Key" using dotnet's user secrets (see above) if you are using a Qdrant Cloud instance.
@@ -122,6 +126,7 @@
// "Key": ""
},
"AzureCognitiveSearch": {
"UseVectorSearch": true,
"Endpoint": ""
// "Key": ""
}
@@ -150,8 +155,8 @@
// - Obtain language data files here: https:/tesseract-ocr/tessdata .
// - Add these files to your `data` folder or the path specified in the "FilePath" property and set the "Copy to Output Directory" value to "Copy if newer".
//
"OcrSupport": {
"Type": "tesseract",
"OcrSupport": {
"Type": "none",
"Tesseract": {
"Language": "eng",
"FilePath": "./data"