.Net: Add PostgresVectorStore Memory connector. #9324

Draft
wants to merge 4 commits into
base: main

lossyrob
Contributor

This PR adds a PostgresVectorStore and related classes to Microsoft.SemanticKernel.Connectors.Postgres.

Motivation and Context

As part of the move to having memory connectors implement the new Microsoft.Extensions.VectorData.IVectorStore architecture (see https://github.com/microsoft/semantic-kernel/blob/main/docs/decisions/0050-updated-vector-store-design.md), each memory connector needs to be updated. This PR updates the existing Microsoft.SemanticKernel.Connectors.Postgres package to include this implementation, which will supersede the PostgresMemoryStore implementation.

Some high-level comments about the design:

  • PostgresVectorStore and PostgresVectorStoreRecordCollection are injected with an IPostgresVectorStoreDbClient. This abstracts the database communication and allows unit tests to mock database interactions.
  • The PostgresVectorStoreDbClient is passed an NpgsqlDataSource by the user, which it uses to manage connections to the database. Connection pool lifecycle management is the user's responsibility.
  • The IPostgresVectorStoreDbClient is designed to accept and produce the storage model, which in this case is a Dictionary<string, object?>. This is the intermediate type that the IVectorStoreRecordMapper maps to and from.
  • The PostgresVectorStoreDbClient also takes an IPostgresVectorStoreCollectionSqlBuilder, which generates SQL command information for interacting with the database. This abstracts the SQL queries related to each task and allows for future expansion. It is aimed in particular at creating an AzureDBForPostgre vector store that will enable alternate vector implementations like DiskANN while reusing the same database client as the Postgres connector.
  • The integration tests for the vector store use Docker.Net to bring up a pgvector/pgvector Docker container, which the tests are run against.
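
To make the layering above concrete, here is a minimal sketch of how a SQL builder and a database client could be split so that the client is easy to fake in unit tests. It is not the code in this PR: the signatures of IPostgresVectorStoreDbClient and IPostgresVectorStoreCollectionSqlBuilder are not shown in this description, so every type and method below (the names ending in `Sketch` or `Fake`, `SqlCommandInfo`, `BuildUpsert`, `UpsertAsync`, `GetAsync`) is a hypothetical stand-in. It uses no Npgsql types so that it compiles on its own.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical: SQL text plus positional parameter values produced by a builder.
public sealed record SqlCommandInfo(string CommandText, IReadOnlyList<object?> Parameters);

// Hypothetical stand-in for the SQL builder abstraction described above.
public interface ISqlBuilderSketch
{
    SqlCommandInfo BuildUpsert(string tableName, string keyColumn, IReadOnlyDictionary<string, object?> row);
}

public sealed class PostgresSqlBuilderSketch : ISqlBuilderSketch
{
    public SqlCommandInfo BuildUpsert(string tableName, string keyColumn, IReadOnlyDictionary<string, object?> row)
    {
        // Sort column names so the generated SQL is deterministic.
        var columns = row.Keys.OrderBy(c => c, StringComparer.Ordinal).ToList();
        var columnList = string.Join(", ", columns.Select(c => $"\"{c}\""));
        var placeholders = string.Join(", ", columns.Select((_, i) => $"${i + 1}"));
        var updates = string.Join(", ",
            columns.Where(c => c != keyColumn).Select(c => $"\"{c}\" = EXCLUDED.\"{c}\""));
        // With no non-key columns there is nothing to update on conflict.
        var onConflict = updates.Length == 0 ? "DO NOTHING" : $"DO UPDATE SET {updates}";
        var sql = $"INSERT INTO \"{tableName}\" ({columnList}) VALUES ({placeholders}) " +
                  $"ON CONFLICT (\"{keyColumn}\") {onConflict}";
        return new SqlCommandInfo(sql, columns.Select(c => row[c]).ToList());
    }
}

// Hypothetical stand-in for the database client: it accepts and produces the
// storage model (Dictionary<string, object?>) rather than user record types.
public interface IDbClientSketch
{
    Task UpsertAsync(string tableName, string keyColumn, Dictionary<string, object?> row);
    Task<Dictionary<string, object?>?> GetAsync(string tableName, object key);
}

// In-memory fake that a unit test could inject in place of a real database client.
public sealed class InMemoryDbClientFake : IDbClientSketch
{
    private readonly Dictionary<(string Table, object Key), Dictionary<string, object?>> _rows = new();

    public Task UpsertAsync(string tableName, string keyColumn, Dictionary<string, object?> row)
    {
        var key = row[keyColumn] ?? throw new ArgumentException("Key value must not be null.", nameof(row));
        _rows[(tableName, key)] = new Dictionary<string, object?>(row); // store a copy
        return Task.CompletedTask;
    }

    public Task<Dictionary<string, object?>?> GetAsync(string tableName, object key)
    {
        return Task.FromResult(_rows.TryGetValue((tableName, key), out var row) ? row : null);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var row = new Dictionary<string, object?> { ["id"] = 1, ["name"] = "a" };

        // The builder only produces SQL text and parameters; it never touches a connection.
        var info = new PostgresSqlBuilderSketch().BuildUpsert("items", "id", row);
        Console.WriteLine(info.CommandText);

        // Code under test depends on the interface, so the fake can replace the real client.
        IDbClientSketch client = new InMemoryDbClientFake();
        await client.UpsertAsync("items", "id", row);
        var fetched = await client.GetAsync("items", 1);
        Console.WriteLine(fetched?["name"]);
    }
}
```

Keeping SQL generation behind its own interface means a different builder (for example, one targeting another index type) could be swapped in without changing the client, which is the kind of expansion the design comments describe.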

TODO:

  • Finish implementing and testing all methods of PostgresVectorStoreRecordCollection. Most can follow the existing code and patterns; the exceptions are VectorizedSearchAsync and VectorizableTextSearchAsync, which are part of the IVectorizableTextSearch interface.
  • Update the README in the package
  • Create a sample in dotnet/samples/Concepts/Memory

Contribution Checklist

Work in progress; some methods are not implemented yet.
markwallace-microsoft added the .NET, kernel, and memory labels on Oct 18, 2024