docs: update retriever template, add arxiv retriever (#24947)
ccurme authored Aug 1, 2024
1 parent db3ceb4 commit 9cb69a8
Showing 9 changed files with 234 additions and 189 deletions.
313 changes: 158 additions & 155 deletions docs/docs/integrations/retrievers/arxiv.ipynb

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions docs/docs/integrations/retrievers/azure_ai_search.ipynb
@@ -28,9 +28,9 @@
"\n",
"### Integration details\n",
"\n",
"| Retriever | Bring your own docs | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[AzureAISearchRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.azure_ai_search.AzureAISearchRetriever.html) | ✅ | ❌ | ✅ | langchain_community.retrievers |\n",
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"[AzureAISearchRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.azure_ai_search.AzureAISearchRetriever.html) | ❌ | ✅ | langchain_community |\n",
"\n",
"\n",
"## Setup\n",
6 changes: 3 additions & 3 deletions docs/docs/integrations/retrievers/bedrock.ipynb
@@ -29,9 +29,9 @@
"\n",
"### Integration details\n",
"\n",
"| Retriever | Bring your own docs | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[AmazonKnowledgeBasesRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html) | ✅ | ❌ | ✅ | langchain_aws.retrievers |\n"
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"[AmazonKnowledgeBasesRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html) | ❌ | ✅ | langchain_aws |\n"
]
},
{
@@ -26,9 +26,9 @@
"\n",
"### Integration details\n",
"\n",
"| Retriever | Bring your own docs | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[ElasticsearchRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_elasticsearch.retrievers.ElasticsearchRetriever.html) | ✅ | ✅ | ✅ | langchain_elasticsearch |\n",
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"[ElasticsearchRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_elasticsearch.retrievers.ElasticsearchRetriever.html) | ✅ | ✅ | langchain_elasticsearch |\n",
"\n",
"\n",
"## Setup\n",
@@ -29,9 +29,9 @@
"\n",
"### Integration details\n",
"\n",
"| Retriever | Bring your own docs | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[VertexAISearchRetriever](https://api.python.langchain.com/en/latest/vertex_ai_search/langchain_google_community.vertex_ai_search.VertexAISearchRetriever.html) | ✅ | ❌ | ✅ | langchain_google_community.vertex_ai_search |\n",
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"[VertexAISearchRetriever](https://api.python.langchain.com/en/latest/vertex_ai_search/langchain_google_community.vertex_ai_search.VertexAISearchRetriever.html) | ❌ | ✅ | langchain_google_community |\n",
"\n",
"\n",
"## Setup\n",
29 changes: 20 additions & 9 deletions docs/docs/integrations/retrievers/index.mdx
@@ -16,14 +16,25 @@ For specifics on how to use retrievers, see the [relevant how-to guides here](/d

Note that all [vector stores](/docs/concepts/#vector-stores) can be [cast to retrievers](/docs/how_to/vectorstore_retriever/).
Refer to the vector store [integration docs](/docs/integrations/vectorstores/) for available vector stores.
This table lists custom retrievers, implemented via subclassing [BaseRetriever](/docs/how_to/custom_retriever/).
This page lists custom retrievers, implemented via subclassing [BaseRetriever](/docs/how_to/custom_retriever/).

## Bring-your-own documents

| Retriever | Bring your own docs | Self-host | Cloud offering | Package |
|-----------|---------------------|-----------|----------------|---------|
| [AmazonKnowledgeBasesRetriever](/docs/integrations/retrievers/bedrock) |||| [langchain_aws](https://api.python.langchain.com/en/latest/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html) |
| [AzureAISearchRetriever](/docs/integrations/retrievers/azure_ai_search) |||| [langchain_community](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.azure_ai_search.AzureAISearchRetriever.html) |
| [ElasticsearchRetriever](/docs/integrations/retrievers/elasticsearch_retriever) |||| [langchain_elasticsearch](https://api.python.langchain.com/en/latest/retrievers/langchain_elasticsearch.retrievers.ElasticsearchRetriever.html) |
| [MilvusCollectionHybridSearchRetriever](/docs/integrations/retrievers/milvus_hybrid_search) |||| [langchain_milvus](https://api.python.langchain.com/en/latest/retrievers/langchain_milvus.retrievers.milvus_hybrid_search.MilvusCollectionHybridSearchRetriever.html) |
| [TavilySearchAPIRetriever](/docs/integrations/retrievers/tavily) |||| [langchain_community](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.tavily_search_api.TavilySearchAPIRetriever.html) |
| [VertexAISearchRetriever](/docs/integrations/retrievers/google_vertex_ai_search) |||| [langchain_google_community](https://api.python.langchain.com/en/latest/vertex_ai_search/langchain_google_community.vertex_ai_search.VertexAISearchRetriever.html) |
The below retrievers allow you to index and search a custom corpus of documents.

| Retriever | Self-host | Cloud offering | Package |
|-----------|-----------|----------------|---------|
| [AmazonKnowledgeBasesRetriever](/docs/integrations/retrievers/bedrock) | ❌ | ✅ | [langchain_aws](https://api.python.langchain.com/en/latest/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html) |
| [AzureAISearchRetriever](/docs/integrations/retrievers/azure_ai_search) | ❌ | ✅ | [langchain_community](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.azure_ai_search.AzureAISearchRetriever.html) |
| [ElasticsearchRetriever](/docs/integrations/retrievers/elasticsearch_retriever) | ✅ | ✅ | [langchain_elasticsearch](https://api.python.langchain.com/en/latest/retrievers/langchain_elasticsearch.retrievers.ElasticsearchRetriever.html) |
| [MilvusCollectionHybridSearchRetriever](/docs/integrations/retrievers/milvus_hybrid_search) | ✅ | ❌ | [langchain_milvus](https://api.python.langchain.com/en/latest/retrievers/langchain_milvus.retrievers.milvus_hybrid_search.MilvusCollectionHybridSearchRetriever.html) |
| [VertexAISearchRetriever](/docs/integrations/retrievers/google_vertex_ai_search) | ❌ | ✅ | [langchain_google_community](https://api.python.langchain.com/en/latest/vertex_ai_search/langchain_google_community.vertex_ai_search.VertexAISearchRetriever.html) |

## External index

The below retrievers will search over an external index (e.g., constructed from Internet data or similar).

| Retriever | Source | Package |
|-----------|--------|---------|
| [ArxivRetriever](/docs/integrations/retrievers/arxiv) | Scholarly articles on [arxiv.org](https://arxiv.org/) | [langchain_community](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.arxiv.ArxivRetriever.html) |
| [TavilySearchAPIRetriever](/docs/integrations/retrievers/tavily) | Internet search | [langchain_community](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.tavily_search_api.TavilySearchAPIRetriever.html) |
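
Reviewer note on the "Bring-your-own documents" category introduced in `index.mdx`: the sketch below is a dependency-free illustration of a retriever that searches a caller-supplied corpus. It does not use LangChain. `Document`, `KeywordRetriever`, and its `invoke` method are hypothetical stand-ins that only mirror the general shape of a retriever (a query string in, a list of documents out); a real integration would subclass `BaseRetriever` as described in the custom retriever how-to linked in the hunk.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a document: text plus optional metadata."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class KeywordRetriever:
    """Toy retriever over a caller-supplied corpus (not a LangChain class)."""

    def __init__(self, docs, k=2):
        self.docs = list(docs)
        self.k = k

    def invoke(self, query):
        """Return up to k documents that share at least one word with the query."""
        terms = set(query.lower().split())
        scored = []
        for doc in self.docs:
            overlap = len(terms & set(doc.page_content.lower().split()))
            if overlap:
                scored.append((overlap, doc))
        # The sort is stable, so equally scored documents keep corpus order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[: self.k]]


corpus = [
    Document("retrievers return documents for a query"),
    Document("vector stores can be cast to retrievers"),
    Document("unrelated note about cooking"),
]
retriever = KeywordRetriever(corpus, k=2)
results = retriever.invoke("vector stores")
```

An "External index" retriever would expose the same query-in, documents-out shape, but the corpus would live behind a third-party service rather than being passed in by the caller.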
6 changes: 3 additions & 3 deletions docs/docs/integrations/retrievers/milvus_hybrid_search.ipynb
@@ -25,9 +25,9 @@
"\n",
"### Integration details\n",
"\n",
"| Retriever | Bring your own docs | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[MilvusCollectionHybridSearchRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_milvus.retrievers.milvus_hybrid_search.MilvusCollectionHybridSearchRetriever.html) | ✅ | ❌ | ✅ | langchain_milvus |\n",
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"[MilvusCollectionHybridSearchRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_milvus.retrievers.milvus_hybrid_search.MilvusCollectionHybridSearchRetriever.html) | ✅ | ❌ | langchain_milvus |\n",
"\n",
"\n",
"\n",
6 changes: 3 additions & 3 deletions docs/docs/integrations/retrievers/tavily.ipynb
@@ -22,9 +22,9 @@
"\n",
"### Integration details\n",
"\n",
"| Retriever | Bring your own docs | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[TavilySearchAPIRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.tavily_search_api.TavilySearchAPIRetriever.html) | ❌ | ❌ | ❌ | langchain_community.retrievers |\n",
"| Retriever | Source | Package |\n",
"| :--- | :--- | :---: |\n",
"[TavilySearchAPIRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_community.retrievers.tavily_search_api.TavilySearchAPIRetriever.html) | Internet search | langchain_community |\n",
"\n",
"## Setup"
]
45 changes: 38 additions & 7 deletions libs/cli/langchain_cli/integration_template/docs/retrievers.ipynb
@@ -24,10 +24,19 @@
"\n",
"### Integration details\n",
"\n",
"| Retriever | Bring your own docs | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[__ModuleName__Retriever](https://api.python.langchain.com/en/latest/retrievers/__package_name__.retrievers.__module_name__.__ModuleName__Retriever.html) | ❌ | ❌ | ❌ | __package_name__ |\n",
"TODO: Select one of the tables below, as appropriate.\n",
"\n",
"1: Bring-your-own data (i.e., index and search a custom corpus of documents):\n",
"\n",
"| Retriever | Self-host | Cloud offering | Package |\n",
"| :--- | :--- | :---: | :---: |\n",
"[__ModuleName__Retriever](https://api.python.langchain.com/en/latest/retrievers/__package_name__.retrievers.__module_name__.__ModuleName__Retriever.html) | ❌ | ❌ | __package_name__ |\n",
"\n",
"2: External index (e.g., constructed from Internet data or similar):\n",
"\n",
"| Retriever | Source | Package |\n",
"| :--- | :--- | :---: |\n",
"[__ModuleName__Retriever](https://api.python.langchain.com/en/latest/retrievers/__package_name__.retrievers.__module_name__.__ModuleName__Retriever.html) | Source description | __package_name__ |\n",
"\n",
"## Setup\n",
"\n",
Expand Down Expand Up @@ -124,7 +133,32 @@
"id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e",
"metadata": {},
"source": [
"## Use within a chain"
"## Use within a chain\n",
"\n",
"Like other retrievers, __ModuleName__Retriever can be incorporated into LLM applications via [chains](/docs/how_to/sequence/).\n",
"\n",
"We will need a LLM or chat model:\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs customVarName=\"llm\" />\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "25b647a3-f8f2-4541-a289-7a241e43f9df",
"metadata": {},
"outputs": [],
"source": [
"# | output: false\n",
"# | echo: false\n",
"\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
]
},
{
@@ -137,7 +171,6 @@
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\n",
" \"\"\"Answer the question based only on the context provided.\n",
@@ -147,8 +180,6 @@
"Question: {question}\"\"\"\n",
")\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\")\n",
"\n",
"\n",
"def format_docs(docs):\n",
" return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
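
Reviewer note on the template's chain cell: the `format_docs` helper shown in the last hunk joins each document's `page_content` with a blank line before the text is placed into the prompt's context. A minimal sketch of that behavior follows, assuming a hypothetical `Doc` dataclass in place of LangChain's document class so the snippet runs without any dependencies:

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in exposing only the attribute format_docs reads."""

    page_content: str


def format_docs(docs):
    # Same body as the template helper: one blank line between documents.
    return "\n\n".join(doc.page_content for doc in docs)


context = format_docs([Doc("First passage."), Doc("Second passage.")])
print(context)
```

An empty list of documents yields an empty string, so a chain built this way would pass an empty context to the prompt when the retriever finds nothing.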
