
Add evaluation root project #501

Merged · 3 commits · May 18, 2024

Conversation

@kbeaugrand (Contributor) commented May 18, 2024

Implementing Evaluation Based on the RAGAS Framework

Description

This pull request marks the beginning of our implementation of evaluation metrics for our Retrieval Augmented Generation (RAG) pipelines using the RAGAS framework.

Background

RAGAS (RAG Assessment) is a comprehensive framework designed to evaluate RAG pipelines. RAG pipelines utilize external data to enhance the context provided to Large Language Models (LLMs). While building these pipelines is facilitated by existing tools, evaluating their performance quantitatively remains a challenge. RAGAS addresses this gap by offering tools based on cutting-edge research to evaluate LLM-generated text and provide valuable insights into the effectiveness of RAG pipelines.

Features to be Implemented

The implementation will leverage Kernel Memory to deliver the following evaluation features:

  • Faithfulness: Ensuring the generated text accurately represents the source information.
  • Answer Relevancy: Assessing the pertinence of the answer in relation to the query.
  • Context Recall: Measuring the proportion of relevant context retrieved.
  • Context Precision: Evaluating the accuracy of the retrieved context.
  • Context Relevancy: Determining the relevance of the provided context to the query.
  • Context Entity Recall: Checking the retrieval of key entities within the context.
  • Answer Semantic Similarity: Comparing the semantic similarity between the generated answer and the expected answer.
  • Answer Correctness: Verifying the factual correctness of the generated answers.
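As a rough illustration of the Answer Semantic Similarity metric above, the score can be derived from the cosine similarity between an embedding of the generated answer and an embedding of the expected (ground-truth) answer. The sketch below is in Python with plain lists standing in for real embedding vectors; the function names and the rescaling to a [0, 1] score are illustrative assumptions, not the RAGAS or Kernel Memory implementation.

```python
from math import sqrt


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def answer_semantic_similarity(answer_emb: list[float],
                               reference_emb: list[float]) -> float:
    # Map cosine similarity from [-1, 1] to a [0, 1] score
    # (a hypothetical normalization for reporting purposes).
    return (cosine_similarity(answer_emb, reference_emb) + 1) / 2
```

In practice the vectors would come from the same embedding model used by the RAG pipeline, so that answer and reference live in the same vector space.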

Integration

RAGAS will be integrated into our CI/CD pipeline to enable continuous performance monitoring and evaluation of our RAG pipelines. This integration will ensure that our RAG systems consistently meet the desired performance benchmarks.
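One common way to wire such an evaluation into CI/CD is a threshold gate: the pipeline runs the evaluation, compares each metric against a minimum, and exits non-zero (failing the job) if any metric falls short. This is a minimal sketch; the metric names, threshold values, and hard-coded scores are placeholder assumptions, since the evaluation itself is not yet implemented.

```python
import sys

# Hypothetical minimum scores a run must meet to pass the CI check.
THRESHOLDS = {"faithfulness": 0.80, "answer_relevancy": 0.75}


def check_scores(scores: dict[str, float]) -> bool:
    """Return True only if every thresholded metric meets its minimum."""
    return all(scores.get(name, 0.0) >= minimum
               for name, minimum in THRESHOLDS.items())


if __name__ == "__main__":
    # In a real pipeline these scores would come from the evaluation run.
    scores = {"faithfulness": 0.91, "answer_relevancy": 0.84}
    if not check_scores(scores):
        sys.exit(1)  # non-zero exit status marks the CI job as failed
```

The non-zero exit code is the only contract the CI system needs, so the same gate works unchanged across CI providers.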

Next Steps

  • Implement evaluation metrics: Develop the specified evaluation features using Kernel Memory.
  • Unit tests: Add tests covering the evaluation framework.
  • Integrate with CI/CD: Configure the evaluation checks to run automatically in our CI/CD pipeline.

@kbeaugrand kbeaugrand requested a review from dluc as a code owner May 18, 2024 07:34
dluc previously approved these changes May 18, 2024

@dluc (Collaborator) commented May 18, 2024

Looks like the Release build is broken, maybe something's been removed from the solution?

@kbeaugrand (Contributor, Author) commented May 18, 2024

> Looks like the Release build is broken, maybe something's been removed from the solution?

I'll take a look asap.

@dluc (Collaborator) commented May 18, 2024

> Looks like the Release build is broken, maybe something's been removed from the solution?
>
> I'll take a look asap.

No worries, I just pushed a fix.

dluc merged commit d34b750 into microsoft:main on May 18, 2024
5 of 6 checks passed