RAGCheck: RAG System Evaluation Tool
RAGCheck is a tool for evaluating and analyzing Retrieval-Augmented Generation (RAG) systems. It provides automated testing, evaluation metrics, and visualization to help developers understand and improve their RAG implementations.
Core Features
- Automated Testing: Generate test questions and answers from a document corpus
- Multiple LLM Support: Compatible with GPT-4o-mini, Ministral-3B, Gemini 1.5, and more
- Interactive Dashboard: Web-based visualization of evaluation results
- Batch Processing: Efficient concurrent processing of test cases
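To illustrate what concurrent batch processing of test cases can look like, here is a minimal sketch using Python's standard asyncio library. It is not RAGCheck's actual implementation: the names evaluate_case and run_batch, the max_concurrency parameter, and the shape of a test case (a dict with a "question" key) are all assumptions made for this example.

```python
import asyncio


async def evaluate_case(case: dict) -> dict:
    """Hypothetical stand-in for one RAG query plus its evaluation."""
    await asyncio.sleep(0.01)  # simulates the latency of an LLM call
    return {"question": case["question"], "answered": True}


async def run_batch(cases: list[dict], max_concurrency: int = 4) -> list[dict]:
    """Evaluate all cases concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(case: dict) -> dict:
        # The semaphore caps how many evaluations run at the same time.
        async with semaphore:
            return await evaluate_case(case)

    # gather returns results in the same order as the input cases.
    return await asyncio.gather(*(bounded(case) for case in cases))


if __name__ == "__main__":
    demo_cases = [{"question": f"Question {i}"} for i in range(10)]
    demo_results = asyncio.run(run_batch(demo_cases))
    print(f"Evaluated {len(demo_results)} test cases")
```

Capping concurrency with a semaphore is one common way to stay within LLM provider rate limits while still overlapping network waits.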
Technical Details
- Document Support: Flexible input handling for multiple document formats
- Evaluation Metrics: Binary scoring system with detailed explanations
- Streamlined Workflow: From document ingestion to results visualization
- Python-Based: Built with Python 3.11
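As a rough illustration of binary scoring with explanations, the sketch below models each evaluation as a pass/fail verdict plus a free-text reason, then aggregates verdicts into a pass rate. The Verdict class, its fields, and the helper functions are hypothetical names chosen for this example and may not match RAGCheck's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Hypothetical record for one evaluated test case."""
    question: str
    passed: bool       # binary score: True = correct, False = incorrect
    explanation: str   # the evaluator's reasoning for the score


def pass_rate(verdicts: list[Verdict]) -> float:
    """Fraction of test cases that passed; 0.0 for an empty list."""
    if not verdicts:
        return 0.0
    return sum(v.passed for v in verdicts) / len(verdicts)


def failures(verdicts: list[Verdict]) -> list[Verdict]:
    """Verdicts that failed, kept with their explanations for review."""
    return [v for v in verdicts if not v.passed]


if __name__ == "__main__":
    sample = [
        Verdict("Q1", True, "Answer matches the reference."),
        Verdict("Q2", False, "Answer omits a key fact from the reference."),
    ]
    print(f"Pass rate: {pass_rate(sample):.0%}")
    for verdict in failures(sample):
        print(f"FAILED {verdict.question}: {verdict.explanation}")
```

Keeping the explanation next to the binary score lets a dashboard show not only how many cases failed but why.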
Project Structure
ragcheck/
├── data/                  # Data storage
│   ├── documents/         # Input documents
│   ├── index_storage/     # Vector store indices
│   └── test.csv           # Generated test cases
├── src/                   # Source code
│   ├── evaluation/        # Evaluation system
│   ├── llms/              # LLM providers
│   ├── rag/               # RAG implementation
│   ├── scripts/           # Utility scripts
│   ├── utils/             # Helper functions
│   └── visualization/     # Results dashboard