RAGCheck: RAG System Evaluation Tool

RAGCheck is a tool designed for evaluating and analyzing Retrieval Augmented Generation (RAG) systems. It provides automated testing, evaluation metrics, and visualization capabilities to help developers understand and improve their RAG implementations.

Core Features

  • 🔬 Automated Testing: Generate test questions and answers from a document corpus
  • 🤖 Multiple LLM Support: Compatible with GPT-4o-mini, Ministral-3B, Gemini 1.5, and more
  • 📊 Interactive Dashboard: Web-based visualization of evaluation results
  • ⚡ Batch Processing: Efficient concurrent processing of test cases (see the sketch after this list)
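
A minimal sketch of what the concurrent batch step could look like, using asyncio with a semaphore to cap simultaneous LLM calls. The function names `query_rag` and `evaluate_answer` and the concurrency limit are placeholders for this sketch, not RAGCheck's actual API.

```python
# Sketch only: query_rag and evaluate_answer are stand-ins for the real pipeline.
import asyncio

async def query_rag(question: str) -> str:
    # Placeholder: would retrieve context and generate an answer with an LLM.
    return f"answer to: {question}"

async def evaluate_answer(question: str, answer: str, reference: str) -> int:
    # Placeholder: would ask a judge LLM for a binary score (1 = correct).
    return int(answer.strip() == reference.strip())

async def run_batch(test_cases: list[tuple[str, str]], max_concurrency: int = 8) -> list[int]:
    semaphore = asyncio.Semaphore(max_concurrency)  # cap concurrent LLM calls

    async def process(question: str, reference: str) -> int:
        async with semaphore:
            answer = await query_rag(question)
            return await evaluate_answer(question, answer, reference)

    return await asyncio.gather(*(process(q, r) for q, r in test_cases))
```

Calling `asyncio.run(run_batch(cases))` would then yield one binary score per test case.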

Technical Details

  • 📁 Document Support: Flexible input handling for multiple document formats
  • 📈 Evaluation Metrics: Binary scoring system with detailed explanations (see the result-record sketch after this list)
  • 🔄 Streamlined Workflow: From document ingestion to results visualization (an end-to-end sketch follows the project structure below)
  • 🛠️ Python-Based: Built with Python 3.11
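
To illustrate the binary scoring output, the record below shows the kind of fields a single result might carry. The schema is assumed for this sketch and is not RAGCheck's actual data model.

```python
from dataclasses import dataclass

@dataclass
class EvaluationResult:
    question: str
    generated_answer: str
    reference_answer: str
    score: int        # binary: 1 = judged correct, 0 = judged incorrect
    explanation: str  # short rationale accompanying the score

def accuracy(results: list[EvaluationResult]) -> float:
    """Fraction of test cases judged correct."""
    return sum(r.score for r in results) / len(results) if results else 0.0
```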

Project Structure

ragcheck/
├── data/                   # Data storage
│   ├── documents/          # Input documents
│   ├── index_storage/      # Vector store indices
│   └── test.csv            # Generated test cases
├── src/                    # Source code
│   ├── evaluation/         # Evaluation system
│   ├── llms/               # LLM providers
│   ├── rag/                # RAG implementation
│   ├── scripts/            # Utility scripts
│   ├── utils/              # Helper functions
│   └── visualization/      # Results dashboard
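
To show how the directories above relate, here is a hypothetical end-to-end flow from ingestion to a summary number. Every function is a stub standing in for the corresponding module, not RAGCheck's real interface.

```python
from pathlib import Path

def ingest_documents(doc_dir: Path) -> list[str]:
    # data/documents/ -> raw text (placeholder: plain-text files only)
    return [p.read_text() for p in doc_dir.glob("*.txt")]

def build_index(documents: list[str]) -> dict[int, str]:
    # data/index_storage/: would build and persist a vector index
    return dict(enumerate(documents))

def generate_test_cases(index: dict[int, str]) -> list[tuple[str, str]]:
    # data/test.csv: would use an LLM to write question/answer pairs
    return [(f"What does document {i} say?", text) for i, text in index.items()]

def evaluate(test_cases: list[tuple[str, str]]) -> list[int]:
    # src/evaluation/: would score each generated answer (binary, placeholder)
    return [1 for _ in test_cases]

def main() -> None:
    docs = ingest_documents(Path("data/documents"))
    cases = generate_test_cases(build_index(docs))
    scores = evaluate(cases)
    print(f"correct: {sum(scores)}/{len(scores)}")  # src/visualization/ would chart this

if __name__ == "__main__":
    main()
```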