
RAG Implementation

Retrieval-Augmented Generation

Overview

The RAG system combines semantic search capabilities with a language model to provide accurate and relevant information about travel destinations and establishments based on user reviews. The system uses vector embeddings to find semantically similar reviews and then uses a language model to generate natural language responses.

Components

1. Embedding Model

The system uses intfloat/multilingual-e5-base, a compact and efficient base model, for generating embeddings. This model is particularly well suited to multilingual text and provides high-quality semantic representations.

Key features:

  • Supports multiple languages
  • Efficient base model architecture
  • Good performance on semantic similarity tasks
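At its core, retrieval ranks stored reviews by the similarity of their embeddings to the query embedding. The sketch below illustrates that scoring with toy 3-dimensional vectors standing in for real model output (note that E5-family models expect inputs prefixed with "query: " or "passage: "):

```python
import math

# Toy vectors standing in for embeddings from intfloat/multilingual-e5-base
# (real vectors are much higher-dimensional). E5 models expect inputs
# prefixed with "query: " or "passage: " before encoding.
def cosine_similarity(a, b):
    """Score semantic closeness of two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [0.9, 0.1, 0.3]
review_vecs = {
    "review_1": [0.8, 0.2, 0.4],  # semantically close to the query
    "review_2": [0.1, 0.9, 0.0],  # unrelated
}

# Rank reviews by similarity to the query, most similar first.
ranked = sorted(
    review_vecs,
    key=lambda r: cosine_similarity(query_vec, review_vecs[r]),
    reverse=True,
)
```

In the real system Qdrant performs this ranking internally over the stored vectors; the snippet only shows the principle.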

2. Vector Database

The system uses Qdrant as the vector database to store and retrieve embeddings. Each review is stored with the following metadata:

  • Review text (vectorized)
  • Name
  • Rubrics
  • Rating (1-5)
  • Address
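A stored point pairs the review's embedding vector with a metadata payload. The dictionary below is a hypothetical shape for one such point; the exact field names in the real collection may differ:

```python
# Hypothetical shape of a single stored point for one review; the actual
# payload keys in the production Qdrant collection may differ.
point = {
    "id": 42,
    "vector": [0.12, -0.08, 0.54],  # embedding of the review text (truncated)
    "payload": {
        "name": "Sushi Bar",
        "rubrics": ["Japanese cuisine", "restaurant"],
        "rating": 5,  # 1-5 scale
        "address": "Moscow, Arbat St, 10",
        "text": "Great rolls and fast service.",
    },
}
```

Keeping the metadata in the payload lets Qdrant apply rating, address, and rubric filters alongside the vector search.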

3. Search Tool

The TravelReviewQueryTool implements the core retrieval functionality with the following features:

  • Semantic search using embeddings
  • Filtering capabilities:
      • Minimum rating filter
      • Address/location filter
      • Category/rubric filter
  • Configurable retrieval limit (default: 5 results)
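The filtering logic can be sketched as a simple pass over candidate hits. This is only an illustration of the filters listed above, not the actual TravelReviewQueryTool internals:

```python
# Hedged sketch of the filtering the tool describes; the actual
# TravelReviewQueryTool internals may differ (e.g. Qdrant can apply
# these filters server-side during the vector search).
def apply_filters(hits, min_rating=None, address=None, rubric=None, limit=5):
    """Keep hits that pass every supplied filter, up to `limit` results."""
    out = []
    for hit in hits:
        if min_rating is not None and hit["rating"] < min_rating:
            continue
        if address is not None and address.lower() not in hit["address"].lower():
            continue
        if rubric is not None and rubric not in hit["rubrics"]:
            continue
        out.append(hit)
        if len(out) == limit:
            break
    return out

hits = [
    {"name": "A", "rating": 5, "address": "Moscow, Arbat St", "rubrics": ["restaurant"]},
    {"name": "B", "rating": 3, "address": "Moscow, Tverskaya St", "rubrics": ["cafe"]},
]
filtered = apply_filters(hits, min_rating=4, address="moscow")
```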

4. Language Model Integration

The system supports multiple language models through different frameworks, each using the ReAct agent pattern:

  1. SmolAgents
  2. LangChain
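SmolAgents and LangChain each supply their own ReAct implementation, but the pattern itself is framework-agnostic: the model alternates between reasoning, calling a tool, and observing the result. The stub below, with a fake model and a fake search tool, is purely illustrative:

```python
# Framework-agnostic sketch of a ReAct-style loop. `fake_llm` and
# `search` are stubs standing in for the real language model and the
# real review-retrieval tool.
def fake_llm(prompt):
    # A real model would emit a thought plus either an action or an answer.
    if "Observation:" not in prompt:
        return 'Action: search("japanese restaurant in Moscow")'
    return "Answer: Based on reviews, Sushi Bar on Arbat is well rated."

def search(query):
    return "Sushi Bar, rating 5, Moscow, Arbat St"

def react(question, max_steps=3):
    prompt = f"Question: {question}"
    for _ in range(max_steps):
        step = fake_llm(prompt)
        if step.startswith("Answer:"):
            return step.removeprefix("Answer:").strip()
        # Execute the requested tool call and feed the observation back in.
        query = step.split('"')[1]
        prompt += f"\n{step}\nObservation: {search(query)}"
    return "No answer within step budget."

answer = react("Recommend a good Japanese restaurant in Moscow")
```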

Usage

Basic Query Example

# Initialize the search tool
review_search_tool = TravelReviewQueryTool(
    embed_model_name="intfloat/multilingual-e5-base",
    qdrant_client=qdrant_client,
    collection_name="moskva_intfloat_multilingual_e5_base"
)

# Example query ("recommend a good Japanese restaurant in Moscow")
results = review_search_tool.forward(
    query="посоветуй хороший японский ресторан в Москве",
    min_rating=4
)

Available Filters

  1. Query: Natural language query for semantic search
  2. Min Rating: Filter establishments by minimum rating (1-5)
  3. Address: Filter by location (city or street)
  4. Rubrics: Filter by establishment categories

Evaluation

The system includes evaluation capabilities to measure performance:

  1. Relevancy Evaluation
      • Uses a Gemini model for evaluation
      • Measures if retrieved results are relevant to the query
      • Provides relevancy scores and explanations

  2. Hallucination Detection
      • Evaluates if generated responses contain factual information
      • Uses predefined templates for consistency
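An LLM-as-judge evaluation of this kind typically sends the judge a templated prompt and parses a score out of its reply. The template and reply format below are illustrative assumptions, not the system's actual Gemini prompt:

```python
# Illustrative relevancy-evaluation template; the system's actual
# Gemini prompt and reply format may differ.
RELEVANCY_TEMPLATE = (
    "Query: {query}\n"
    "Retrieved review: {review}\n"
    "On a scale from 0 to 1, how relevant is the review to the query? "
    "Reply as 'score: <number>, reason: <explanation>'."
)

def parse_judge_reply(reply):
    """Extract the numeric score and the explanation from the judge's reply."""
    score_part, reason = reply.split(", reason: ", 1)
    return float(score_part.removeprefix("score: ")), reason

score, reason = parse_judge_reply(
    "score: 0.9, reason: review matches the cuisine and the city"
)
```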

Performance Metrics

The system has been evaluated with the following metrics:

  • Mean relevancy score: ~0.93
  • High consistency in providing relevant results
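The reported figure is simply the mean of per-query judge scores; the scores below are made up for illustration:

```python
from statistics import mean

# Aggregating per-query relevancy scores into a mean relevancy figure.
# These scores are fabricated for illustration only.
scores = [1.0, 0.9, 0.95, 0.88, 0.92]
mean_relevancy = mean(scores)
```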

Best Practices

  1. Query Formulation
      • Use natural language queries
      • Be specific about requirements
      • Include location when relevant

  2. Response Generation
      • Don't repeat entire review text
      • Summarize key points
      • Focus on relevant information

  3. Error Handling
      • Handle cases when no results are found
      • Provide helpful feedback to users
      • Use appropriate fallback strategies
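The error-handling practices can be sketched as a small fallback path for empty result sets. This is an illustrative helper, not the production tool's behaviour:

```python
# Hedged sketch of an empty-result fallback; the production system's
# actual messages and fallback strategy may differ.
def answer_from_results(results, query):
    """Return a summary of the top hit, or helpful guidance if nothing matched."""
    if not results:
        return (
            f"No reviews matched '{query}'. "
            "Try relaxing the minimum rating or broadening the location."
        )
    top = results[0]
    return f"Top match: {top['name']} ({top['rating']}/5) at {top['address']}."

msg = answer_from_results([], "vegan sushi in Norilsk")
```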

Future Improvements

  1. Model Enhancements
      • Experiment with different embedding models
      • Fine-tune models on travel-specific data
      • Implement hybrid search strategies

  2. Feature Additions
      • Add more sophisticated filtering options
      • Implement result ranking improvements
      • Add support for more languages

  3. Evaluation
      • Expand evaluation metrics
      • Add more comprehensive testing
      • Implement automated quality checks