
Workflow

The system processes each user query through the following stages to produce accurate, secure, and context-aware responses:

1. Input Query

  • The user submits a query, which may be a general question, an ambiguous statement, or a potentially malicious input.

2. Detection Module

  • Purpose: Classify the query as "bad" or "good."
  • Steps:
    1. Run the query through a sentiment analysis model (distilbert-base-uncased-finetuned-sst-2-english) to flag malicious or inappropriate intent.
    2. If the query is classified as "bad" (e.g., an SQL injection attempt or an inappropriate tone), block further processing and return a warning message.
    3. If the query is classified as "good," pass it to the Transformation Module.
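The detection step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `load_sentiment_classifier`, `detect_query`, and `WARNING_MESSAGE` are hypothetical, and treating the model's NEGATIVE label as "bad" is an assumption. The classifier is passed in as a callable so the gating logic can be exercised without downloading the model.

```python
from typing import Callable, Dict, List

# Hypothetical warning text; the README does not specify the real message.
WARNING_MESSAGE = "Query blocked: potentially malicious or inappropriate content."


def load_sentiment_classifier() -> Callable[[str], List[Dict]]:
    """Load the sentiment model named in the README (downloaded on first use)."""
    from transformers import pipeline  # lazy import; requires `transformers`

    return pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )


def detect_query(query: str, classify: Callable[[str], List[Dict]]) -> Dict:
    """Classify a query as "good" or "bad".

    Assumption: a NEGATIVE sentiment label is treated as "bad".
    """
    result = classify(query)[0]  # e.g. {"label": "NEGATIVE", "score": 0.98}
    verdict = "bad" if result["label"] == "NEGATIVE" else "good"
    response = {"query": query, "verdict": verdict, "score": result["score"]}
    if verdict == "bad":
        # Block further processing and attach the warning for the user.
        response["message"] = WARNING_MESSAGE
    return response
```

With the real model, a call would look like `detect_query("How to improve acting skills?", load_sentiment_classifier())`. Only queries whose verdict is "good" would continue to the Transformation Module.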

3. Transformation Module

  • Purpose: Rephrase or enhance ambiguous or poorly structured queries for better retrieval.
  • Steps:
    1. Identify missing context or ambiguous phrasing.
    2. Transform the query using:
      • Rule-based transformations for simple fixes.
      • Text-to-text models (e.g., google/flan-t5-small) for more sophisticated rephrasing.
    3. Pass the transformed query to the RAG Pipeline.
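One way the two transformation paths could fit together is sketched below. The specific rules (whitespace cleanup, capitalization, end punctuation), the prompt wording, and the function names are all illustrative assumptions; the README does not say which rules or prompts the project uses. The model-based rephraser assumes a `transformers` version that provides the `text2text-generation` pipeline.

```python
import re
from typing import Callable, Optional

# Question words used by the illustrative punctuation rule below.
_QUESTION_START = re.compile(
    r"^(how|what|why|when|where|who|which|can|does|is|are)\b", re.IGNORECASE
)


def rule_based_transform(query: str) -> str:
    """Apply simple fixes: collapse whitespace, capitalize, add end punctuation."""
    cleaned = re.sub(r"\s+", " ", query).strip()
    if not cleaned:
        return cleaned
    cleaned = cleaned[0].upper() + cleaned[1:]
    if cleaned[-1] not in ".?!":
        cleaned += "?" if _QUESTION_START.match(cleaned) else "."
    return cleaned


def load_rephraser() -> Callable[[str], str]:
    """Wrap google/flan-t5-small as a str -> str rephrasing function."""
    from transformers import pipeline  # lazy import; requires `transformers`

    generator = pipeline("text2text-generation", model="google/flan-t5-small")

    def rephrase(query: str) -> str:
        # Hypothetical prompt wording, chosen for illustration only.
        prompt = f"Rephrase this query to be clear and specific: {query}"
        return generator(prompt, max_new_tokens=64)[0]["generated_text"]

    return rephrase


def transform_query(query: str, rephrase: Optional[Callable[[str], str]] = None) -> str:
    """Apply rule-based fixes first, then the optional model-based rephrasing."""
    fixed = rule_based_transform(query)
    return rephrase(fixed) if rephrase is not None else fixed
```

Calling `transform_query(query)` applies only the rule-based fixes; passing `rephrase=load_rephraser()` adds the text-to-text model on top.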

4. RAG Pipeline

  • Purpose: Retrieve relevant data and generate a context-aware response.
  • Steps:
    1. Document Retrieval:
      • Encode the transformed query and documents into embeddings using all-MiniLM-L6-v2.
      • Compute semantic similarity between the query and stored documents.
      • Retrieve the top-k documents relevant to the query.
    2. Response Generation:
      • Use the retrieved documents as context.
      • Pass the query and context to a generative model (e.g., distilgpt2) to synthesize a meaningful response.
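The retrieval step might look like the sketch below. The README only says "semantic similarity"; cosine similarity is assumed here, and `load_embedder`, `retrieve_top_k`, and the default `k=3` are hypothetical choices. The embedding function is injected so the ranking logic can be tested without loading all-MiniLM-L6-v2.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def load_embedder() -> Callable[[List[str]], List[Vector]]:
    """Wrap all-MiniLM-L6-v2 as a function from texts to embedding vectors."""
    from sentence_transformers import SentenceTransformer  # lazy import

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return lambda texts: model.encode(texts).tolist()


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(
    query: str,
    documents: List[str],
    embed: Callable[[List[str]], List[Vector]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best match first."""
    vectors = embed([query] + list(documents))
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [
        (doc, cosine_similarity(query_vec, vec))
        for doc, vec in zip(documents, doc_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In practice the document embeddings would usually be computed once and stored rather than re-encoded on every query; the sketch re-encodes them only to stay short.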

5. Semantic Response Generation

  • Purpose: Provide a concise and meaningful answer.
  • Steps:
    1. Combine the retrieved documents into a coherent context.
    2. Generate a response tailored to the query using the generative model.
    3. Return the response to the user, ensuring clarity and relevance.
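As a rough sketch of these steps, the retrieved documents can be joined into a single prompt and handed to the generative model. The prompt layout, the `max_new_tokens` value, and the function names are assumptions made for illustration; the README does not describe the actual prompt format.

```python
from typing import Callable, List


def build_prompt(query: str, documents: List[str]) -> str:
    """Combine the retrieved documents and the query into one prompt string."""
    context = "\n".join(f"- {doc.strip()}" for doc in documents)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def load_generator() -> Callable[[str], str]:
    """Wrap distilgpt2 as a function from prompt to generated continuation."""
    from transformers import pipeline  # lazy import; requires `transformers`

    generator = pipeline("text-generation", model="distilgpt2")

    def generate(prompt: str) -> str:
        # return_full_text=False drops the prompt from the returned text.
        output = generator(prompt, max_new_tokens=60, return_full_text=False)
        return output[0]["generated_text"]

    return generate


def generate_response(
    query: str, documents: List[str], generate: Callable[[str], str]
) -> str:
    """Build the prompt, call the generator, and tidy the result."""
    return generate(build_prompt(query, documents)).strip()
```

With the real model, `generate_response(query, docs, load_generator())` would return the text that is sent back to the user.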

End-to-End Example

Input Query:

"How to improve acting skills?"