metadata
title: Mistral
emoji: ⚡
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 4.36.1
app_file: app.py
pinned: false
license: apache-2.0
AI-powered Web Search and PDF Chat Assistant
This project combines large language models with web search and PDF document analysis in a single chat assistant. Users can ask questions about their uploaded PDF documents or use web search to answer general queries.
Features
- PDF Document Chat: Upload and interact with multiple PDF documents.
- Web Search Integration: Option to use web search for answering queries.
- Multiple AI Models: Choose from a selection of powerful language models.
- Customizable Responses: Adjust the temperature and the number of API calls to tune the outputs.
- User-friendly Interface: Built with Gradio for an intuitive chat experience.
- Document Selection: Choose which uploaded documents to include in your queries.
How It Works
Document Processing:
- Uploaded PDF documents are parsed with either PyPDF or LlamaParse.
- Documents are processed and stored in a FAISS vector database for efficient retrieval.
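Parsed text is usually split into smaller pieces before it is embedded and indexed. The sketch below shows one common way to do that with overlapping character windows. It is an illustration only: the helper name `split_into_chunks` and the `chunk_size` and `overlap` values are assumptions, and this README does not state how the project actually splits documents.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Hypothetical helper for illustration; not taken from the project code.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "PDF text extracted by the parser. " * 20
    for i, chunk in enumerate(split_into_chunks(sample), start=1):
        print(i, len(chunk))
```

The overlap keeps a sentence that straddles a window boundary visible in both neighbouring chunks, at the cost of indexing some text twice.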
Embedding:
- Uses HuggingFace embeddings (default: 'sentence-transformers/all-mpnet-base-v2') to index documents and to match queries against them.
Query Processing:
- For PDF queries, relevant document sections are retrieved from the FAISS database.
- For web searches, results are fetched using the DuckDuckGo search API.
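The retrieval step for PDF queries can be pictured as a nearest-neighbour search over embedding vectors. The sketch below is a pure-Python stand-in, not the FAISS API: the vectors are hand-written three-dimensional toys rather than real embeddings, cosine similarity is an assumed metric (the metric used by the project's FAISS index is not stated here), and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k texts whose vectors are most similar to query_vec.

    `indexed` is a list of (text, vector) pairs standing in for a vector store.
    """
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy vectors chosen by hand; a real system would get them from an embedding model.
    index = [
        ("Invoices are due in 30 days.", [0.9, 0.1, 0.0]),
        ("The warranty lasts two years.", [0.1, 0.9, 0.0]),
        ("Support is available by email.", [0.0, 0.2, 0.9]),
    ]
    print(top_k([0.8, 0.2, 0.1], index, k=2))
```

A library such as FAISS does the same ranking with optimized index structures, which matters once the number of chunks grows beyond what a linear scan can handle.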
Response Generation:
- Queries are processed using the selected AI model (options include Mistral, Mixtral, and others).
- Responses are generated based on the retrieved context (from PDFs or web search).
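Generating a response from retrieved context typically means placing that context in the prompt ahead of the question. The sketch below shows one possible prompt layout. The function name `build_prompt`, the instruction wording, and the `max_context_chars` limit are all assumptions made for illustration; the project's actual prompt template is not shown in this README.

```python
def build_prompt(question, context_chunks, max_context_chars=2000):
    """Assemble a prompt from a question and retrieved context snippets.

    Hypothetical helper: snippets are numbered and added in order until the
    character budget would be exceeded.
    """
    parts = []
    used = 0
    for i, chunk in enumerate(context_chunks, start=1):
        entry = f"[{i}] {chunk.strip()}"
        if used + len(entry) > max_context_chars:
            break  # keep the prompt within the assumed budget
        parts.append(entry)
        used += len(entry)

    context = "\n".join(parts) if parts else "(no context retrieved)"
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


if __name__ == "__main__":
    snippets = ["The warranty lasts two years.", "Support is available by email."]
    print(build_prompt("How long is the warranty?", snippets))
```

The same function works for both modes: the snippets can come from the PDF index or from web search results.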
User Interaction:
- Users can chat with the AI, asking questions about uploaded documents or general queries.
- The interface allows for adjusting model parameters and switching between PDF and web search modes.
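To see why the temperature setting changes the character of responses, it helps to look at how temperature reshapes a probability distribution over candidate tokens. The sketch below uses the standard temperature-scaled softmax; the logits are made-up numbers, and the function name `softmax_with_temperature` is hypothetical. How each hosted model applies the setting internally is not covered by this README.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]  # illustrative scores for three candidate tokens
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, giving more deterministic answers; higher temperatures flatten the distribution and make the output more varied.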
Setup and Usage
- Install the required dependencies (list of dependencies to be added).
- Set up the necessary API keys and tokens in your environment variables.
- Run the main script to launch the Gradio interface.
- Upload PDF documents using the file input at the top of the interface.
- Select documents to query using the checkboxes.
- Toggle between PDF chat and web search modes as needed.
- Adjust the temperature and the number of API calls to tune responses.
- Start chatting and asking questions!
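Before launching the interface, it can be useful to confirm that the required API keys and tokens are actually set. The sketch below checks a list of environment variables and reports any that are missing. The variable names `HF_TOKEN` and `LLAMA_CLOUD_API_KEY` are assumptions; this README does not list the names the project expects, so substitute the ones used in app.py.

```python
import os

# Assumed names for illustration; check app.py for the variables it reads.
REQUIRED_VARS = ("HF_TOKEN", "LLAMA_CLOUD_API_KEY")


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All expected environment variables are set.")
```

Running this check first gives a clear message up front instead of an authentication error in the middle of a chat session.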
Models
The project supports multiple AI models, including:
- mistralai/Mistral-7B-Instruct-v0.3
- mistralai/Mixtral-8x7B-Instruct-v0.1
- meta/llama-3.1-8b-instruct
- mistralai/Mistral-Nemo-Instruct-2407
Future Improvements
- Integration of more embedding models for improved performance.
- Enhanced PDF parsing capabilities.
- Support for additional file formats beyond PDF.
- Improved caching for faster response times.
Contribution
Contributions to this project are welcome! Please feel free to submit issues or pull requests on the project's GitHub repository.