---
title: BlogpostQA Retrieval Bot
emoji: 
colorFrom: purple
colorTo: pink
sdk: streamlit
sdk_version: 1.38.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# 🤖 Chatbot with URL-based Document Retrieval

This Streamlit app provides a chatbot interface for querying the content of a blog post given its URL. It embeds the post's text with `SentenceTransformer`, retrieves the passages relevant to the user's question, and summarizes them with the `facebook/bart-large-cnn` model.

## Features

- **URL Input**: Allows users to input the URL of a blog post.
- **API Key Input**: Users can either use pre-provided API keys or enter their own.
- **Chat Interface**: Users can ask questions related to the content of the blog post.
- **Dynamic Styling**: Includes colorful, animated backgrounds and a stylish sidebar.
- **Response Generation**: Uses models to retrieve and summarize content related to the query.

## Installation

To run this app, ensure you have Python installed, then install the necessary packages using the provided `requirements.txt` file.

1. **Clone the repository**:
    ```bash
    git clone https://github.com/mldatascientist23/Generative_AI_Projects.git
    ```

2. **Navigate to the project directory**:
    ```bash
    cd Generative_AI_Projects
    ```

3. **Create and activate a virtual environment** (optional but recommended):
    ```bash
    python -m venv env
    source env/bin/activate  # On Windows use: .\env\Scripts\activate
    ```

4. **Install the required packages**:
    ```bash
    pip install -r requirements.txt
    ```

## Usage

1. **Run the Streamlit app**:
    ```bash
    streamlit run app.py
    ```

2. **Access the app**:
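    When run locally, Streamlit prints the app's URL in the terminal (by default `http://localhost:8501`); open it in a web browser to interact with the chatbot. A hosted version is available at `https://huggingface.co/spaces/datascientist22/blogpostQA-retrieval-bot`.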

### Sidebar Configuration

- **Enter Blog Post URL**: Input the URL of the blog post you want to retrieve data from.
- **Use pre-provided API keys**: Check this box if you want to use pre-provided API keys. If unchecked, enter your own API keys.
- **API Key Fields**: Enter your LangChain and Groq API keys if not using pre-provided keys.

### Main Interface

- **Ask a question based on the blog post**: Type your question in the input field and click "Submit Query" to get a response based on the content of the blog post.

## Error Handling

- **URL Validation**: Ensure that the URL is valid before submitting.
- **API Key Submission**: Both API keys must be provided unless using pre-provided keys.
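
The two checks above can be sketched with the standard library alone. This is a minimal illustration, not the code in `app.py`: the function names `is_valid_url` and `validate_inputs`, their parameters, and the error messages are all hypothetical.

```python
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Return True for an absolute http(s) URL that includes a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_inputs(url, use_default_keys, langchain_key="", groq_key=""):
    """Collect error messages; an empty list means the inputs pass.

    Hypothetical helper: mirrors the rules listed above, namely that the
    URL must be valid and both keys are required unless the pre-provided
    keys are used.
    """
    errors = []
    if not is_valid_url(url):
        errors.append("Please enter a valid blog post URL.")
    if not use_default_keys and not (langchain_key.strip() and groq_key.strip()):
        errors.append("Please provide both the LangChain and Groq API keys.")
    return errors
```

In a Streamlit app, each returned message could be shown with `st.error(...)` before any retrieval work starts, so a bad URL or missing key fails fast.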

## Notes

- Ensure that the `requirements.txt` file includes the necessary libraries, such as `streamlit`, `langchain`, `langchain_chroma`, `langchain_community`, `langchain_core`, `langchain_text_splitters`, `sentence_transformers`, `torch`, and `transformers`.
- The app uses the `facebook/bart-large-cnn` model for summarization, and `SentenceTransformer` for embedding.
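
To make the retrieval step concrete, here is a small dependency-free sketch of the general idea: split the text into chunks, embed the chunks and the query, and keep the chunks most similar to the query. It is an illustration under stated assumptions, not the app's implementation. `toy_embed` is a bag-of-words stand-in for a `SentenceTransformer` model, all function names and default parameters are hypothetical, and the summarization step is left out.

```python
import math
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    stop = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


def toy_embed(text):
    """Stand-in embedder: word counts (NOT the model the app uses)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, chunks, embed=toy_embed, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = ["cats purr when happy", "streamlit builds web apps", "dogs bark loudly"]
best = top_chunks("what do cats do", chunks, k=1)
```

Passing a different `embed` callable is the intended extension point: a dense embedding model would return vectors rather than word counts, in which case `cosine` would need to be replaced with a dense-vector version.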

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Author

- **Created by [Engr. Hamesh Raj](https://www.linkedin.com/in/datascientisthameshraj/)**

Feel free to open an issue or submit a pull request if you have any questions or suggestions!