Spaces: KonradSzafer — Runtime error

Commit 799264a
Parent(s): c17e4d3

added discord bot to app.py

Files changed:
- .gitattributes +2 -35
- README.md +110 -13
- app.py +41 -14
- config/.env.example +3 -0
- discord_bot/__init__.py +1 -0
- qa_engine/config.py +4 -1
.gitattributes
CHANGED
@@ -1,35 +1,2 @@
-*.7z filter=lfs diff=lfs merge=lfs -text
-*.arrow filter=lfs diff=lfs merge=lfs -text
-*.bin filter=lfs diff=lfs merge=lfs -text
-*.bz2 filter=lfs diff=lfs merge=lfs -text
-*.ckpt filter=lfs diff=lfs merge=lfs -text
-*.ftz filter=lfs diff=lfs merge=lfs -text
-*.gz filter=lfs diff=lfs merge=lfs -text
-*.h5 filter=lfs diff=lfs merge=lfs -text
-*.joblib filter=lfs diff=lfs merge=lfs -text
-*.lfs.* filter=lfs diff=lfs merge=lfs -text
-*.mlmodel filter=lfs diff=lfs merge=lfs -text
-*.model filter=lfs diff=lfs merge=lfs -text
-*.msgpack filter=lfs diff=lfs merge=lfs -text
-*.npy filter=lfs diff=lfs merge=lfs -text
-*.npz filter=lfs diff=lfs merge=lfs -text
-*.onnx filter=lfs diff=lfs merge=lfs -text
-*.ot filter=lfs diff=lfs merge=lfs -text
-*.parquet filter=lfs diff=lfs merge=lfs -text
-*.pb filter=lfs diff=lfs merge=lfs -text
-*.pickle filter=lfs diff=lfs merge=lfs -text
-*.pkl filter=lfs diff=lfs merge=lfs -text
-*.pt filter=lfs diff=lfs merge=lfs -text
-*.pth filter=lfs diff=lfs merge=lfs -text
-*.rar filter=lfs diff=lfs merge=lfs -text
-*.safetensors filter=lfs diff=lfs merge=lfs -text
-saved_model/**/* filter=lfs diff=lfs merge=lfs -text
-*.tar.* filter=lfs diff=lfs merge=lfs -text
-*.tar filter=lfs diff=lfs merge=lfs -text
-*.tflite filter=lfs diff=lfs merge=lfs -text
-*.tgz filter=lfs diff=lfs merge=lfs -text
-*.wasm filter=lfs diff=lfs merge=lfs -text
-*.xz filter=lfs diff=lfs merge=lfs -text
-*.zip filter=lfs diff=lfs merge=lfs -text
-*.zst filter=lfs diff=lfs merge=lfs -text
-*tfevents* filter=lfs diff=lfs merge=lfs -text
+# Auto detect text files and perform LF normalization
+* text=auto
README.md
CHANGED
@@ -1,13 +1,110 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
+# Hugging Face Documentation Question Answering System
+
+A multi-interface Q&A system that uses Hugging Face's LLM and Retrieval Augmented Generation (RAG) to deliver answers based on Hugging Face documentation. Operable as an API, Discord bot, or Gradio app, it also provides links to the documentation used to formulate each answer.
+
+# Example
+![Example](./assets/example.png)
+
+# Table of Contents
+- [Setting up](#setting-up)
+- [Running](#running)
+  - [Gradio](#gradio)
+  - [API](#api-serving)
+  - [Discord Bot](#discord-bot)
+- [Indexes](#indexes-list)
+- [Development instructions](#development-instructions)
+
+## Setting up
+
+To execute any of the available interfaces, specify the required parameters in the `.env` file, based on the `.env.example` located in the `config/` directory. Alternatively, you can set these as environment variables:
+
+- `QUESTION_ANSWERING_MODEL_ID` - (str) Model ID from the Hugging Face Hub, or the directory containing the model weights
+- `EMBEDDING_MODEL_ID` - (str) Embedding model ID from the Hugging Face Hub. We recommend using `hkunlp/instructor-large`
+- `INDEX_REPO_ID` - (str) Repository ID from the Hugging Face Hub where the index is stored. The most up-to-date indexes are listed in the [Indexes](#indexes-list) section
+- `PROMPT_TEMPLATE_NAME` - (str) Name of the prompt template used for question answering; templates are stored in the `config/api/prompt_templates` directory
+- `USE_DOCS_FOR_CONTEXT` - (bool) Use retrieved documents as context for a given query
+- `NUM_RELEVANT_DOCS` - (int) Number of documents retrieved for the context
+- `ADD_SOURCES_TO_RESPONSE` - (bool) Include the sources of the retrieved documents in the response
+- `USE_MESSAGES_IN_CONTEXT` - (bool) Use chat history for a conversational experience
+- `DEBUG` - (bool) Provide additional logging
+
+Install the necessary dependencies from the requirements file:
+
+```bash
+pip install -r requirements.txt
+```
+
+## Running
+
+### Gradio
+
+After completing all steps described in the [Setting up](#setting-up) section, run:
+
+```bash
+python3 app.py
+```
+
+### API Serving
+
+By default, the API is served at `http://0.0.0.0:8000`. To launch it, complete all the steps outlined in the [Setting up](#setting-up) section, then execute:
+
+```bash
+python3 -m api
+```
+
+### Discord Bot
+
+To interact with the system as a Discord bot, set the additional required environment variables from the `Discord bot` section of the `.env.example` file in the `config/` directory:
+
+- `DISCORD_TOKEN` - (str) API key for the bot application
+- `QA_SERVICE_URL` - (str) URL of the API service. We recommend using `http://0.0.0.0:8000`
+- `NUM_LAST_MESSAGES` - (int) Number of messages used as context in conversations
+- `USE_NAMES_IN_CONTEXT` - (bool) Include usernames in the conversation context
+- `ENABLE_COMMANDS` - (bool) Allow commands, e.g., channel cleanup
+- `DEBUG` - (bool) Provide additional logging
+
+After completing all steps, run:
+
+```bash
+python3 -m bot
+```
+
+<!-- ### Running in Docker
+
+To run the API and bot in a Docker container, run the following command:
+
+```bash
+./run_docker.sh
+``` -->
+
+## Indexes List
+
+The following list contains the most current indexes that can be used with the system:
+- [All Hugging Face repositories over 50 Stars - 512-Character Chunks](https://huggingface.co/datasets/KonradSzafer/index-instructor-large-512-m512-all_repos_above_50_stars)
+- [All Hugging Face repositories over 50 Stars - 812-Character Chunks](https://huggingface.co/datasets/KonradSzafer/index-instructor-large-812-m512-all_repos_above_50_stars)
+
+## Development Instructions
+
+We use `Python 3.10`.
+
+To install all necessary Python packages, run:
+
+```bash
+pip install -r requirements.txt
+```
+
+We use pipreqsnb to generate the `requirements.txt` file. To install pipreqsnb, run:
+
+```bash
+pip install pipreqsnb
+```
+
+To generate the `requirements.txt` file, run:
+
+```bash
+pipreqsnb --force .
+```
+
+To run unit tests, use:
+
+```bash
+pytest -o "testpaths=tests" --noconftest
+```
app.py
CHANGED
@@ -1,10 +1,11 @@
 import gradio as gr
 
 from qa_engine import logger, Config, QAEngine
+from discord_bot import DiscordClient
 
 
 config = Config()
-model = QAEngine(
+qa_engine = QAEngine(
     llm_model_id=config.question_answering_model_id,
     embedding_model_id=config.embedding_model_id,
     index_repo_id=config.index_repo_id,
@@ -15,19 +16,45 @@ model = QAEngine(
     debug=config.debug
 )
 
-with gr.Blocks() as demo:
-    chatbot = gr.Chatbot()
-    msg = gr.Textbox()
-    clear = gr.ClearButton([msg, chatbot])
 
-
-
-
-
-
-        chat_history.append((message, bot_message))
-        return '', chat_history
 
-
 
-
+
+def gradio_interface():
+    with gr.Blocks() as demo:
+        chatbot = gr.Chatbot()
+        msg = gr.Textbox()
+        clear = gr.ClearButton([msg, chatbot])
+
+        def respond(message, chat_history):
+            context = ''.join(f'User: {msg} \nBot:{bot_msg}\n' for msg, bot_msg in chat_history)
+            logger.info(f'Context: {context}')
+            response = qa_engine.get_response(message, context)
+            bot_message = response.get_answer() + response.get_sources_as_text() + '\n'
+            chat_history.append((message, bot_message))
+            return '', chat_history
+
+        msg.submit(respond, [msg, chatbot], [msg, chatbot])
+    demo.launch(share=True)
+
+
+def discord_bot():
+    client = DiscordClient(
+        qa_engine=qa_engine,
+        num_last_messages=config.num_last_messages,
+        use_names_in_context=config.use_names_in_context,
+        enable_commands=config.enable_commands,
+        debug=config.debug
+    )
+    with gr.Blocks() as demo:
+        gr.Markdown(f'Discord bot is running.')
+    client.run(config.discord_token)
+
+
+if __name__ == '__main__':
+    if config.app_mode == 'gradio':
+        gradio_interface()
+    elif config.app_mode == 'discord':
+        discord_bot()
+    else:
+        raise ValueError(
+            f'Invalid app mode: {config.app_mode}, '
+            f'set APP_MODE to "gradio" or "discord"'
+        )
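The new `respond` callback flattens Gradio's `(user, bot)` history tuples into a plain-text context string before querying the engine. That formatting step can be exercised in isolation; a small sketch, with `history_to_context` as a hypothetical helper name (the app inlines this expression):

```python
def history_to_context(chat_history: list[tuple[str, str]]) -> str:
    # Mirrors the join in respond(): one "User: ... \nBot:..." entry per turn.
    return ''.join(f'User: {msg} \nBot:{bot_msg}\n' for msg, bot_msg in chat_history)

history = [('What is a Space?', 'A hosted demo app.'), ('Thanks!', 'You are welcome.')]
print(history_to_context(history))
```

Each turn becomes two visual lines, so `NUM_LAST_MESSAGES`-style truncation can be done on the tuple list before joining.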
config/.env.example
CHANGED
@@ -14,3 +14,6 @@ DISCORD_TOKEN=your-bot-token
 NUM_LAST_MESSAGES=1
 USE_NAMES_IN_CONTEXT=False
 ENABLE_COMMANDS=True
+
+# App mode
+APP_MODE=gradio
discord_bot/__init__.py
CHANGED
@@ -0,0 +1 @@
+from .client import DiscordClient
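This one-line `__init__.py` re-exports `DiscordClient` so `app.py` can write `from discord_bot import DiscordClient` without knowing the submodule layout. A self-contained sketch of the same re-export pattern, using throwaway module and class names rather than the repo's:

```python
import os
import sys
import tempfile

# Build a throwaway package mirroring discord_bot/{__init__.py, client.py}.
pkg_root = tempfile.mkdtemp()
pkg_dir = os.path.join(pkg_root, 'demo_bot')
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, 'client.py'), 'w') as f:
    f.write('class DemoClient:\n    pass\n')
with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
    f.write('from .client import DemoClient\n')

sys.path.insert(0, pkg_root)
from demo_bot import DemoClient  # resolves via the re-export in __init__.py
print(DemoClient.__module__)
```

The class still reports its defining submodule, but importers only need the package name.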
qa_engine/config.py
CHANGED
@@ -35,11 +35,14 @@ class Config:
     debug: bool = eval(get_env('DEBUG', 'True'))
 
     # Discord bot config - optional
-    discord_token: str = get_env('DISCORD_TOKEN', '', warn=False)
+    discord_token: str = get_env('DISCORD_TOKEN', '-', warn=False)
     num_last_messages: int = int(get_env('NUM_LAST_MESSAGES', 2, warn=False))
     use_names_in_context: bool = eval(get_env('USE_NAMES_IN_CONTEXT', 'False', warn=False))
     enable_commands: bool = eval(get_env('ENABLE_COMMANDS', 'True', warn=False))
 
+    # App mode
+    app_mode: str = get_env('APP_MODE', '-', warn=False)  # 'gradio' or 'discord'
+
     def __post_init__(self):
         prompt_template_file = f'config/prompt_templates/{self.prompt_template_name}.txt'
         with open(prompt_template_file, 'r') as f: