Commit 7f56c87
0 Parent(s)
Duplicate from indonesian-nlp/gpt2-app
Co-authored-by: Julien Chaumond <[email protected]>
- .gitattributes +27 -0
- .github/workflows/check-filesize.yml +17 -0
- .github/workflows/hf-spaces.yml +21 -0
- Dockerfile +7 -0
- LICENSE +21 -0
- README.md +51 -0
- app/SessionState.py +95 -0
- app/app.py +255 -0
- app/chatbot.html +134 -0
- app/css/main.css +109 -0
- app/js/main.js +122 -0
- app/prompts.py +57 -0
- requirements.txt +10 -0
.gitattributes
ADDED
@@ -0,0 +1,27 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bin.* filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zstandard filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
.github/workflows/check-filesize.yml
ADDED
@@ -0,0 +1,17 @@
+name: Check file size
+
+on:  # or directly `on: [push]` to run the action on every push on any branch
+  pull_request:
+    branches: [main]
+
+  # to run this workflow manually from the Actions tab
+  workflow_dispatch:
+
+jobs:
+  sync-to-hub:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check large files
+        uses: ActionsDesk/[email protected]
+        with:
+          filesizelimit: 10485760 # = 10MB, so we can sync to HF spaces
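Note on `filesizelimit`: the value 10485760 is 10 × 1024 × 1024 bytes. The check itself runs inside the third-party action, whose internals are not shown in this commit. As a rough local equivalent, the sketch below walks a working tree and lists files above the same limit. The function name `find_large_files` and the choice to skip `.git` are assumptions made for illustration, not part of the workflow.

```python
import os

FILE_SIZE_LIMIT = 10 * 1024 * 1024  # 10485760 bytes, the limit set in the workflow


def find_large_files(root, limit=FILE_SIZE_LIMIT):
    """Return sorted paths (relative to `root`) of files larger than `limit` bytes."""
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: Git's own object store is not part of the check.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) > limit:
                large.append(os.path.relpath(path, root))
    return sorted(large)


if __name__ == "__main__":
    for path in find_large_files("."):
        print(f"over limit: {path}")
```

Running this before opening a pull request gives an early hint about files that would need Git LFS or removal.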
.github/workflows/hf-spaces.yml
ADDED
@@ -0,0 +1,21 @@
+name: Sync to Hugging Face hub
+
+on:
+  push:
+    branches: [main]
+
+  # to run this workflow manually from the Actions tab
+  workflow_dispatch:
+
+jobs:
+  sync-to-hub:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+        with:
+          fetch-depth: 0
+      - name: Push to hub
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+        run: |
+          git push https://indonesian-nlp:[email protected]/spaces/indonesian-nlp/gpt2-app main
Dockerfile
ADDED
@@ -0,0 +1,7 @@
+FROM python:3.8-slim-buster
+COPY . /app
+WORKDIR /app
+RUN pip install -r requirements.txt
+EXPOSE 8501
+ENTRYPOINT ["streamlit","run"]
+CMD ["app/app.py"]
LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2021 indonesian-nlp
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
README.md
ADDED
@@ -0,0 +1,51 @@
+---
+title: Indonesian GPT-2
+emoji: 🦀
+colorFrom: pink
+colorTo: yellow
+sdk: streamlit
+app_file: app/app.py
+pinned: false
+duplicated_from: indonesian-nlp/gpt2-app
+---
+
+# Indonesian GPT-2 Applications
+This is a collection of Applications that generates sentences using Indonesian GPT-2 models!
+
+
+## How did we create it
+
+## Development
+
+### Dependencies Installation
+
+### Inference Pipeline
+
+## Authors
+
+Following are the authors of this work (listed alphabetically):
+- [Akmal](https://github.com/Wikidepia)
+- [Alvin Watner](https://github.com/alvinwatner)
+- [Cahya Wirawan](https://github.com/cahya-wirawan)
+- [Galuh Sahid](https://github.com/galuhsahid)
+- [Muhammad Agung Hambali](https://github.com/magungh1)
+- [Samsul Rahmadani](https://github.com/acul3)
+
+## Acknowledgements
+
+- 🤗 Hugging Face for organizing [the FLAX/JAX community week](https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects)
+- Google [TPU Research Cloud (TRC) program](https://sites.research.google/trc/) for providing computing resources
+- [Weights & Biases](https://wandb.com/) for providing the infrastructure for experiment tracking and model management
+
+## Citing Indonesian GPT-2 Applications
+
+If you find this is useful in your research or wish to refer, please use the following BibTeX entry.
+
+```
+@misc{Indonesian_GPT2_App_2021,
+  author = {Akmal, Alvin Watner, Cahya Wirawan, Galuh Sahid, Muhammad Agung Hambali, Samsul Rahmadani},
+  title = {Indonesian GPT-2 Applications},
+  url = {https://github.com/indonesian-nlp/gpt2-app},
+  year = {2021}
+}
+```
app/SessionState.py
ADDED
@@ -0,0 +1,95 @@
+"""Hack to add per-session state to Streamlit.
+Usage
+-----
+>>> import SessionState
+>>>
+>>> session_state = SessionState.get(user_name='', favorite_color='black')
+>>> session_state.user_name
+''
+>>> session_state.user_name = 'Mary'
+>>> session_state.favorite_color
+'black'
+Since you set user_name above, next time your script runs this will be the
+result:
+>>> session_state = get(user_name='', favorite_color='black')
+>>> session_state.user_name
+'Mary'
+"""
+from streamlit.scriptrunner import get_script_run_ctx
+from streamlit.server.server import Server
+
+
+class SessionState(object):
+    def __init__(self, **kwargs):
+        """A new SessionState object.
+        Parameters
+        ----------
+        **kwargs : any
+            Default values for the session state.
+        Example
+        -------
+        >>> session_state = SessionState(user_name='', favorite_color='black')
+        >>> session_state.user_name = 'Mary'
+        ''
+        >>> session_state.favorite_color
+        'black'
+        """
+        for key, val in kwargs.items():
+            setattr(self, key, val)
+
+
+def get(**kwargs):
+    """Gets a SessionState object for the current session.
+    Creates a new object if necessary.
+    Parameters
+    ----------
+    **kwargs : any
+        Default values you want to add to the session state, if we're creating a
+        new one.
+    Example
+    -------
+    >>> session_state = get(user_name='', favorite_color='black')
+    >>> session_state.user_name
+    ''
+    >>> session_state.user_name = 'Mary'
+    >>> session_state.favorite_color
+    'black'
+    Since you set user_name above, next time your script runs this will be the
+    result:
+    >>> session_state = get(user_name='', favorite_color='black')
+    >>> session_state.user_name
+    'Mary'
+    """
+    # Hack to get the session object from Streamlit.
+
+    ctx = get_script_run_ctx()
+
+    this_session = None
+
+    current_server = Server.get_current()
+    if hasattr(current_server, '_session_infos'):
+        # Streamlit < 0.56
+        session_infos = Server.get_current()._session_infos.values()
+    else:
+        session_infos = Server.get_current()._session_info_by_id.values()
+
+    for session_info in session_infos:
+        s = session_info.session
+        if (
+            (not hasattr(s, '_main_dg') and s._uploaded_file_mgr == ctx.uploaded_file_mgr)
+        ):
+            this_session = s
+
+    if this_session is None:
+        raise RuntimeError(
+            "Oh noes. Couldn't get your Streamlit Session object. "
+            'Are you doing something fancy with threads?')
+
+    # Got the session object! Now let's attach some state into it.
+
+    if not hasattr(this_session, '_custom_session_state'):
+        this_session._custom_session_state = SessionState(**kwargs)
+
+    return this_session._custom_session_state
+
+__all__ = ['get']
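Note on `SessionState.get`: the module docstring describes get-or-create semantics, where defaults are applied only on the first call for a session and later script runs see the values set earlier. The sketch below shows that pattern in isolation, without Streamlit. `FakeSession` is a hypothetical stand-in for the session object that the real module digs out of Streamlit's server internals, and passing it explicitly to `get` is an assumption made so the example runs on its own.

```python
class SessionState:
    """Plain attribute bag, as in app/SessionState.py."""

    def __init__(self, **kwargs):
        for key, val in kwargs.items():
            setattr(self, key, val)


class FakeSession:
    """Hypothetical stand-in for Streamlit's per-browser session object."""


def get(session, **defaults):
    # Defaults are applied only the first time; later calls return the same object.
    if not hasattr(session, "_custom_session_state"):
        session._custom_session_state = SessionState(**defaults)
    return session._custom_session_state


session = FakeSession()
state = get(session, user_name="", favorite_color="black")
state.user_name = "Mary"

# Simulate the script running again for the same session.
state_again = get(session, user_name="", favorite_color="black")
print(state_again.user_name)  # prints "Mary"
```

Because the state lives on the session object, a second `FakeSession` would start again from the defaults.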
app/app.py
ADDED
@@ -0,0 +1,255 @@
+import streamlit as st
+print("Streamlit Version: ", st.__version__)
+import SessionState
+from mtranslate import translate
+from prompts import PROMPT_LIST
+import random
+import time
+from transformers import pipeline, set_seed
+import psutil
+import codecs
+import streamlit.components.v1 as stc
+import pathlib
+import os
+
+# st.set_page_config(page_title="Indonesian GPT-2")
+
+mirror_url = "https://gpt2-app.ai-research.id/"
+if "MIRROR_URL" in os.environ:
+    mirror_url = os.environ["MIRROR_URL"]
+
+MODELS = {
+    "Indonesian GPT-2 Small": {
+        "group": "Indonesian GPT-2",
+        "name": "indonesian-nlp/gpt2",
+        "description": "The original Indonesian GPT-2 small model.",
+        "text_generator": None
+    },
+    "Indonesian GPT-2 Medium": {
+        "group": "Indonesian GPT-2",
+        "name": "indonesian-nlp/gpt2-medium-indonesian",
+        "description": "The original Indonesian GPT-2 medium model.",
+        "text_generator": None
+    },
+    "Indonesian Literature - GPT-2 Small": {
+        "group": "Indonesian Literature",
+        "name": "cahya/gpt2-small-indonesian-story",
+        "description": "The Indonesian Literature Generator using fine-tuned GPT-2 small model.",
+        "text_generator": None
+    },
+    "Indonesian Literature - GPT-2 Medium": {
+        "group": "Indonesian Literature",
+        "name": "cahya/gpt2-medium-indonesian-story",
+        "description": "The Indonesian Literature Generator using fine-tuned GPT-2 medium model.",
+        "text_generator": None
+    },
+    "Indonesian Academic Journal - GPT-2 Small": {
+        "group": "Indonesian Journal",
+        "name": "Galuh/id-journal-gpt2",
+        "description": "The Indonesian Journal Generator using fine-tuned GPT-2 small model.",
+        "text_generator": None
+    },
+    "Indonesian Persona Chatbot - GPT-2 Small": {
+        "group": "Indonesian Persona Chatbot",
+        "name": "cahya/gpt2-small-indonesian-personachat",
+        "description": "The Indonesian Persona Chatbot using fine-tuned GPT-2 small model.",
+        "text_generator": None
+    },
+    "Multilingual mGPT": {
+        "group": "Indonesian GPT-2",
+        "name": "sberbank-ai/mGPT",
+        "description": "Multilingual GPT model, autoregressive GPT-like models with 1.3 billion parameters.",
+        "text_generator": None
+    },
+}
+
+
+def stc_chatbot(root_dir, width=700, height=900):
+    html_file = root_dir/"app/chatbot.html"
+    css_file = root_dir/"app/css/main.css"
+    js_file = root_dir/"app/js/main.js"
+    if css_file.exists() and js_file.exists():
+        html = codecs.open(html_file, "r").read()
+        css = codecs.open(css_file, "r").read()
+        js = codecs.open(js_file, "r").read()
+        html = html.replace('<link rel="stylesheet" href="css/main.css">', "<style>\n" + css + "\n</style>")
+        html = html.replace('<script src="js/main.js"></script>', "<script>\n" + js + "\n</script>")
+        stc.html(html, width=width, height=height, scrolling=True)
+
+st.sidebar.markdown("""
+<style>
+.centeralign {
+    text-align: center;
+}
+</style>
+<p class="centeralign">
+    <img src="https://huggingface.co/spaces/flax-community/gpt2-indonesian/resolve/main/huggingwayang.png"/>
+</p>
+""", unsafe_allow_html=True)
+st.sidebar.markdown(f"""
+___
+<p class="centeralign">
+This is a collection of applications that generates sentences using Indonesian GPT-2 models!
+</p>
+<p class="centeralign">
+Created by <a href="https://huggingface.co/indonesian-nlp">Indonesian NLP</a> team @2021
+<br/>
+<a href="https://github.com/indonesian-nlp/gpt2-app" target="_blank">GitHub</a> | <a href="https://github.com/indonesian-nlp/gpt2-app" target="_blank">Project Report</a>
+<br/>
+A mirror of the application is available <a href="{mirror_url}" target="_blank">here</a>
+</p>
+""", unsafe_allow_html=True)
+
+st.sidebar.markdown("""
+___
+""", unsafe_allow_html=True)
+
+model = st.sidebar.selectbox('Model', (MODELS.keys()))
+
+
+@st.cache(suppress_st_warning=True, allow_output_mutation=True)
+def get_generator(model_name: str):
+    st.write(f"Loading the GPT2 model {model_name}, please wait...")
+    text_generator = pipeline('text-generation', model=model_name)
+    return text_generator
+
+
+# Disable the st.cache for this function due to issue on newer version of streamlit
+# @st.cache(suppress_st_warning=True, hash_funcs={tokenizers.Tokenizer: id})
+def process(text_generator, text: str, max_length: int = 100, do_sample: bool = True, top_k: int = 50, top_p: float = 0.95,
+            temperature: float = 1.0, max_time: float = 120.0, seed=42, repetition_penalty=1.0):
+    # st.write("Cache miss: process")
+    set_seed(seed)
+    if repetition_penalty == 0.0:
+        min_penalty = 1.05
+        max_penalty = 1.5
+        repetition_penalty = max(min_penalty + (1.0-temperature) * (max_penalty-min_penalty), 0.8)
+    result = text_generator(text, max_length=max_length, do_sample=do_sample,
+                            top_k=top_k, top_p=top_p, temperature=temperature,
+                            max_time=max_time, repetition_penalty=repetition_penalty)
+    return result
+
+
+st.title("Indonesian GPT-2 Applications")
+prompt_group_name = MODELS[model]["group"]
+st.header(prompt_group_name)
+description = f"This application is a demo for {MODELS[model]['description']}"
+st.markdown(description)
+model_name = f"Model name: [{MODELS[model]['name']}](https://huggingface.co/{MODELS[model]['name']})"
+st.markdown(model_name)
+if prompt_group_name in ["Indonesian GPT-2", "Indonesian Literature", "Indonesian Journal"]:
+    session_state = SessionState.get(prompt=None, prompt_box=None, text=None)
+    ALL_PROMPTS = list(PROMPT_LIST[prompt_group_name].keys())+["Custom"]
+
+    prompt = st.selectbox('Prompt', ALL_PROMPTS, index=len(ALL_PROMPTS)-1)
+
+    # Update prompt
+    if session_state.prompt is None:
+        session_state.prompt = prompt
+    elif session_state.prompt is not None and (prompt != session_state.prompt):
+        session_state.prompt = prompt
+        session_state.prompt_box = None
+        session_state.text = None
+    else:
+        session_state.prompt = prompt
+
+    # Update prompt box
+    if session_state.prompt == "Custom":
+        session_state.prompt_box = ""
+    else:
+        print(f"# prompt: {session_state.prompt}")
+        print(f"# prompt_box: {session_state.prompt_box}")
+        if session_state.prompt is not None and session_state.prompt_box is None:
+            session_state.prompt_box = random.choice(PROMPT_LIST[prompt_group_name][session_state.prompt])
+
+    session_state.text = st.text_area("Enter text", session_state.prompt_box)
+
+    max_length = st.sidebar.number_input(
+        "Maximum length",
+        value=100,
+        max_value=512,
+        help="The maximum length of the sequence to be generated."
+    )
+
+    temperature = st.sidebar.slider(
+        "Temperature",
+        value=0.9,
+        min_value=0.0,
+        max_value=2.0
+    )
+
+    do_sample = st.sidebar.checkbox(
+        "Use sampling",
+        value=True
+    )
+
+    top_k = 30
+    top_p = 0.95
+
+    if do_sample:
+        top_k = st.sidebar.number_input(
+            "Top k",
+            value=top_k,
+            help="The number of highest probability vocabulary tokens to keep for top-k-filtering."
+        )
+        top_p = st.sidebar.number_input(
+            "Top p",
+            value=top_p,
+            help="If set to float < 1, only the most probable tokens with probabilities that add up to top_p or higher "
+                 "are kept for generation."
+        )
+
+    seed = st.sidebar.number_input(
+        "Random Seed",
+        value=25,
+        help="The number used to initialize a pseudorandom number generator"
+    )
+
+    repetition_penalty = 0.0
+    automatic_repetition_penalty = st.sidebar.checkbox(
+        "Automatic Repetition Penalty",
+        value=True
+    )
+
+    if not automatic_repetition_penalty:
+        repetition_penalty = st.sidebar.slider(
+            "Repetition Penalty",
+            value=1.0,
+            min_value=1.0,
+            max_value=2.0
+        )
+
+    for group_name in MODELS:
+        if MODELS[group_name]["group"] in ["Indonesian GPT-2", "Indonesian Literature", "Indonesian Journal"]:
+            MODELS[group_name]["text_generator"] = get_generator(MODELS[group_name]["name"])
+    # text_generator = get_generator()
+    if st.button("Run"):
+        with st.spinner(text="Getting results..."):
+            memory = psutil.virtual_memory()
+            st.subheader("Result")
+            time_start = time.time()
+            # text_generator = MODELS[model]["text_generator"]
+            result = process(MODELS[model]["text_generator"], text=session_state.text, max_length=int(max_length),
+                             temperature=temperature, do_sample=do_sample,
+                             top_k=int(top_k), top_p=float(top_p), seed=seed, repetition_penalty=repetition_penalty)
+            time_end = time.time()
+            time_diff = time_end-time_start
+            result = result[0]["generated_text"]
+            st.write(result.replace("\n", " \n"))
+            st.text("Translation")
+            translation = translate(result, "en", "id")
+            st.write(translation.replace("\n", " \n"))
+            # st.write(f"*do_sample: {do_sample}, top_k: {top_k}, top_p: {top_p}, seed: {seed}*")
+            info = f"""
+            *Memory: {memory.total/(1024*1024*1024):.2f}GB, used: {memory.percent}%, available: {memory.available/(1024*1024*1024):.2f}GB*
+            *Text generated in {time_diff:.5} seconds*
+            """
+            st.write(info)
+
+            # Reset state
+            session_state.prompt = None
+            session_state.prompt_box = None
+            session_state.text = None
+elif model.startswith("Indonesian Persona Chatbot"):
+    root_dir = pathlib.Path(".")
+    stc_chatbot(root_dir)
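Note on the automatic repetition penalty in `process()`: when the sidebar checkbox leaves `repetition_penalty` at 0.0, the function derives a penalty from the temperature by linear interpolation between 1.05 and 1.5, with a floor of 0.8. The sketch below isolates that expression so its behaviour can be inspected without loading a model. The helper name `automatic_repetition_penalty` and its keyword parameters are assumptions for illustration; the constants are the ones in the diff.

```python
def automatic_repetition_penalty(temperature, min_penalty=1.05, max_penalty=1.5, floor=0.8):
    """Mirror the expression in process(): lower temperature gives a higher penalty."""
    penalty = min_penalty + (1.0 - temperature) * (max_penalty - min_penalty)
    return max(penalty, floor)


# The sidebar slider allows temperatures from 0.0 to 2.0; 0.9 is its default.
for t in (0.0, 0.9, 1.0, 2.0):
    print(f"temperature={t:.1f} -> repetition_penalty={automatic_repetition_penalty(t):.3f}")
```

At temperature 1.0 the result is the minimum penalty of 1.05, at 0.0 it reaches 1.5, and above a temperature of roughly 1.56 the floor of 0.8 takes over, which is below 1.0 and therefore no longer penalises repetition.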
app/chatbot.html
ADDED
@@ -0,0 +1,134 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <!-- Required meta tags -->
+    <meta charset="utf-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+
+    <!-- Bootstrap CSS -->
+    <link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-F3w7mX95PdgyTmZZMECAngseQB83DfGTowi0iMjiWaeVhAn4FJkqJByhZMI3AhiU" crossorigin="anonymous">
+
+    <title>Indonesian GPT2 Chatbot</title>
+    <link rel="stylesheet" href="css/main.css">
+    <script src="js/main.js"></script>
+</head>
+<body onload="pageSetup();">
+    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js" integrity="sha384-/bQdsTh/da6pkI1MST/rWKFNjaCP5gBSY4sEBT38Q/9RBh9AH40zEOg7Hlq2THRZ" crossorigin="anonymous"></script>
+
+    <div class="buttons" style="display: none">
+        <div class="minus button">-</div>
+        <div class="value">?</div>
+        <div class="plus button">+</div>
+    </div>
+    <div class="state" style="display: none">
+        <span class="users"></span>
+    </div>
+
+    <div class="container">
+        <div class="chat-container">
+            <div class="chat-messages">
+                <div class="messages">
+                </div>
+            </div>
+        </div>
+        <div class="chat-input input-group mb-3">
+            <input type="text" class="form-control user-input" placeholder="Type a message..." aria-label="User message" aria-describedby="basic-addon2">
+            <span class="input-group-text btn btn-primary user-input-button" id="basic-addon2">Send</span>
+        </div>
+        <!--
+        <div class="chat-suggestion">
+            Suggestion: <span class="js-loading">Loading…</span> <a class="js-suggestion hide">Kenapa kamu sedih?</a>
+        </div>
+        -->
+        <div class="server-message">
+            <span class="server-message-value"></span>
+        </div>
+        <div class="accordion" id="accordionExample">
+            <div class="accordion-item">
+                <h2 class="accordion-header" id="headingOne">
+                    <button class="accordion-button" type="button" data-bs-toggle="collapse" data-bs-target="#collapseOne" aria-expanded="true" aria-controls="collapseOne">
+                        Bots Personalities
+                    </button>
+                </h2>
+                <div id="collapseOne" class="accordion-collapse collapse show" aria-labelledby="headingOne" data-bs-parent="#accordionExample">
+                    <form class="bot-personality">
+                        <div class="mb-3">
+                            <div class="row g-2 align-items-center">
+                                <div class="col-auto">
+                                    <label for="inputPersonality1" class="col-form-label">Personality 1:</label>
+                                </div>
+                                <div class="col-auto">
+                                    <input type="text" class="form-control" id="inputPersonality1">
+                                </div>
+                            </div>
+                            <div class="row g-2 align-items-center">
+                                <div class="col-auto">
+                                    <label for="inputPersonality2" class="col-form-label">Personality 2:</label>
+                                </div>
+                                <div class="col-auto">
+                                    <input type="text" class="form-control" id="inputPersonality2">
+                                </div>
+                            </div>
+                            <div class="row g-2 align-items-center">
+                                <div class="col-auto">
+                                    <label for="inputPersonality3" class="col-form-label">Personality 3:</label>
+                                </div>
+                                <div class="col-auto">
+                                    <input type="text" class="form-control" id="inputPersonality3">
+                                </div>
+                            </div>
+                            <div class="row g-2 align-items-center">
+                                <div class="col-auto">
+                                    <label for="inputPersonality4" class="col-form-label">Personality 4:</label>
+                                </div>
+                                <div class="col-auto">
+                                    <input type="text" class="form-control" id="inputPersonality4">
+                                </div>
+                            </div>
+                            <div class="row g-2 align-items-center">
+                                <div class="col-auto">
+                                    <label for="inputPersonality5" class="col-form-label">Personality 5:</label>
+                                </div>
+                                <div class="col-auto">
+                                    <input type="text" class="form-control" id="inputPersonality5">
+                                </div>
+                            </div>
+                        </div>
+                        <button id="updatePersonality" class="btn btn-primary" type="button" data-bs-toggle="collapse" data-bs-target="#collapseExample" aria-expanded="false" aria-controls="collapseExample">
+                            Update Personality
+                        </button>
+                    </form>
+                </div>
+            </div>
+            <div class="accordion-item">
+                <h2 class="accordion-header" id="headingThree">
+                    <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseThree" aria-expanded="false" aria-controls="collapseThree">
+                        Parameters
+                    </button>
+                </h2>
+                <div id="collapseThree" class="accordion-collapse collapse" aria-labelledby="headingThree" data-bs-parent="#accordionExample">
+
+                    <div class="chat-parameter card card-body">
+                        <div class="form-check">
+                            <input class="form-check-input" type="checkbox" value="" id="doSample" checked>
+                            <label class="form-check-label" for="doSample">
+                                Do Sample
+                            </label>
+                        </div>
+                        <label for="minLength" class="form-label">Minimal Length: <span id="minLengthValue">1</span></label>
+                        <input type="range" class="form-range" min="1" max="10" value="1" id="minLength" onmousemove="updateValue('minLengthValue', this.value);">
+                        <label for="maxLength" class="form-label">Maximal Length: <span id="maxLengthValue">20</span></label>
+                        <input type="range" class="form-range" min="20" max="50" value="20" id="maxLength" onmousemove="updateValue('maxLengthValue', this.value);">
+                        <label for="temperature" class="form-label">Temperature: <span id="temperatureValue">0.7</span></label>
+                        <input type="range" class="form-range" min="0.5" max="10" value="0.7" step="0.1" id="temperature" onmousemove="updateValue('temperatureValue', this.value);">
+                        <label for="topK" class="form-label">Top k: <span id="topKValue">0</span></label>
+                        <input type="range" class="form-range" min="0" max="50" value="0" id="topK" onmousemove="updateValue('topKValue', this.value);">
+                        <label for="topP" class="form-label">Top p: <span id="topPValue">0.9</span></label>
+                        <input type="range" class="form-range" min="0.1" max="1.0" value="0.9" step="0.01" id="topP" onmousemove="updateValue('topPValue', this.value);">
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+</body>
+</html>
app/css/main.css
ADDED
@@ -0,0 +1,109 @@
+body {
+    font-family: "Arial, Courier New", sans-serif;
+    text-align: center;
+}
+
+h1 {
+    margin: 10px 10px;
+}
+
+.buttons {
+    font-size: 2em;
+    display: flex;
+    justify-content: center;
+}
+
+.button, .value {
+    line-height: 1;
+    padding: 1rem;
+    margin: 1rem;
+    border: medium solid;
+    min-height: 1em;
+    min-width: 1em;
+}
+
+.button {
+    cursor: pointer;
+    user-select: none;
+}
+
+.minus {
+    color: red;
+}
+
+.plus {
+    color: green;
+}
+
+.value {
+    min-width: 2em;
+}
+
+.state {
+    font-size: 2em;
+}
+
+.container {
+    min-width: 30em;
+    max-width: 40em;
+}
+
+.accordion-collapse {
+    padding: 5px 5px 0 5px;
+}
+
+.chat-container {
+    margin: 10px 0;
+    min-height: 300px;
+    max-height: 600px;
+    overflow: auto;
+}
+
+.bot-personality {
+    text-align: left;
+    margin: 0 0 5px 0;
+}
+
+.chat-parameter {
+    text-align: left;
+    margin: 5px 0 5px 0;
+}
+
+.bot-personality input {
+    margin: 5px 0 0 0;
+    min-width: 20em;
+}
+
+.message {
+    margin: 5px 0;
+}
+
+.message-inner {
+    font-size: 16px;
+}
+
+.outgoing {
+    text-align: right;
+}
+
+.outgoing .badge {
+    text-align: right;
+}
+
+.botPersonality, .incoming, .incoming .badge, .chat-suggestion, .server-message, .parameters {
+    text-align: left;
+}
+
+.chat-suggestion, .server-message
+{
+    padding-left: 5px;
+}
+
+.server-message-value {
+    font-style: italic;
+}
+
+#collapseParameter {
+    width: 300px;
+    margin: 8px 0px;
+}
app/js/main.js
ADDED
@@ -0,0 +1,122 @@
|
1 |
+
updateValue = function(id, value) {
|
2 |
+
document.getElementById(id).innerText = value;
|
3 |
+
}
|
4 |
+
|
5 |
+
htmlToElement = function(html) {
|
6 |
+
let template = document.createElement('template');
|
7 |
+
html = html.trim(); // Never return a text node of whitespace as the result
|
8 |
+
template.innerHTML = html;
|
9 |
+
return template.content.firstChild;
|
10 |
+
}
|
11 |
+
|
12 |
+
pageSetup = function() {
|
13 |
+
const minus = document.querySelector('.minus');
|
14 |
+
const plus = document.querySelector('.plus');
|
15 |
+
const value = document.querySelector('.value');
|
16 |
+
// const users = document.querySelector('.users');
|
17 |
+
const userInput = document.querySelector('.user-input');
|
18 |
+
const userInputButton = document.querySelector('.user-input-button');
|
19 |
+
const serverMessageValue = document.querySelector('.server-message-value');
|
20 |
+
const messages = document.querySelector('.messages');
|
21 |
+
const updatePersonality = document.getElementById("updatePersonality")
|
22 |
+
const websocket = new WebSocket("wss://gpt2-chat.ai-research.id/");
|
23 |
+
//const websocket = new WebSocket("ws://localhost:8502/");
|
24 |
+
|
25 |
+
minus.onclick = function () {
|
26 |
+
websocket.send(JSON.stringify({action: 'minus'}));
|
27 |
+
}
|
28 |
+
|
29 |
+
plus.onclick = function () {
|
30 |
+
websocket.send(JSON.stringify({action: 'plus'}));
|
31 |
+
}
|
32 |
+
|
33 |
+
updatePersonality.onclick = function () {
|
34 |
+
const elements = document.querySelectorAll(".bot-personality input")
|
35 |
+
let data = {
|
36 |
+
"action": "personality",
|
37 |
+
"message": []
|
38 |
+
}
|
39 |
+
for (let i = 0; i < Math.min(elements.length, 5); i++) {
|
40 |
+
if(elements[i].value.length >0)
|
41 |
+
data.message.push(elements[i].value);
|
42 |
+
}
|
43 |
+
websocket.send(JSON.stringify(data));
|
44 |
+
}
|
45 |
+
|
46 |
+
let getParameters = function() {
|
47 |
+
return {
|
48 |
+
"do_sample": document.getElementById("doSample").checked,
|
49 |
+
"min_length": parseInt(document.getElementById("minLength").value),
|
50 |
+
"max_length": parseInt(document.getElementById("maxLength").value),
|
51 |
+
"temperature": parseFloat(document.getElementById("temperature").value),
|
52 |
+
"top_k": parseInt(document.getElementById("topK").value),
|
53 |
+
"top_p": parseFloat(document.getElementById("topP").value),
|
54 |
+
};
|
55 |
+
}
|
56 |
+
|
57 |
+
let processUserInput = function (userInput) {
|
58 |
+
let parameters = getParameters();
|
59 |
+
parameters["action"] = "talk";
|
60 |
+
parameters["utterance"] = userInput.value;
|
61 |
+
websocket.send(JSON.stringify(parameters));
|
62 |
+
const element = htmlToElement("<div class=\"message outgoing\"><div class=\"message-inner badge bg-primary text-wrap\">"
|
63 |
+
+ "</div></div>"); element.firstChild.textContent = userInput.value; // insert the utterance as text, not as HTML
|
64 |
+
userInput.value = "";
|
65 |
+
messages.appendChild(element);
|
66 |
+
messages.scrollIntoView(false)
|
67 |
+
}
|
68 |
+
|
69 |
+
userInputButton.onclick = function () {
|
70 |
+
processUserInput(userInput);
|
71 |
+
}
|
72 |
+
|
73 |
+
userInput.addEventListener("keyup", function(event) {
|
74 |
+
if (event.key === "Enter") {
|
75 |
+
// Cancel the default action, if needed
|
76 |
+
event.preventDefault();
|
77 |
+
processUserInput(userInput);
|
78 |
+
}
|
79 |
+
});
|
80 |
+
|
81 |
+
websocket.onmessage = function (event) {
|
82 |
+
let data = JSON.parse(event.data);
|
83 |
+
switch (data.type) {
|
84 |
+
case 'connection':
|
85 |
+
console.log(data.value)
|
86 |
+
websocket.send(JSON.stringify({action: 'dialog', personality: []}));
|
87 |
+
break;
|
88 |
+
case 'state':
|
89 |
+
value.textContent = data.value;
|
90 |
+
break;
|
91 |
+
case 'users':
|
92 |
+
serverMessageValue.textContent = (
|
93 |
+
data.count.toString() + " user" +
|
94 |
+
(data.count === 1 ? "" : "s") + " online");
|
95 |
+
break;
|
96 |
+
case 'dialog':
|
97 |
+
console.log(data.message)
|
98 |
+
break;
|
99 |
+
case 'talk':
|
100 |
+
const element = htmlToElement("<div class=\"message incoming\"><div class=\"message-inner badge bg-success text-wrap\">"
|
101 |
+
+ "</div></div>"); element.firstChild.textContent = data.message; // insert the reply as text, not as HTML
|
102 |
+
messages.appendChild(element);
|
103 |
+
messages.scrollIntoView(false)
|
104 |
+
break;
|
105 |
+
case 'personality':
|
106 |
+
const elements = document.querySelectorAll(".bot-personality input")
|
107 |
+
for (let i = 0; i < Math.min(elements.length, data.message.length); i++) {
|
108 |
+
elements[i].value = data.message[i];
|
109 |
+
}
|
110 |
+
break;
|
111 |
+
case 'personality_reply':
|
112 |
+
serverMessageValue.textContent = data.message
|
113 |
+
setTimeout(function() {
|
114 |
+
websocket.send(JSON.stringify({action: 'get_users'}));
|
115 |
+
}, 3000);
|
116 |
+
break;
|
117 |
+
default:
|
118 |
+
console.error(
|
119 |
+
"unsupported event", data);
|
120 |
+
}
|
121 |
+
};
|
122 |
+
}
|
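The `talk` frame that `processUserInput` sends carries the utterance plus six generation settings. The receiving server is not part of this commit, so the following is only a minimal Python sketch of how a receiver might decode and sanity-check such a frame. The names `parse_talk_message` and `_number` and the values in `DEFAULTS` are hypothetical. The `top_p` bounds mirror the slider in `chatbot.html`; the other limits are assumptions.

```python
import json

# Hypothetical fallbacks; only top_p = 0.9 matches a default visible in chatbot.html.
DEFAULTS = {
    "do_sample": True,
    "min_length": 10,
    "max_length": 50,
    "temperature": 1.0,
    "top_k": 50,
    "top_p": 0.9,
}


def _number(value, default, cast):
    """Cast value with `cast`, falling back to `default` when it is unusable."""
    # parseInt("") in the client yields NaN, which JSON.stringify writes as null.
    if value is None or isinstance(value, bool):
        return default
    try:
        return cast(value)
    except (TypeError, ValueError, OverflowError):
        return default


def parse_talk_message(raw):
    """Decode one JSON frame and return cleaned generation parameters."""
    data = json.loads(raw)
    if not isinstance(data, dict) or data.get("action") != "talk":
        raise ValueError("expected a 'talk' message")

    utterance = str(data.get("utterance", "")).strip()
    if not utterance:
        raise ValueError("empty utterance")

    min_length = max(1, _number(data.get("min_length"), DEFAULTS["min_length"], int))
    max_length = max(min_length, _number(data.get("max_length"), DEFAULTS["max_length"], int))
    top_p = _number(data.get("top_p"), DEFAULTS["top_p"], float)

    return {
        "utterance": utterance,
        "do_sample": bool(data.get("do_sample", DEFAULTS["do_sample"])),
        "min_length": min_length,
        "max_length": max_length,
        "temperature": max(0.1, _number(data.get("temperature"), DEFAULTS["temperature"], float)),
        "top_k": max(1, _number(data.get("top_k"), DEFAULTS["top_k"], int)),
        # Clamp to the range the topP slider allows (0.1 to 1.0).
        "top_p": min(1.0, max(0.1, top_p)),
    }
```

Clamping on the receiving side means a hand-crafted frame cannot push the sampler outside the range the sliders offer, and a blank numeric field falls back to a default instead of raising.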
app/prompts.py
ADDED
@@ -0,0 +1,57 @@
|
1 |
+
PROMPT_LIST = {
|
2 |
+
"Indonesian GPT-2": {
|
3 |
+
"Resep masakan (recipe)": [
|
4 |
+
"Berikut adalah cara memasak sate ayam:\n",
|
5 |
+
"Langkah-langkah membuat nasi goreng:\n",
|
6 |
+
"Berikut adalah bahan-bahan membuat nastar:\n"
|
7 |
+
],
|
8 |
+
"Puisi (poetry)": [
|
9 |
+
"Aku ingin jadi merpati\nTerbang di langit yang damai\nBernyanyi-nyanyi tentang masa depan\n",
|
10 |
+
"Terdiam aku satu persatu dengan tatapan binar\nSenyawa merasuk dalam sukma membuat lara\nKefanaan membentuk kelemahan"
|
11 |
+
],
|
12 |
+
"Cerpen (short story)": [
|
13 |
+
"Putri memakai sepatunya dengan malas. Kalau bisa, selama seminggu ini ia bolos sekolah saja. Namun, Mama pasti akan marah. Ulangan tengah semester telah selesai. Minggu ini, di sekolah sedang berlangsung pekan olahraga.",
|
14 |
+
"\"Wah, hari ini cerah sekali ya,\" ucap Budi ketika ia keluar rumah.",
|
15 |
+
"Sewindu sudah kita tak berjumpa, rinduku padamu sudah tak terkira."
|
16 |
+
],
|
17 |
+
"Sejarah (history)": [
|
18 |
+
"Mohammad Natsir adalah seorang ulama, politisi, dan pejuang kemerdekaan Indonesia.",
|
19 |
+
"Ir. H. Soekarno adalah Presiden pertama Republik Indonesia. Ia adalah seorang tokoh perjuangan yang memainkan peranan penting dalam memerdekakan bangsa Indonesia",
|
20 |
+
"Borobudur adalah sebuah candi Buddha yang terletak di sebelah barat laut Yogyakarta. Monumen ini merupakan model alam semesta dan dibangun sebagai tempat suci untuk memuliakan Buddha"
|
21 |
+
],
|
22 |
+
},
|
23 |
+
"Indonesian Literature": {
|
24 |
+
"Adult Romance": [
|
25 |
+
"Ini adalah kisah tentang seorang laki-laki yang berusaha memperjuangkan cintanya",
|
26 |
+
"Alunan musik terdengar memenuhi ruangan kantor, cowok itu duduk di balik meja kerjanya sambil memejamkan mata. Berusaha meresapi nada per nada",
|
27 |
+
"Aku mencari dan terus mencari\nDimana bahagia akan kutemui\nKumencari terus mencari\nHingga ku tak mengerti arti hari-hari",
|
28 |
+
"Gadis itu mengharuskan dirinya tegar, dan kuat dalam menghadapi masalah. Menahan air matanya jatuh setiap kali ingin menangis"
|
29 |
+
],
|
30 |
+
"Horror": [
|
31 |
+
"Ditengah-tengah perbincangan mereka berdua, datanglah sesosok mahluk tinggi hitam dan besar",
|
32 |
+
"Sesosok hantu perempuan seperti kuntilanak yang melayang keluar dan bergerak perlahan dari pintu kamar kecil tadi yang tertutup.",
|
33 |
+
"Sejak pertemuannya dengan leak, yang ternyata tinggal satu atap dengannya, hidupnya terus dihantui oleh berbagai sosok seram."
|
34 |
+
],
|
35 |
+
"Poetry": [
|
36 |
+
"Aku ingin menulis sajak\nyang melesat dalam kejap\nmenembus hati yang pejam\nmemaksa mimpimu terjaga\ndari semu",
|
37 |
+
"Malam ini langitku lengang\ntiada hujan yang membasuh rindu\npun awan yang biasanya temani seruput kopimu",
|
38 |
+
"Di sisimu waktu menjelma\nsetangkai kembang api\ngelora membakar tanpa jeda\nmemercik pijar binar kita."
|
39 |
+
]
|
40 |
+
},
|
41 |
+
"Indonesian Journal": {
|
42 |
+
"Biologi (biology)": [
|
43 |
+
"Tujuan penelitian ini untuk menentukan keanekaragaman Arthropoda pada lahan pertanian kacang",
|
44 |
+
"Identifikasi spesies secara molekuler sangat diperlukan dalam mempelajari taksonomi",
|
45 |
+
"Penelitian ini bertujuan untuk menentukan identitas invertebrata laut dari Perairan Papua dengan teknik DNA barcoding"],
|
46 |
+
"Psikologi (psychology)": [
|
47 |
+
"Penelitian ini bertujuan untuk mengetahui perilaku wirausaha remaja yang diprediksi dari motivasi intrinsik",
|
48 |
+
"Tujuan dari penelitian ini adalah untuk mendapatkan data empiris mengenai gambaran peta bakat mahasiswa Fakultas Psikologi Unjani"],
|
49 |
+
"Ekonomi (economics)": [
|
50 |
+
"Faktor kepuasan dan kepercayaan konsumen merupakan dua faktor kunci dalam meningkatkan penetrasi e-commerce. Penelitian yang dilakukan",
|
51 |
+
"Penelitian ini bertujuan untuk menganalisis pola konsumsi pangan di Indonesia",
|
52 |
+
"Model GTAP diimplementasikan untuk melihat dampak yang ditimbulkan pada PDB"],
|
53 |
+
"Teknologi Informasi (IT)": [
|
54 |
+
"pembuatan aplikasi ini menggunakan pengembangan metode Waterfall dan dirancang mengguynakan Unified Modeling Language (UML) dengan bahasa pemrograman",
|
55 |
+
"Berdasarkan masalah tersebut, maka penulis termotivasi untuk membangun Pengembangan Sistem Informasi Manajemen"]
|
56 |
+
},
|
57 |
+
}
|
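`PROMPT_LIST` nests three levels: model name, then category, then a list of prompt strings. As a rough illustration of how a caller might walk that structure, here is a small sketch. The helpers `categories`, `pick_prompt`, and `iter_prompts` are hypothetical and do not appear in this commit, and the dictionary is a trimmed stand-in so the snippet runs on its own; in the app one would presumably import the real `PROMPT_LIST` from `prompts.py`.

```python
import random

# Trimmed stand-in with the same nesting as PROMPT_LIST in prompts.py.
PROMPT_LIST = {
    "Indonesian GPT-2": {
        "Resep masakan (recipe)": [
            "Berikut adalah cara memasak sate ayam:\n",
            "Langkah-langkah membuat nasi goreng:\n",
        ],
        "Sejarah (history)": [
            "Mohammad Natsir adalah seorang ulama, politisi, dan pejuang kemerdekaan Indonesia.",
        ],
    },
    "Indonesian Journal": {
        "Biologi (biology)": [
            "Identifikasi spesies secara molekuler sangat diperlukan dalam mempelajari taksonomi",
        ],
    },
}


def categories(prompts, model):
    """Return the category names for one model, in insertion order."""
    return list(prompts[model])


def pick_prompt(prompts, model, category, rng=random):
    """Pick one prompt at random; pass a seeded random.Random for repeatability."""
    return rng.choice(prompts[model][category])


def iter_prompts(prompts):
    """Yield (model, category, prompt) for every prompt in the nested dict."""
    for model, by_category in prompts.items():
        for category, texts in by_category.items():
            for text in texts:
                yield model, category, text


if __name__ == "__main__":
    print(categories(PROMPT_LIST, "Indonesian GPT-2"))
    print(pick_prompt(PROMPT_LIST, "Indonesian GPT-2", "Resep masakan (recipe)"))
```

Keeping the lookup behind small functions lets a UI populate one dropdown from the model keys and a second from `categories` without touching the dictionary layout directly.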
requirements.txt
ADDED
@@ -0,0 +1,10 @@
|
1 |
+
numpy
|
2 |
+
torch
|
3 |
+
tokenizers
|
4 |
+
transformers
|
5 |
+
datasets
|
6 |
+
mtranslate
|
7 |
+
# streamlit version 0.67.1 is needed due to issue with caching
|
8 |
+
# streamlit==0.67.1
|
9 |
+
streamlit
|
10 |
+
psutil
|