Commit · be83a0d
1 Parent(s): c7e7a9d
Update README.md
README.md CHANGED
@@ -9,6 +9,12 @@ license: apache-2.0
|
|
9 |
## What is the baizemocracy-lora-7B-cfqa model?
|
10 |
|
11 |
This model is an open-source chat model fine-tuned with [LoRA](https://github.com/microsoft/LoRA), inspired by the [Baize project](https://github.com/project-baize/baize-chatbot/tree/main/). It was trained with the Baize datasets and the ask2democracy-cfqa-salud-pension dataset, which contains almost 4k instructions for answering questions based on a context relevant to citizen concerns and public debate in Spanish.
|
12 |
|
13 |
- **Developed by:**
|
14 |
- 🇨🇴 [Jorge Henao](https://huggingface.co/jorge-henao)
|
@@ -33,4 +39,24 @@ This model is an open-source chat model fine-tuned with [LoRA](https://github.co
|
|
33 |
- [Alpaca chat Dialogs](https://github.com/project-baize/baize)
|
34 |
- [Medical chat Dialogs](https://github.com/project-baize/baize)
|
35 |
|
36 |
More details can be found in the Ask2Democracy [GitHub](https://github.com/jorge-henao/ask2democracy)
|
|
|
9 |
## What is the baizemocracy-lora-7B-cfqa model?
|
10 |
|
11 |
This model is an open-source chat model fine-tuned with [LoRA](https://github.com/microsoft/LoRA), inspired by the [Baize project](https://github.com/project-baize/baize-chatbot/tree/main/). It was trained with the Baize datasets and the ask2democracy-cfqa-salud-pension dataset, which contains almost 4k instructions for answering questions based on a context relevant to citizen concerns and public debate in Spanish.
|
12 |
+
Two main model experiments were carried out during the Hackathon Somos NLP 2023: a conversation-focused model and a context-focused model.
|
13 |
+
This model focuses on a more conversational way of asking questions. See the About pre-processing section.
|
14 |
+
There is another model variation that focuses more on context-based retrieval augmentation: [Baizemocracy-contextfocused](https://github.com/project-baize/baize-chatbot/tree/main/).
|
15 |
+
|
16 |
+
Testing is a work in progress. We decided to share both model variations with the community so that more people can experiment with what works better and find other possible use cases.
|
17 |
+
|
18 |
|
19 |
- **Developed by:**
|
20 |
- 🇨🇴 [Jorge Henao](https://huggingface.co/jorge-henao)
|
|
|
39 |
- [Alpaca chat Dialogs](https://github.com/project-baize/baize)
|
40 |
- [Medical chat Dialogs](https://github.com/project-baize/baize)
|
41 |
|
42 |
+
### About pre-processing
|
43 |
+
|
44 |
+
<code>
# Build a [|Human|]/[|AI|] dialog prompt from the question and answer only (no context passage).
def format_instruction_without_context(example):
    example["topic"] = example["input"]
    # "The conversation between a human and an AI assistant."
    prompt = "La conversación entre un humano y un asistente de IA."
    prompt += "\n[|Human|] " + example["input"]
    prompt += "\n[|AI|] " + example["output"]
    if len(example["topics"]) > 0:
        topics = ", ".join(example["topics"])
        # Extra turn: "Which topics would you classify your answer under?"
        prompt += "\n[|Human|] " + "¿En cuáles tópicos clasificarías su respuesta?"
        # "Here is a list of topics: ..."
        prompt += "\n[|AI|] " + f"Aquí una lista de tópicos: {topics}."
        example["topic"] += f" ({topics})"
    example["input"] = prompt
    return example

# data_reforma_salud_cfqa is defined outside this snippet; the .map() call suggests a Hugging Face datasets object.
data_reforma_salud_cfqa_without_context = data_reforma_salud_cfqa.map(format_instruction_without_context, remove_columns=['output','topics','instruction'])
data_reforma_salud_cqa_withou
</code>
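To make the effect of the pre-processing step easier to see, here is a minimal sketch that applies the same formatting logic to a single record. The `sample` record and its field values are hypothetical placeholders, not rows from the ask2democracy-cfqa-salud-pension dataset, and the function is restated with a local variable named `prompt` so that the block runs on its own without the `datasets` library.

```python
def format_instruction_without_context(example):
    """Turn one record into a [|Human|]/[|AI|] dialog prompt (no context passage)."""
    example["topic"] = example["input"]
    prompt = "La conversación entre un humano y un asistente de IA."
    prompt += "\n[|Human|] " + example["input"]
    prompt += "\n[|AI|] " + example["output"]
    if len(example["topics"]) > 0:
        topics = ", ".join(example["topics"])
        # Extra dialog turn asking the assistant to classify its own answer.
        prompt += "\n[|Human|] " + "¿En cuáles tópicos clasificarías su respuesta?"
        prompt += "\n[|AI|] " + f"Aquí una lista de tópicos: {topics}."
        example["topic"] += f" ({topics})"
    example["input"] = prompt
    return example


# Hypothetical record: every value below is a placeholder for illustration.
sample = {
    "instruction": "Instrucción de ejemplo",
    "input": "Pregunta de ejemplo",
    "output": "Respuesta de ejemplo",
    "topics": ["tema A", "tema B"],
}

# Work on a copy so the original record stays untouched.
formatted = format_instruction_without_context(dict(sample))
print(formatted["input"])
print(formatted["topic"])
```

With a non-empty `topics` list the prompt gains a second question and answer turn that lists the topics, and the `topic` field becomes the original question followed by the topics in parentheses. With an empty list only the first turn is produced.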
|
60 |
+
|
61 |
+
|
62 |
More details can be found in the Ask2Democracy [GitHub](https://github.com/jorge-henao/ask2democracy)
|