---
license: apache-2.0
datasets:
- cosimoiaia/Loquace-102k
language:
- it
tags:
- Italian
- Qlora
- Mistral
- finetuning
---
Model Card for Loquace-7B-Mistral

# 🇮🇹 Loquace-7B-Mistral v0.1 🇮🇹

Loquace is an Italian-speaking, instruction-finetuned Large Language Model. 🇮🇹

Loquace-7B-Mistral's peculiar features:

- It is pretty good at following instructions in Italian.
- It responds well to prompt engineering.
- It works well in a RAG (Retrieval Augmented Generation) setup (see the prompt sketch after this list).
- It has been trained on a relatively raw dataset, [Loquace-102K](https://huggingface.co/datasets/cosimoiaia/Loquace-102k), using QLoRa with Mistral-7B-Instruct as the base model.
- Training took only 4 hours on a single RTX 3090 on a [Genesis Cloud](https://gnsiscld.co/26qhlf) GPU, costing a little more than <b>1 euro</b>!
- It is <b><i>Truly Open Source</i></b>: the model, the dataset and the code to replicate the results are all released.
- It was created in a garage in the south of Italy.

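As a sketch of what the RAG setup can look like (this helper is illustrative and not part of the Loquace codebase), retrieved passages can simply be prepended to the instruction so the model answers from the supplied context:

```python
# Minimal RAG prompt sketch (an assumption, not from the official repo).
# The retriever that produces `context_passages` is assumed to exist.
def generate_rag_prompt(context_passages, question):
    context = "\n".join(context_passages)
    return f"""### Instruction: Rispondi alla domanda usando solo il contesto fornito.

Contesto:
{context}

Domanda: {question}

### Response:
"""
```

The `### Instruction:` / `### Response:` template matches the one used in the Inference section below.
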
The Loquace family of Italian LLMs is created with the goal of democratizing AI and LLMs in the Italian landscape.

<b>No more need for expensive GPUs, large funding, big corporations or ivory-tower institutions: just download the code and train on your own dataset, on your own PC (or on a cheap and reliable cloud provider like [Genesis Cloud](https://gnsiscld.co/26qhlf)).</b>

### Fine-tuning Instructions:
The related code can be found at:
https://github.com/cosimoiaia/Loquace

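For orientation, here is a minimal sketch of what a QLoRA setup typically looks like with the `peft` and `bitsandbytes` libraries. The real training script and hyperparameters live in the repository above; the adapter values and the exact base-model id below are assumptions:

```python
# Minimal QLoRA setup sketch (illustrative; the real script is in the repo).
# 4-bit frozen base weights plus small trainable LoRA adapters are what
# make single-3090 finetuning this cheap.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize the frozen base model to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",  # assumed base-model id
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(                  # adapter hyperparameters are assumptions
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only a tiny fraction of weights is trained
```
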
### GGUF Version for CPU Inference:
An 8-bit quantized version of Loquace can be found [here](https://huggingface.co/cosimoiaia/Loquace-7B-Mistral-GGUF).

Here is an incomplete list of clients and libraries that are known to support GGUF (thanks to [TheBloke](https://huggingface.co/TheBloke) for this list and his awesome work):

42
+ * [llama.cpp](https://github.com/ggerganov/llama.cpp). The source project for GGUF. Offers a CLI and a server option.
43
+ * [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
44
+ * [KoboldCpp](https://github.com/LostRuins/koboldcpp), a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for story telling.
45
+ * [LM Studio](https://lmstudio.ai/), an easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration.
46
+ * [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
47
+ * [Faraday.dev](https://faraday.dev/), an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
48
+ * [ctransformers](https://github.com/marella/ctransformers), a Python library with GPU accel, LangChain support, and OpenAI-compatible AI server.
49
+ * [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
50
+ * [candle](https://github.com/huggingface/candle), a Rust ML framework with a focus on performance, including GPU support, and ease of use.
51
+
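For example, CPU inference with the GGUF build via llama-cpp-python can look roughly like this (the GGUF filename below is an assumption; check the GGUF repository for the actual file name):

```python
from llama_cpp import Llama

# Load the quantized weights downloaded from the Loquace-7B-Mistral-GGUF repo.
# The filename is an assumption; check that repo for the actual file.
llm = Llama(model_path="./loquace-7b-mistral.Q8_0.gguf", n_ctx=2048)

output = llm(
    "### Instruction: Chi era Dante Alighieri?\n\n### Response:\n",
    max_tokens=256,
)
print(output["choices"][0]["text"])
```
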
#### Previous releases of the Loquace family:

The Loquace family started at the beginning of 2023 to prove it was possible to fine-tune a Large Language Model in a different language. You can find the other members of the family here:

- https://huggingface.co/cosimoiaia/Loquace-70m - Based on pythia-70m
- https://huggingface.co/cosimoiaia/Loquace-410m - Based on pythia-410m
- https://huggingface.co/cosimoiaia/Loquace-7B - Based on Falcon-7B
- https://huggingface.co/cosimoiaia/Loquace-12B - Based on pythia-12B
- https://huggingface.co/cosimoiaia/Loquace-20B - Based on gpt-neox-20B

These models, although not producing good results, are proof that cheap finetuning is actually possible in a short amount of time, with a relatively raw dataset and extremely low resources.

## Inference:

```python
import torch
from transformers import MistralForCausalLM, AutoTokenizer


def generate_prompt(instruction):
    # The "### Instruction: / ### Response:" template the model was finetuned on
    prompt = f"""### Instruction: {instruction}

### Response:
"""
    return prompt

model_name = "cosimoiaia/Loquace-7B-Mistral"  # or a local path to the downloaded weights

model = MistralForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16
)

model.config.use_cache = True

tokenizer = AutoTokenizer.from_pretrained(model_name, add_eos_token=False)

prompt = generate_prompt("Chi era Dante Alighieri?")
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

outputs = model.generate(
    **inputs,
    do_sample=True,
    num_beams=2,
    top_k=50,
    top_p=0.95,
    max_new_tokens=2046,
    early_stopping=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("Response:")[1].strip())
```

## Examples:
```
> Pretendi di essere Dante Alighieri e raccontami la tua vita.
Sono Dante Alighieri, un poeta e scrittore italiano del XIV secolo. Nacqui a Firenze nel 1265 e ho trascorso gran parte della mia vita viaggiando in Europa e scrivendo poesie. Ho anche scritto la Divina Commedia, una delle opere letterarie più famose di tutti i tempi.

> Quanto fa 7 x 14 - 98 ?
7 x 14 = 98, quindi la risposta è 0.

> La radice quadrata di 144 ?
La radice quadrata di 144 è 12.

> Inventa una ricetta pugliese con la polenta.
La ricetta è chiamata "Polenta al burro di arachidi e formaggio". Per prepararla, inizia bollendo la polenta secondo le istruzioni sulla confezione. Mentre la polenta si cucina, mescola insieme 1/2 tazza di burro di arachidi, 1/4 di tazza di formaggio grattugiato e 1/4 di tazza di parmigiano grattugiato. Una volta che la polenta è cotta, versala su una teglia da forno e distribuire il composto di burro di arachidi e formaggio sopra la polenta. Metti in forno a 350 gradi per 15 minuti o fino a quando la parte superiore è dorata. Servi caldo con un'insalata di pomodoro e basilico fresco.
```

## Limitations

- Loquace-7B-Mistral may not handle complex or nuanced queries well and may struggle with ambiguous or poorly formatted inputs.
- The model may generate responses that are factually incorrect or nonsensical. It should be used with caution, and outputs should be carefully verified.

## Model Author:
Cosimo Iaia <[email protected]>