- Ollama
- v.1
- text-generation-inference
---

# **Llama-Express.1**

Llama-Express.1 is a 1B-parameter model based on Llama 3.2 (1B), fine-tuned on long chain-of-thought datasets. This instruction-tuned, text-only model is optimized for multilingual dialogue use cases, including agentic retrieval and summarization, and on dialogue-oriented tasks it aims to outperform many available open-source and closed chat models.

# **Use with transformers**

Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or the Auto classes together with the `generate()` function.

Make sure your installation is up to date: `pip install --upgrade transformers`.

```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Llama-Express.1"

# Load the model in bfloat16 and spread it across available devices.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

outputs = pipe(
    messages,
    max_new_tokens=256,
)

# The pipeline returns the whole conversation; the last message is the reply.
print(outputs[0]["generated_text"][-1])
```
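
The Auto classes mentioned above work as well. Here is a minimal sketch, assuming the repository ships the standard Llama 3.2 chat template that `apply_chat_template` relies on:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prithivMLmods/Llama-Express.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Build the prompt from the model's chat template and tokenize it.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```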

# **Intended Use**
1. **Multilingual Dialogue**:
   - Designed for high-quality, multilingual conversations, making it suitable for applications requiring natural, fluid dialogue across languages.

2. **Agentic Retrieval**:
   - Optimized for retrieval-based tasks where reasoning and contextual chaining are crucial for extracting and summarizing relevant information.

3. **Summarization Tasks**:
   - Effective at generating concise and accurate summaries of complex, lengthy texts for academic, professional, and casual use cases (a summarization sketch follows this list).

4. **Instruction-Following Applications**:
   - Fine-tuned for tasks requiring adherence to user-provided instructions, making it ideal for automation workflows, content creation, and virtual assistant integrations.

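To illustrate the summarization use case, here is a minimal sketch that reuses the `pipe` object from the Transformers example above; the article text is a placeholder:

```python
article = """<paste the text you want summarized here>"""  # placeholder input

messages = [
    {"role": "system", "content": "You are a helpful assistant that writes concise, faithful summaries."},
    {"role": "user", "content": f"Summarize the following text in three sentences:\n\n{article}"},
]

outputs = pipe(messages, max_new_tokens=200)

# The last message in the returned conversation is the model's summary.
print(outputs[0]["generated_text"][-1]["content"])
```
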
# **Limitations**
1. **Text-Only Modality**:
   - As a text-only model, it cannot process multimodal inputs such as images, audio, or video, which limits its use in multimedia applications.

2. **Context Length Constraints**:
   - While optimized for long chain-of-thought reasoning, very large contexts may still lead to degraded performance or truncation (a length-check sketch follows this list).

3. **Bias and Ethics**:
   - The model may reflect biases present in its training data, potentially producing outputs that are culturally insensitive or inappropriate.

4. **Performance in Low-Resource Languages**:
   - Although multilingual, its effectiveness varies across languages, with likely performance drops in underrepresented or low-resource languages.

5. **Dependency on Input Quality**:
   - Output quality is heavily influenced by the clarity and specificity of the prompt; ambiguous or vague instructions may yield suboptimal results.

6. **No Real-Time Internet Access**:
   - Without retrieval capabilities, it cannot provide up-to-date information or verify facts against the latest data.
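
For point 2, one way to guard against truncation is to count prompt tokens against the model's context window before generating. A minimal sketch; the window size is read from the base Llama 3.2 configuration:

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "prithivMLmods/Llama-Express.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

prompt = "Some long document you intend to summarize..."  # placeholder input
n_tokens = len(tokenizer(prompt)["input_ids"])

# max_position_embeddings is the context window inherited from the base model.
if n_tokens > config.max_position_embeddings:
    print(f"Prompt uses {n_tokens} tokens, exceeding the {config.max_position_embeddings}-token window.")
else:
    print(f"Prompt uses {n_tokens} of {config.max_position_embeddings} tokens.")
```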