---
license: apache-2.0
tags:
- jamba
datasets:
- Locutusque/hercules-v4.0
base_model: ai21labs/Jamba-v0.1
pipeline_tag: text-generation
---

# Jamba-Hercules

<img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/tIpjF0sb9Bqo4TiZ9Z5Up.webp" width="500" height="500">

## Dataset used: Locutusque/hercules-v4.0
*- First 10k Examples*

---

## Example Output: 

```
<|startoftext|><|im_start|>system 
You are a helpful AI assistant.
<|im_end|> 
<|im_start|>user
A tree needs to leave the earth and become astro-biological. How can it achieve it's goal?
<|im_end|> 
<|im_start|>assistant
The process of becoming astronautic life for a tree is an excemtally complicated one that requires a great deal of preparation, coordination, and funding. The first step in this process is for the tree to be identified as a potential candidate for astronautics, which may involve a thorough assessment of its size, health, and suitability for space travel.
Once a tree has been identified as a possible astronautic life-form, the next step involves preparing it for spaceflight. This includes training the tree to tolerate low gravity environments, teaching it how to navigate in zero gravity, and ensuring that its nutritional needs are met in space. The process of acclimating a tree to spaceflight can take months or even years, depending on the species and the specific requirements of each mission.
Once the tree has been adequately trained and prepared for spaceflight, it must then be transported into orbit or beyond. This may involve launching the tree aboard a rocket or shuttle, or using other methods such as balloons or parachutes to reach its destination. The journey itself can be long and arduous, with many potential risks and challenges along the way.
Upon arrival in space, the tree will need to be monitored closely and cared for by astronautic personnel. This may involve feeding the tree, maintaining its health and well-being, and ensuring that it has enough water and nutrients to survive. The tree may also need to be protected from radiation, extreme temperatures, or other hazards that it may face in space.
Over time, as the tree adapts to life in space, it will likely undergo some changes in its physical appearance and behavior. For example, it may begin growing in new ways, developing different kinds of leaves or branches in response to the unique environment it finds itself in. It may also become more active and mobile, using its roots to navigate and explore its new home.
Throughout this process, there are many challenges that need to be addressed and overcome.<|im_end|>
<|endoftext|>
```

## Inference

*※ - This code automatically prepends the "<|startoftext|>" special token to any input. Every input must start with this token at inference time; initial testing shows that leaving it out leads to output errors.*

```py

# Notebook cell: quote the version specifiers so the shell does not treat ">" as a redirect
!pip install -qqq "transformers>=4.39.0" mamba-ssm "causal-conv1d>=1.2.0" accelerate bitsandbytes --progress-bar off
!pip install flash-attn --no-build-isolation

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

double_quant_config = BitsAndBytesConfig(
   load_in_4bit=True,
   bnb_4bit_use_double_quant=True,
   bnb_4bit_compute_dtype=torch.float16
)

model = AutoModelForCausalLM.from_pretrained(
    "Severian/Jamba-Hercules",
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    quantization_config=double_quant_config,
)
tokenizer = AutoTokenizer.from_pretrained("Severian/Jamba-Hercules")

input_text = """<|im_start|>system 
You are a helpful AI assistant.
<|im_end|> 
<|im_start|>user
A tree needs to leave the earth and become astro-biological. How can it achieve it's goal?
<|im_end|> 
<|im_start|>assistant
"""

input_ids = tokenizer(input_text, return_tensors='pt').to(model.device)["input_ids"]

# Greedy decoding: sampling is disabled, so no temperature is needed
outputs = model.generate(input_ids, max_new_tokens=1024, do_sample=False, repetition_penalty=1.1)

print(tokenizer.batch_decode(outputs)[0])
# <|startoftext|><|im_start|>system 
# You are a helpful AI assistant.
# <|im_end|> 
# <|im_start|>user
# A tree needs to leave the earth and become astro-biological. How can it achieve it's goal?
# <|im_end|> 
# <|im_start|>assistant
# The process of becoming astronautic life for a tree is an excemtally complicated one that requires a great deal of preparation, coordination, and funding. The first step in this process is for the tree to be identified as a potential candidate for astronautics, which may involve a thorough assessment of its size, health, and suitability for space travel.
# Once a tree has been identified as a possible astronautic life-form, the next step involves preparing it for spaceflight. This includes training the tree to tolerate low gravity environments, teaching it how to navigate in zero gravity, and ensuring that its nutritional needs are met in space. The process of acclimating a tree to spaceflight can take months or even years, depending on the species and the specific requirements of each mission.
# Once the tree has been adequately trained and prepared for spaceflight, it must then be transported into orbit or beyond. This may involve launching the tree aboard a rocket or shuttle, or using other methods such as balloons or parachutes to reach its destination. The journey itself can be long and arduous, with many potential risks and challenges along the way.
# Upon arrival in space, the tree will need to be monitored closely and cared for by astronautic personnel. This may involve feeding the tree, maintaining its health and well-being, and ensuring that it has enough water and nutrients to survive. The tree may also need to be protected from radiation, extreme temperatures, or other hazards that it may face in space.
# Over time, as the tree adapts to life in space, it will likely undergo some changes in its physical appearance and behavior. For example, it may begin growing in new ways, developing different kinds of leaves or branches in response to the unique environment it finds itself in. It may also become more active and mobile, using its roots to navigate and explore its new home.
# Throughout this process, there are many challenges that need to be addressed and overcome.<|im_end|>
# <|endoftext|>
```
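The `input_text` literal above follows a ChatML-style layout. As a minimal sketch, the same string can be assembled from a system and a user message with a small helper. `build_prompt` is a hypothetical name, not part of this repository, and it reproduces the whitespace of the literal above exactly (including the trailing spaces after `system` and `<|im_end|>`); whether the model is sensitive to that whitespace has not been tested here. The `<|startoftext|>` token is left out on purpose, since it is added during tokenization.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the layout used by this card.

    Hypothetical helper for illustration; it mirrors the `input_text`
    literal from the inference snippet character for character.
    """
    return (
        "<|im_start|>system \n"
        f"{system}\n"
        "<|im_end|> \n"
        "<|im_start|>user\n"
        f"{user}\n"
        "<|im_end|> \n"
        "<|im_start|>assistant\n"
    )


prompt = build_prompt(
    "You are a helpful AI assistant.",
    "A tree needs to leave the earth and become astro-biological. "
    "How can it achieve it's goal?",
)
print(prompt)
```

The returned string can be passed to the tokenizer in place of `input_text`.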

---
## Training


### **Hercules-v4.0:** 

**FIRST TEST:**
- *1250 Steps (5 hours x A100)* 
- *Final Loss: 0.98*


### Hyperparameters

```py

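import torch
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# `model`, `tokenizer`, `train_dataset`, and `max_seq_length` come from the
# rest of the training script, which is not shown here.

lora_config = LoraConfig(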
    r=16,
    lora_alpha=32,
    target_modules=["embed_tokens", "x_proj", "in_proj", "out_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    bias="none"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=TrainingArguments(
        num_train_epochs=1,
        lr_scheduler_type='cosine',
        learning_rate=0.0002,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        gradient_checkpointing=True,
        warmup_steps=10,
        weight_decay=0.01,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        save_steps=200,
        output_dir="outputs",
        optim="adamw_bnb_8bit",
        adam_epsilon=0.00001,
        adam_beta2=0.95,
        max_grad_norm=1.0,
        seed=42,
    ),
)

```
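The step count reported under Training is consistent with these settings. The short calculation below is a sketch, assuming the run used a single GPU and exactly the first 10,000 examples (`num_devices` and `num_examples` are assumptions drawn from the notes in this card, not from the training script). It also shows the LoRA scaling factor `lora_alpha / r` implied by the config, which is the standard LoRA scaling.

```python
# Values taken from the hyperparameters above
per_device_train_batch_size = 1
gradient_accumulation_steps = 8
num_train_epochs = 1
lora_alpha = 32
r = 16

# Assumptions (not stated in the training code itself)
num_examples = 10_000  # "First 10k Examples"
num_devices = 1        # single A100

# Examples consumed per optimizer step
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Optimizer steps over the whole run
total_steps = (num_examples // effective_batch_size) * num_train_epochs

# Standard LoRA scaling applied to the low-rank update
lora_scale = lora_alpha / r

print(effective_batch_size)  # 8
print(total_steps)           # 1250
print(lora_scale)            # 2.0
```

With `logging_steps=1` the loss is logged at every one of these steps, and `save_steps=200` writes a checkpoint every 200 steps.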