Goekdeniz-Guelmez committed
Commit bba69c1
1 Parent(s): 3f944c0

Upload folder using huggingface_hub (#1)

Browse files

- 9707882e31d42174567264f5c86636e980198b3c21e95056bafa6d22136e94c6 (a49df69b5b760550f0a8694c361ddec6e74f95a4)
- 24cbf460ceb5788f84917215212a0b000e5bae45abe73908cdb85d9d9e7baeff (51a47b0f296a992e410d87b87cc6301d7de2db20)

Files changed (5)
  1. README.md +42 -0
  2. config.json +23 -0
  3. model.pth +3 -0
  4. tokenizer.json +0 -0
  5. tokenizer.model +3 -0
README.md ADDED
@@ -0,0 +1,42 @@
+ ---
+ base_model: Goekdeniz-Guelmez/j.o.s.i.e.v4o-7b-orpo-stage1-v1
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - text-generation-inference
+ - transformers
+ - unsloth
+ - qwen2
+ - trl
+ - orpo
+ - KANama
+ ---
+
+ # Goekdeniz-Guelmez/KANama-fineweb-v1-test1
+
+ The model [Goekdeniz-Guelmez/KANama-fineweb-v1-test1](https://huggingface.co/Goekdeniz-Guelmez/KANama-fineweb-v1-test1) was created using KANama.
+
+ ## Use with KANama
+
+ ```bash
+ pip install KANama transformers
+ ```
+
+ ```python
+ import torch
+ from model.handler import from_pretrained, quick_inference
+ from transformers import AutoTokenizer
+
+ # Load the tokenizer and the trained KANama model.
+ tokenizer = AutoTokenizer.from_pretrained("Doctor-Shotgun/TinyLlama-1.1B-32k")
+ model = from_pretrained("path/to/model/folder")
+
+ # Move the input tokens to an available device.
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ prompt = "hello"
+ input_tokens = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
+
+ generated_tokens, generated_text = quick_inference(model, input_tokens, max_new_tokens=50, tokenizer=tokenizer)
+ print(generated_text)
+ ```
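The `from_pretrained` call above expects a local folder. Since this repo was uploaded with `huggingface_hub`, the same library can fetch it; a minimal sketch, assuming only the repo id from the link above:

```python
from huggingface_hub import snapshot_download

# Download model.pth, config.json, and the tokenizer files into the
# local Hugging Face cache, returning the path of the snapshot folder.
local_dir = snapshot_download(repo_id="Goekdeniz-Guelmez/KANama-fineweb-v1-test1")

# This path can then replace "path/to/model/folder" in the README example.
print(local_dir)
```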
config.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "vocab_size": 152064,
+   "pad_id": 151645,
+   "eos_id": -1,
+   "dim": 256,
+   "n_layers": 18,
+   "n_heads": 12,
+   "n_kv_heads": 6,
+   "use_kan": true,
+   "train_softmax_temp": true,
+   "use_softmax_temp_proj": true,
+   "softmax_bias": false,
+   "multiple_of": 256,
+   "ffn_dim_multiplier": null,
+   "rms_norm_eps": 1e-05,
+   "rope_theta": 500000,
+   "use_scaled_rope": false,
+   "max_batch_size": 100,
+   "max_seq_len": 128,
+   "num_experts": 14,
+   "num_experts_per_tok": 4,
+   "model_type": "KANaMoEv1"
+ }
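The configuration above describes a small KAN-based mixture-of-experts model (`model_type: "KANaMoEv1"`). Its fields can be inspected with the standard library alone; a minimal sketch using only names from the file above:

```python
import json

# Read the configuration shipped alongside model.pth.
with open("config.json") as f:
    cfg = json.load(f)

# A few of the architecture hyperparameters defined above.
print(cfg["model_type"])                               # KANaMoEv1
print(cfg["n_layers"], cfg["dim"], cfg["n_heads"])     # 18 layers, hidden size 256, 12 heads
print(cfg["num_experts"], cfg["num_experts_per_tok"])  # 14 experts, 4 routed per token
```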
model.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a0f11cc63c408e1dabc2c4cf77bb9cad2fa575b26da401a4f22c666a495c544
+ size 8504870397
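`model.pth` is tracked with Git LFS, so the blob committed here is only a pointer recording the object's SHA-256 and its size in bytes. After downloading the actual weights, both values can be checked against the pointer; a minimal sketch, assuming the file was saved locally as `model.pth`:

```python
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks to avoid loading ~8.5 GB into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

# Values taken from the LFS pointer above.
assert os.path.getsize("model.pth") == 8504870397
assert sha256_of("model.pth") == "1a0f11cc63c408e1dabc2c4cf77bb9cad2fa575b26da401a4f22c666a495c544"
```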
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a8506e7111b80c6d8635951a02eab0f4e1a8e4e5772da83846579e97b16f61bf
+ size 7031673