Muennighoff committed c40b034 (1 parent: b7e57d5)

Change to nolbl other ckpt

config.json CHANGED
@@ -22,7 +22,7 @@
   "pad_token_id": 1,
   "rope_scaling": null,
   "rope_theta": 10000.0,
-  "router_aux_loss_coef": 0.001,
+  "router_aux_loss_coef": 0.01,
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.44.0.dev0",
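
The only config change in this commit is `router_aux_loss_coef`, raised from 0.001 to 0.01. The following is a minimal sketch of what that implies, assuming (as is usual for mixture-of-experts models in Transformers) that the coefficient linearly scales an auxiliary router load-balancing loss added to the language-modeling loss. The helper names `changed_keys` and `total_loss` and the example loss values are hypothetical and not part of the repository.

```python
import json

# Excerpts of config.json before and after the commit (keys taken from the diff above).
OLD_CONFIG = json.loads(
    '{"rope_theta": 10000.0, "router_aux_loss_coef": 0.001, "tie_word_embeddings": false}'
)
NEW_CONFIG = json.loads(
    '{"rope_theta": 10000.0, "router_aux_loss_coef": 0.01, "tie_word_embeddings": false}'
)


def changed_keys(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


def total_loss(lm_loss: float, aux_loss: float, coef: float) -> float:
    """Assumed combination rule: language-modeling loss plus scaled auxiliary loss."""
    return lm_loss + coef * aux_loss


if __name__ == "__main__":
    print(changed_keys(OLD_CONFIG, NEW_CONFIG))
    # Illustrative values only: the same auxiliary loss weighs ten times more
    # in the total under the new coefficient.
    for coef in (OLD_CONFIG["router_aux_loss_coef"], NEW_CONFIG["router_aux_loss_coef"]):
        print(coef, total_loss(lm_loss=2.0, aux_loss=1.0, coef=coef))
```

Under that assumption the coefficient only matters when the model is trained or fine-tuned with the router loss enabled; it does not change inference outputs by itself.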
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a22526570f5ee560ab248a0add527f7f05b7adfa4b220677aa31df58c11ab811
+oid sha256:6c79ac0487c3f23e8ee3d38752197d1e4a1a39d6c1438ec5fd7862874bb19321
 size 4997744872
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e8780024da448cf65ddeb830078c7734dae0af5c590be68f9b0195d2f84eb6a3
+oid sha256:b58ca7a9f28f35d76b56b0ec40fc46c356fd84ccf078083b251a3ad6a2da9a35
 size 4997235176
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:710855707b0e45c6a06ad92b5fd2964873e4dd0db465fefd8d36a9c5de114c9e
+oid sha256:e93653ed509223e63eabae6f72ac8ea5a41115c5c7785574b9b2068ca0961c45
 size 3843741912
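
Each safetensors shard above is stored as a Git LFS pointer: three lines giving the spec version, a SHA-256 `oid`, and the byte `size`. In all three shards the size is unchanged and only the `oid` differs, which is consistent with the weights being replaced by same-shaped tensors from another checkpoint. A minimal sketch of checking a downloaded shard against its pointer follows; the function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and any local file path passed to them is an assumption.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file at `path` and compare its byte size and hash to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer for model-00001-of-00003.safetensors after this commit (from the diff above).
NEW_POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6c79ac0487c3f23e8ee3d38752197d1e4a1a39d6c1438ec5fd7862874bb19321
size 4997744872
"""

if __name__ == "__main__":
    print(parse_lfs_pointer(NEW_POINTER_TEXT))
```

Reading in chunks keeps memory use flat, which matters for shards of several gigabytes like these.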