tpeng726 committed on
Commit
5e0d761
1 Parent(s): 7484062

Make some quick consistency fixes to model card

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -9,7 +9,7 @@ license: llama3
 ---
 <a href="https://www.gradient.ai" target="_blank"><img src="https://cdn-uploads.huggingface.co/production/uploads/655bb613e8a8971e89944f3e/TSa3V8YpoVagnTYgxiLaO.png" width="200"/></a>
 
-# Llama-3 70B Gradient Instruct 262K
+# Llama-3 70B Instruct Gradient 262K
 
 Gradient incorporates your data to deploy autonomous assistants that power critical operations across your business. If you're looking to build custom AI models or agents, email us a message [email protected].
 
@@ -40,14 +40,14 @@ For training data, we generate long contexts by augmenting [SlimPajama](https://
 | Initialize From | 65K | 262K |
 |-------------------------|----------------------|------------|
 | Sequence Length 2^N | 16 | 18 |
-| RoPE theta | 15,296,098 | 207,112,184|
+| RoPE Theta | 15,296,098 | 207,112,184|
 | Batch Size | 1 | 1 |
 | Gradient Accumulation Steps | 1 | 1 |
 | Steps | 20 | 25 |
 | Total Tokens | 83,886,080 | 104,857,600|
-| Learning rate | 0.00002 | 0.00002 |
+| Learning Rate | 0.00002 | 0.00002 |
 | # GPUs | 512 | 512 |
-| Ring parallelism | 64 | 16 |
+| Ring Parallelism | 64 | 16 |
 | GPU Type | NVIDIA L40S | NVIDIA L40S|
 | Minutes to Train (Wall) | 100 | 170 |
 
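The figures in the training table of the diff above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the dictionary layout and the `summarize` helper are hypothetical names introduced here, and the numbers are copied from the table. It expands the `Sequence Length 2^N` exponents and divides `Total Tokens` by `Steps`. The resulting "sequences per step" value is a derived quantity; the diff does not say how it follows from the batch size and parallelism settings.

```python
# Values copied from the training table in the diff above.
# The dict layout and key names are assumptions made for this sketch.
stages = {
    "65K": {"seq_len_exp": 16, "steps": 20, "total_tokens": 83_886_080},
    "262K": {"seq_len_exp": 18, "steps": 25, "total_tokens": 104_857_600},
}


def summarize(stage):
    """Derive per-step quantities from one column of the table."""
    seq_len = 2 ** stage["seq_len_exp"]  # "Sequence Length 2^N"
    tokens_per_step = stage["total_tokens"] // stage["steps"]
    return {
        "seq_len": seq_len,
        "tokens_per_step": tokens_per_step,
        # Derived here, not stated in the table.
        "sequences_per_step": tokens_per_step // seq_len,
    }


summaries = {name: summarize(stage) for name, stage in stages.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Both columns come out to 4,194,304 tokens per step (2^22), with sequence lengths of 65,536 and 262,144 tokens, which matches the 65K and 262K column labels.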