tpeng726 committed on
Commit b156a3e · verified · 1 Parent(s): d5aa9c0

Update title consistency

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -9,7 +9,7 @@ license: llama3
 ---
 <a href="https://www.gradient.ai" target="_blank"><img src="https://cdn-uploads.huggingface.co/production/uploads/655bb613e8a8971e89944f3e/TSa3V8YpoVagnTYgxiLaO.png" width="200"/></a>
 
-# Llama-3 8B Gradient Instruct 4194K (Work-in-progress)
+# Llama-3 8B Instruct Gradient 4194K (Work-in-progress)
 
 Join our custom agent and long context (262k-1M+) waitlist: https://forms.gle/L6TDY7dozx8TuoUv7
 
@@ -44,7 +44,7 @@ For training data, we generate long contexts by augmenting [SlimPajama](https://
 |------------------------|-----------|-----------|-----------|-----------|-----------|
 | Initialize From | LLaMA-3 8B| 65K | 262K | 524k | 1048k |
 | Sequence Length 2^N | 16 | 18 | 19 | 20 | 22 |
-| RoPE theta | 15.3 M | 207.1 M | 1.06B | 2.80B | 45.2B |
+| RoPE Theta | 15.3 M | 207.1 M | 1.06B | 2.80B | 45.2B |
 | Batch Size | 1 | 1 | 16 | 8 | 2 |
 | Gradient Accumulation Steps | 32 | 16 | 1 | 1 | 2 |
 | Steps | 30 | 24 | 50 | 50 | 12 (stopped early) |
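
The Sequence Length 2^N, Batch Size, and Gradient Accumulation Steps rows in the table above can be turned into concrete numbers with a little arithmetic. The Python sketch below is only an illustration: the names `STAGES` and `stage_summary` are hypothetical, and it assumes that Batch Size counts full-length sequences exactly as listed (the diff does not say whether the figure is per device or global), so the token counts are rough estimates rather than reported results.

```python
# Illustrative arithmetic only; the values are copied from the table rows in the diff.
# Assumption: "Batch Size" counts full-length sequences as listed in the table.
STAGES = [
    # (exponent N, batch size, gradient accumulation steps)
    (16, 1, 32),
    (18, 1, 16),
    (19, 16, 1),
    (20, 8, 1),
    (22, 2, 2),
]


def stage_summary(exponent, batch_size, grad_accum):
    """Return the sequence length and per-optimizer-step counts for one stage."""
    seq_len = 2 ** exponent                 # "Sequence Length 2^N" row
    sequences = batch_size * grad_accum     # sequences consumed per optimizer step
    return {
        "seq_len": seq_len,
        "sequences_per_step": sequences,
        "tokens_per_step": seq_len * sequences,
    }


SUMMARIES = [stage_summary(*stage) for stage in STAGES]

if __name__ == "__main__":
    for (exponent, _, _), summary in zip(STAGES, SUMMARIES):
        print(f"2^{exponent}: {summary}")
```

Under these assumptions the last column works out to a sequence length of 2^22 = 4,194,304 tokens, which matches the 4194K in the retitled heading.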