Update README.md
README.md
CHANGED
@@ -22,7 +22,7 @@ This model represents a significant stride in LLM research, specifically address
 - **Context Window Size**: 1024 tokens
 
 ## Training
-- **Dataset**:
+- **Dataset**: Scraped texts contains scientific articles, and general texts
 - **Data Size**: 23 GB
 - **Tokenizer**: Aranizer 64K
 - **Tokens**: Over 3.3 billion