---
license: apache-2.0
language:
- hi
- en
metrics:
- perplexity
widget:
- text: >-
    It is raining and my family
  example_title: Example 1
- text: >-
    We entered into the forest and
  example_title: Example 2
- text: >-
    I sat for doing my homework
  example_title: Example 3
---

# Custom GPT Model

## Model Description
This model was designed and pretrained from scratch, without using the Hugging Face library. It was trained independently on custom datasets focused on tailored NLP tasks, and the training process combined careful data preprocessing with training strategies aimed at improving its language understanding.

## Model Parameters
- **Block Size**: `256` (maximum sequence length)
- **Vocab Size**: `50257` (50,000 BPE merges, 256 byte-level tokens, and 1 special token)
- **Number of Layers**: `8`
- **Number of Heads**: `8`
- **Embedding Dimension**: `768`
- **Max Learning Rate**: `0.0006`
- **Min Learning Rate**: `0.00006` (10% of the max learning rate)
- **Warmup Steps**: `715`
- **Max Steps**: `52000`
- **Total Batch Size**: `524288` (number of tokens per batch)
- **Micro Batch Size**: `128`
- **Sequence Length**: `256`

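For orientation, here is how these hyperparameters fit together as a training configuration. This is an illustrative sketch only: the `GPTConfig` class, its field names, and the linear-warmup-plus-cosine-decay schedule are assumptions rather than the actual training code (the card gives warmup steps and min/max learning rates but does not state the decay shape).

```python
import math
from dataclasses import dataclass

@dataclass
class GPTConfig:  # hypothetical container; values from the list above
    block_size: int = 256     # maximum sequence length
    vocab_size: int = 50257   # 50,000 BPE merges + 256 byte tokens + 1 special token
    n_layer: int = 8
    n_head: int = 8
    n_embd: int = 768

max_lr = 0.0006
min_lr = 0.00006              # 10% of max_lr
warmup_steps = 715
max_steps = 52000

# 524,288 tokens per batch, with micro-batches of 128 sequences x 256 tokens,
# works out to 524288 / (128 * 256) = 16 gradient-accumulation steps.
grad_accum_steps = 524288 // (128 * 256)

def get_lr(step: int) -> float:
    """Linear warmup, then cosine decay to min_lr (an assumed, common schedule
    for these values; the card does not specify the decay shape)."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    ratio = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * min(ratio, 1.0)))
    return min_lr + coeff * (max_lr - min_lr)
```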
### Tokenization
For tokenization, this model uses the GPT-2 byte-pair encoding from `tiktoken`:

```python
import tiktoken

tokenizer = tiktoken.get_encoding("gpt2")
```
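As a quick sanity check (illustrative usage, not from the model card), the encoding round-trips text and its vocabulary size matches the `50257` listed above:

```python
import tiktoken

tokenizer = tiktoken.get_encoding("gpt2")

ids = tokenizer.encode("It is raining and my family")
print(len(ids), ids[:5])      # number of tokens and the first few token ids
print(tokenizer.decode(ids))  # recovers the original string
print(tokenizer.n_vocab)      # 50257, matching the vocab size above
```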