---
license: apache-2.0
language:
- hi
- en
metrics:
- perplexity
widget:
- text: >-
    It is raining and my family
  example_title: Example 1
- text: >-
    We entered into the forest and
  example_title: Example 2
- text: >-
    I sat for doing my homework
  example_title: Example 3
---

# Custom GPT Model

## Model Description
This model was designed and pretrained from scratch, without using the Hugging Face library. It was independently trained on custom datasets targeting tailored NLP tasks, with careful data preprocessing and training strategies to strengthen its language-understanding capabilities.

## Model Parameters
- **Block Size**: `256` (maximum sequence length)
- **Vocab Size**: `50257` (50,000 BPE merges, 256 byte-level tokens, and 1 special token)
- **Number of Layers**: `8`
- **Number of Heads**: `8`
- **Embedding Dimension**: `768`
- **Max Learning Rate**: `0.0006`
- **Min Learning Rate**: `0.00006` (10% of max_lr)
- **Warmup Steps**: `715`
- **Max Steps**: `52000`
- **Total Batch Size**: `524288` (number of tokens per batch)
- **Micro Batch Size**: `128`
- **Sequence Length**: `256`

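The batch and learning-rate settings above imply a few derived quantities. Here is a minimal sketch of how they might fit together, assuming the usual tokens-per-optimizer-step accounting and a linear-warmup/cosine-decay schedule (a common choice for GPT-style pretraining; the exact schedule is not stated in this card):

```python
import math

# Hyperparameters from the list above
max_lr = 6e-4
min_lr = 6e-5               # 10% of max_lr
warmup_steps = 715
max_steps = 52000
total_batch_size = 524288   # tokens consumed per optimizer step
micro_batch_size = 128
seq_len = 256

# Vocab size breakdown: 50,000 BPE merges + 256 byte tokens + 1 special token
vocab_size = 50000 + 256 + 1  # 50257

# Micro-batches accumulated before each optimizer step:
grad_accum_steps = total_batch_size // (micro_batch_size * seq_len)  # 16

def get_lr(step: int) -> float:
    """Linear warmup, then cosine decay from max_lr down to min_lr (assumed)."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    if step > max_steps:
        return min_lr
    decay_ratio = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))  # 1 -> 0
    return min_lr + coeff * (max_lr - min_lr)
```

With these numbers, each optimizer step accumulates 16 micro-batches of 128 × 256 tokens, and the learning rate peaks at `max_lr` right as warmup ends.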
### Tokenization
For tokenization, this model uses the GPT-2 byte-pair encoding via `tiktoken`:
```python
import tiktoken

tokenizer = tiktoken.get_encoding("gpt2")
```