Commit e4e7d8b (verified) by atsuki-yamaguchi · 1 Parent(s): 71a54dc

Upload README.md with huggingface_hub

  1. README.md +34 -4
README.md CHANGED
@@ -1,3 +1,4 @@
 
1
  ---
2
  license: llama3
3
  language:
@@ -5,10 +6,39 @@ language:
5
  base_model: meta-llama/Meta-Llama-3-8B
6
  library_name: transformers
7
  ---
8
 
9
- # Sinhala LLaMA3-8B model
10
 
11
- This model was adapted for Sinhala with 30K target language sentences + Random + T&B 2LS + MTP + 512.
12
 
13
- For technical details, please read the paper: https://arxiv.org/abs/2406.11477.
14
- For implementation details, please see the code repository: https://github.com/gucci-j/lowres-cve.
 
1
+
2
  ---
3
  license: llama3
4
  language:
 
6
  base_model: meta-llama/Meta-Llama-3-8B
7
  library_name: transformers
8
  ---
9
+ # Llama3 8B for Sinhala: 100 target vocabulary size + Random target vocabulary initialization + T&B2LS/MTP/512 training
10
+
11
+ This model is Llama3 8B adapted to Sinhala using 30K target-language sentences sampled from CC-100.
12
+
13
+ ## Model Details
14
+
15
+ * **Vocabulary**: This model adds 100 target-language tokens to the base vocabulary.
16
+ * **Target vocabulary initialization**: The embedding and LM head weights for the added target tokens were initialized randomly.
17
+ * **Training**: This model was additionally pre-trained on 30K target language sentences sampled from CC-100. The training was conducted with the T&B2LS/MTP/512 strategies introduced in the paper.
18
+
19
+ ## Model Description
20
+
21
+ - **Language:** Sinhala
22
+ - **License:** Llama 3 Community License Agreement
23
+ - **Fine-tuned from model:** meta-llama/Meta-Llama-3-8B
24
+
25
+
26
+ ## Model Sources
27
+
28
+ - **Repository:** https://github.com/gucci-j/lowres-cve
29
+ - **Paper:** https://arxiv.org/abs/2406.11477
30
+
31
+ ## How to Get Started with the Model
32
+ Use the code below to get started with the model.
33
+ ```python
34
+ from transformers import AutoTokenizer, AutoModelForCausalLM
35
 
36
+ model = AutoModelForCausalLM.from_pretrained(
37
+     "atsuki-yamaguchi/Llama-3-8B-si-30K-100-rand"
38
+ )
39
+ tokenizer = AutoTokenizer.from_pretrained(
40
+     "atsuki-yamaguchi/Llama-3-8B-si-30K-100-rand"
41
+ )
42
+ ```
43
 
 
44
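
The README snippet added in this commit stops after loading the model and tokenizer. A minimal sketch of a next step, generating a short continuation, might look like the following. The helper names `load` and `complete`, the Sinhala prompt, and the generation settings (`max_new_tokens=32`, greedy decoding) are illustrative assumptions and do not come from the model card or the paper.

```python
MODEL_ID = "atsuki-yamaguchi/Llama-3-8B-si-30K-100-rand"


def load(model_id=MODEL_ID):
    """Load the adapted model and its tokenizer from the Hugging Face Hub."""
    # Imported lazily so that `complete` can be reused without this dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return model, tokenizer


def complete(model, tokenizer, prompt, max_new_tokens=32):
    """Continue `prompt` without sampling and return the decoded text."""
    # Tokenize to PyTorch tensors; the result unpacks into generate() kwargs.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumed length cap, not from the source
        do_sample=False,                # assumed decoding choice, not from the source
    )
    # generate() returns a batch; decode the first (and only) sequence.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the full 8B checkpoint):
# model, tokenizer = load()
# print(complete(model, tokenizer, "ශ්‍රී ලංකාව"))  # illustrative prompt: "Sri Lanka"
```

Since this is a base model adapted with continued pre-training rather than an instruction-tuned one, plain text continuation as sketched here is likely a better fit than chat-style prompting.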