mav23 committed
Commit cdcf913 (verified)
1 Parent(s): 49a02c2

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+egyptian-arabic-translator-llama-3-8b.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,233 @@
---
base_model: meta-llama/Meta-Llama-3-8B
license: llama3
tags:
- axolotl
- generated_from_trainer
model-index:
- name: Egyptian-Arabic-Translator-Llama-3-8B
  results: []
---

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`
```yaml
base_model: meta-llama/Meta-Llama-3-8B
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: true
load_in_4bit: false
strict: false

datasets:
  - path: translation-dataset-v3-train.hf
    type: alpaca
    train_on_split: train

test_datasets:
  - path: translation-dataset-v3-test.hf
    type: alpaca
    split: train

dataset_prepared_path: ./last_run_prepared
output_dir: ./llama_3_translator
hub_model_id: ahmedsamirio/llama_3_translator_v3


sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: false

adapter: lora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
  - gate_proj
  - down_proj
  - up_proj
  - q_proj
  - v_proj
  - k_proj
  - o_proj

wandb_project: en_eg_translator
wandb_entity: ahmedsamirio
wandb_name: llama_3_en_eg_translator_v3

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 2
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 2e-5

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 10
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  pad_token: <|end_of_text|>
```

</details><br>
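
Assuming the stock axolotl `0.4.1` CLI referenced above, a config file like this one is typically launched with `accelerate launch -m axolotl.cli.train config.yml`.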

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/ahmedsamirio/en_eg_translator/runs/hwzxxt0r)

# Egyptian Arabic Translator Llama-3 8B

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the [ahmedsamirio/oasst2-9k-translation](https://huggingface.co/datasets/ahmedsamirio/oasst2-9k-translation) dataset.

## Model description

This model is an attempt to create a small translation model from English to Egyptian Arabic.

## Intended uses & limitations

- Translating instruction finetuning and text generation datasets

## Inference code

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("ahmedsamirio/Egyptian-Arabic-Translator-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained("ahmedsamirio/Egyptian-Arabic-Translator-Llama-3-8B")
pipe = pipeline(task='text-generation', model=model, tokenizer=tokenizer)


# Alpaca-style prompt templates for the three translation directions
en_template = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Translate the following text to English.

### Input:
{text}

### Response:
"""

ar_template = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Translate the following text to Arabic.

### Input:
{text}

### Response:
"""

eg_template = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Translate the following text to Egyptian Arabic.

### Input:
{text}

### Response:
"""

text = """Some habits are known as "keystone habits," and these influence the formation of other habits. \
For example, identifying as the type of person who takes care of their body and is in the habit of exercising regularly, \
can also influence eating better and using credit cards less. In business, \
safety can be a keystone habit that influences other habits that result in greater productivity.[17]"""

# English -> Modern Standard Arabic, keeping only the generated continuation
ar_text = pipe(ar_template.format(text=text),
               max_new_tokens=256,
               do_sample=True,
               temperature=0.3,
               top_p=0.5,
               return_full_text=False)[0]["generated_text"]

# Modern Standard Arabic -> Egyptian Arabic
eg_text = pipe(eg_template.format(text=ar_text),
               max_new_tokens=256,
               do_sample=True,
               temperature=0.3,
               top_p=0.5,
               return_full_text=False)[0]["generated_text"]

print("Original Text:", text)
print("\nArabic Translation:", ar_text)
print("\nEgyptian Arabic Translation:", eg_text)
```
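
As a sketch of the intended use above (translating a text-generation dataset), the same `pipe` and `eg_template` can be mapped over a dataset column with 🤗 Datasets. The dataset name and the `text` column below are placeholders, not an actual dataset tied to this model:

```python
from datasets import load_dataset

# Hypothetical English dataset with a "text" column; substitute your own.
ds = load_dataset("your-org/your-english-dataset", split="train")

def translate_to_egyptian(example):
    prompt = eg_template.format(text=example["text"])
    generated = pipe(prompt,
                     max_new_tokens=256,
                     do_sample=True,
                     temperature=0.3,
                     top_p=0.5,
                     return_full_text=False)[0]["generated_text"]
    return {"text_eg": generated.strip()}

ds_eg = ds.map(translate_to_egyptian)
```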

## Training and evaluation data

[ahmedsamirio/oasst2-9k-translation](https://huggingface.co/datasets/ahmedsamirio/oasst2-9k-translation)

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 2
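
For reference, the reported total train batch size follows from the micro batch size and gradient accumulation steps above; the single-device assumption here is mine, not stated in the card:

```python
micro_batch_size = 2
gradient_accumulation_steps = 4
num_devices = 1  # assumption: training on a single GPU

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # -> 8, matching the value reported above
```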

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9661        | 0.0008 | 1    | 1.3816          |
| 0.5611        | 0.1002 | 123  | 0.9894          |
| 0.6739        | 0.2004 | 246  | 0.8820          |
| 0.5168        | 0.3006 | 369  | 0.8229          |
| 0.5582        | 0.4008 | 492  | 0.7931          |
| 0.552         | 0.5010 | 615  | 0.7814          |
| 0.5129        | 0.6012 | 738  | 0.7591          |
| 0.5887        | 0.7014 | 861  | 0.7444          |
| 0.6359        | 0.8016 | 984  | 0.7293          |
| 0.613         | 0.9018 | 1107 | 0.7179          |
| 0.5671        | 1.0020 | 1230 | 0.7126          |
| 0.4956        | 1.0847 | 1353 | 0.7034          |
| 0.5055        | 1.1849 | 1476 | 0.6980          |
| 0.4863        | 1.2851 | 1599 | 0.6877          |
| 0.4538        | 1.3853 | 1722 | 0.6845          |
| 0.4362        | 1.4855 | 1845 | 0.6803          |
| 0.4291        | 1.5857 | 1968 | 0.6834          |
| 0.6208        | 1.6859 | 2091 | 0.6830          |
| 0.582         | 1.7862 | 2214 | 0.6781          |
| 0.5001        | 1.8864 | 2337 | 0.6798          |

### Framework versions

- PEFT 0.11.1
- Transformers 4.42.3
- Pytorch 2.1.2+cu118
- Datasets 2.19.1
- Tokenizers 0.19.1
egyptian-arabic-translator-llama-3-8b.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9dbc4f9175749cb58c8a18b51c00d90e170f075cce66d342094b8c7f6e0c0f9e
size 4661211584
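
The Q4_0 GGUF added here can also be run without `transformers`. A minimal sketch with `llama-cpp-python`, assuming the package is installed and the file has been downloaded locally; the prompt reuses the Egyptian Arabic template from the README:

```python
from llama_cpp import Llama

# Assumes the Q4_0 GGUF from this commit sits next to the script.
llm = Llama(model_path="egyptian-arabic-translator-llama-3-8b.Q4_0.gguf", n_ctx=2048)

# Alpaca-style Egyptian Arabic prompt, as in the README's inference code.
prompt = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Translate the following text to Egyptian Arabic.

### Input:
Good morning, how are you today?

### Response:
"""

out = llm(prompt, max_tokens=256, temperature=0.3, top_p=0.5)
print(out["choices"][0]["text"])
```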