---
license: apache-2.0
language:
- vi
- en
---

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Tuan Pham (FPTU HCM Student)
- **Model type:** Llama2-7B decoder-only
- **Finetuned from model:** meta-llama/Llama-2-7b, bkai-foundation-models/vietnamese-llama2-7b-120GB, yeen214/llama2_7b_merge_orcafamily
- **Bilingual support:** English and Vietnamese

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:**
  * Training: https://github.com/vTuanpham/Vietnamese_QA_System
  * Data: https://github.com/vTuanpham/Large_dataset_translator
- **Paper:** ...
- **Demo:** ...

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Prompt template

```
[SYSTEM_PROMPT]

####### Instruction:
[INPUT]

%%%%%%% Response:
[RESPONSE]
```

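As a minimal sketch, a helper along these lines (the function name and the default system prompt text are illustrative, not taken from the training code) can render the template before passing text to the model:

```python
# Illustrative helper that renders the prompt template above.
# The separator strings mirror the template exactly; the system prompt
# wording is an assumption, not necessarily the one used during training.
def build_prompt(instruction: str,
                 system_prompt: str = "You are a helpful bilingual assistant.") -> str:
    return (
        f"{system_prompt}\n\n"
        f"####### Instruction:\n{instruction}\n\n"
        f"%%%%%%% Response:\n"
    )

prompt = build_prompt("Việt Nam có bao nhiêu tỉnh thành?")
print(prompt)
```
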
## How to Get Started with the Model

Use the code below to get started with the model.
```python
import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_name = "1TuanPham/InstructEnVi_llama2-bkai-120GB-Orcafamily_250kx3.37_350kx1.1"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.bfloat16,
                                             use_cache=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
# Stream tokens to stdout as they are generated
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
    output_default = pipe("Phạm Nhật Vượng là ",
                          pad_token_id=tokenizer.eos_token_id,
                          max_new_tokens=128)
```
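For prompted use, the same pipeline can be fed text rendered with the template from the Uses section (reusing the illustrative `build_prompt` helper sketched there):

```python
# Reuses `pipe` and `tokenizer` from the snippet above and the
# illustrative `build_prompt` helper from the Uses section.
with autocast():
    output = pipe(build_prompt("Hãy giới thiệu về Việt Nam."),
                  pad_token_id=tokenizer.eos_token_id,
                  max_new_tokens=256)
```
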
## Training Details

**Hardware Type:**
* GPU: NVIDIA Tesla P100 16GB
* System RAM: 29GB

**Hours used:** ~42.5 (approximate)

### Training Data

* BactrianX
* OpenOrca_translated
* WizardLM_70k_translated
* TigerLabMathInstruct_translated_vi
* GradeSchoolMathInstruct_translated
* vilm_lima-vi
* MTEngVietnamese
* databricks_dolly15k_translated
* AlpacaCleaned_translated
* databricks_dolly15k
* OpenOrca
* GradeSchoolMathInstruct
* AlpacaCleaned
* WebglmQA

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

* Learning rate: 2e-5, cosine schedule
* Optimizer: PagedLion8bit
* QLoRA: rank 64, 4-bit quantization (a configuration sketch follows below)

- 250k examples (70% Vietnamese, 30% English) for 3.37 epochs
- 350k examples (60% Vietnamese, 40% English) for 1.1 epochs

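A minimal sketch of this recipe, assuming a standard transformers + peft + bitsandbytes QLoRA setup (the target modules, LoRA alpha, dropout, and other values not stated on this card are placeholders):

```python
# Sketch of the stated recipe: QLoRA rank 64, 4-bit quantization,
# PagedLion8bit optimizer, learning rate 2e-5 with a cosine schedule.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                        # 4-bit quantization as stated
    bnb_4bit_quant_type="nf4",                # assumption: NF4 quant type
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "bkai-foundation-models/vietnamese-llama2-7b-120GB",  # one of the listed base models
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64,                                     # rank 64 as stated
    lora_alpha=16,                            # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    lora_dropout=0.05,                        # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="out",
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    optim="paged_lion_8bit",                  # PagedLion8bit via bitsandbytes
    bf16=True,
)
```

The `optim="paged_lion_8bit"` value maps to bitsandbytes' PagedLion8bit optimizer.
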
### Training loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/3e7Ep0KQ6qNAMqnL6bmyE.png)

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Results

[More Information Needed]

## Technical Specifications

### Model Architecture and Objective

[More Information Needed]

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

## Model Card Authors

## Model Card Contact

[More Information Needed]