PhucDanh committed
Commit c7a568a · verified · 1 Parent(s): cdfcbe4

Update README.md

update README files

Files changed (1)
  1. README.md +55 -38
README.md CHANGED
@@ -13,24 +13,22 @@ metrics:
pipeline_tag: question-answering
---
# Model Card for ViT5-base fine-tuned model for question answering task
-
## Overview
ViT5 is a pretrained text-to-text transformer model designed specifically for Vietnamese language generation tasks. It is based on the T5 (Text-to-Text Transfer Transformer) architecture developed by Google, which has been adapted and fine-tuned for the Vietnamese language. ViT5 is capable of handling various natural language processing (NLP) tasks such as translation, summarization, question answering, and text generation, all within the Vietnamese linguistic context.
-
## Question answering view
- 1. Task Formulation
+ 1. **Task Formulation**<br>
In the text-to-text framework, the question answering task is formulated as "Answer the question: [question] Context: [context]".
The input consists of a question and a related context (a passage or document) that contains the information needed to answer the question.
- 2. Input Processing
+ 2. **Input Processing**<br>
Tokenization: The combined question and context are tokenized into subword units using ViT5's tokenizer, which is pretrained for Vietnamese.
Task Specification: The input is prefixed with a task-specific instruction to help the model understand the nature of the task.
- 3. Encoding
+ 3. **Encoding**<br>
Embedding: The tokenized input is converted into embeddings.
Self-Attention: The encoder applies self-attention mechanisms to generate context-aware representations of the input text, integrating information from both the question and the context.
- 4. Decoding
+ 4. **Decoding**<br>
Conditional Generation: The decoder generates the output text (the answer) based on the encoded representations. The cross-attention mechanism helps the decoder focus on relevant parts of the context while generating the answer.
Output Tokenization: The generated tokens are converted back into human-readable text (the answer).
- 5. Post-Processing
+ 5. **Post-Processing**<br>
Detokenization: The output tokens are detokenized to form a coherent and fluent answer.
Answer Extraction: The model's output is refined to ensure that the generated text is a precise and relevant answer to the input question.
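
The five numbered steps above can be sketched end to end in code. The following is a minimal illustration of the text-to-text formulation only, assuming the public VietAI/vit5-base checkpoint and the prompt template from step 1; it is a sketch, not the exact training or inference code behind this repository:

```py
# Sketch of the text-to-text QA flow described above (steps 1-5).
# Assumes the public VietAI/vit5-base checkpoint from the Hugging Face Hub.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("VietAI/vit5-base")

question = "Trường UIT mang trong mình nhiệm vụ gì?"
context = "..."  # the passage that should contain the answer

# Steps 1-2: task formulation and input processing (task prefix + tokenization)
inputs = tokenizer(f"Answer the question: {question} Context: {context}",
                   return_tensors="pt", truncation=True)

# Steps 3-4: encoding and conditional decoding
output_ids = model.generate(**inputs, max_new_tokens=64)

# Step 5: post-processing (detokenization into a readable answer)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```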
 
@@ -52,24 +50,30 @@ The Vietnamese QA dataset, created by Nguyen et al. (2020), is known as UIT-ViQu
- plausible_answer: For unanswerable questions, this provides a seemingly correct but actually incorrect answer extracted from the context.

# Hyperparameters used in the fine-tuning process
- - epochs = 4
- - batch_size = 16
- - learning rate = 2e-5
- - evaluation strategy = "steps"
- - save_total_limit = 1
- - save_steps = 2000
- - eval_steps = 2000
- - gradient_accumulation_steps = 2
- - eval_accumulation_steps = 2
- - load_best_model_at_end = True
-
- # Best result
- - epoch = 3250207813798838
- - grad_norm = 136582374572754
- - learning_rate = 3.3610648918469217e-06
- - loss = 0.9397
- - step = 2000
- - eval_loss = 0.7907648682594299
+ - epochs: 4
+ - batch_size: 16
+ - learning_rate: 2e-5
+ - evaluation_strategy: "steps"
+ - save_total_limit: 1
+ - save_steps: 2000
+ - eval_steps: 2000
+ - gradient_accumulation_steps: 2
+ - eval_accumulation_steps: 2
+ - load_best_model_at_end: True
+
+ # Best model saved while tuning
+ - epoch: 3.3264033264033266
+ - learning_rate: 3.3679833679833685e-06
+ - train_loss: 0.4473
+ - eval_loss: 1.2475123405456543
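
For anyone reproducing the run, the hyperparameters listed above map onto the standard `transformers` Trainer API roughly as follows. This is a hedged sketch: the training script itself is not part of this commit, and the output directory name is hypothetical.

```py
# The listed hyperparameters expressed as Hugging Face TrainingArguments.
# "vit5-qa-checkpoints" is a hypothetical output path.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vit5-qa-checkpoints",
    num_train_epochs=4,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    evaluation_strategy="steps",
    save_total_limit=1,
    save_steps=2000,
    eval_steps=2000,
    gradient_accumulation_steps=2,
    eval_accumulation_steps=2,
    load_best_model_at_end=True,  # keep the checkpoint with the best eval loss
)
```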
+
+ # Evaluation
+ - validation:
+   - F1-score: 75.4081
+   - Exact-match: 58.6788
+ - test:
+   - F1-score: 78.646
+   - Exact-match: 59.147
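
F1-score and Exact-match above are the usual SQuAD-style answer metrics. As a reference for how such numbers are computed, here is a minimal sketch assuming plain whitespace tokenization (real evaluation scripts also normalize case and punctuation):

```py
# SQuAD-style metrics: exact string match and token-overlap F1.
from collections import Counter

def exact_match(prediction: str, truth: str) -> float:
    return float(prediction.strip() == truth.strip())

def f1(prediction: str, truth: str) -> float:
    pred_tokens, truth_tokens = prediction.split(), truth.split()
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```
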
  # Inference
  ## Using a pipeline as a high-level helper
@@ -84,7 +88,7 @@ question="""
Trường UIT mang trong mình nhiệm vụ gì?
"""

- pipe = pipeline("question-answering", model="PhucDanh/Bartpho-fine-tuning-model-for-question-answering")
+ pipe = pipeline("question-answering", model="PhucDanh/vit5-fine-tuning-for-question-answering")
pipe(question=question, context=context)
```
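
For reference, the `question-answering` pipeline returns a dict of the form `{"score": ..., "start": ..., "end": ..., "answer": ...}`, where `start` and `end` are character offsets of the extracted answer span within the context.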
 
@@ -102,12 +106,15 @@ question="""
Trường UIT mang trong mình nhiệm vụ gì?
"""

- tokenizer = AutoTokenizer.from_pretrained("PhucDanh/Bartpho-fine-tuning-model-for-question-answering")
- tokenizer.model_input_names.remove("token_type_ids")
+ tokenizer = AutoTokenizer.from_pretrained("PhucDanh/vit5-fine-tuning-for-question-answering")
+ try:
+     tokenizer.model_input_names.remove("token_type_ids")
+ except ValueError:
+     print("token_type_ids already removed")

inputs = tokenizer(question, context, return_tensors="pt")

- model = AutoModelForQuestionAnswering.from_pretrained("PhucDanh/Bartpho-fine-tuning-model-for-question-answering")
+ model = AutoModelForQuestionAnswering.from_pretrained("PhucDanh/vit5-fine-tuning-for-question-answering")
with torch.no_grad():
    outputs = model(**inputs)
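
The hunk ends at the forward pass; the README continues beyond the lines shown here. For completeness, the standard way to turn an extractive QA head's outputs into an answer string (a common pattern, not necessarily this README's exact continuation) is:

```py
# Decode the most likely answer span from the QA head's start/end logits.
import torch

start = torch.argmax(outputs.start_logits)
end = torch.argmax(outputs.end_logits) + 1
answer_ids = inputs["input_ids"][0][start:end]
print(tokenizer.decode(answer_ids, skip_special_tokens=True))
```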
 
@@ -123,7 +130,7 @@ Contact for API token authentication
```py
import requests

- API_URL = "https://api-inference.huggingface.co/models/PhucDanh/Bartpho-fine-tuning-model-for-question-answering"
+ API_URL = "https://api-inference.huggingface.co/models/PhucDanh/vit5-fine-tuning-for-question-answering"
headers = {"Authorization": "Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}

def query(payload):
@@ -132,20 +139,30 @@ def query(payload):

output = query({
    "inputs": {
- 	"question": "What is my name?",
- 	"context": "My name is Clara and I live in Berkeley."
- },
+ 		"question": "What is my name?",
+ 		"context": "My name is Clara and I live in Berkeley."
+ 	},
})
```

# Reference
## Model:
```
- @article{tran2021bartpho,
-   title={BartPho: pre-trained sequence-to-sequence models for Vietnamese},
-   author={Tran, Nguyen Luong and Le, Duong Minh and Nguyen, Dat Quoc},
-   journal={arXiv preprint arXiv:2109.09701},
-   year={2021}
+ @article{phan2022vit5,
+   title={ViT5: Pretrained text-to-text transformer for Vietnamese language generation},
+   author={Phan, Long and Tran, Hieu and Nguyen, Hieu and Trinh, Trieu H.},
+   journal={arXiv preprint arXiv:2205.06457},
+   year={2022}
+ }
+
+ @article{raffel2020exploring,
+   title={Exploring the limits of transfer learning with a unified text-to-text transformer},
+   author={Raffel, Colin and Shazeer, Noam and Roberts, Adam and Lee, Katherine and Narang, Sharan and Matena, Michael and Zhou, Yanqi and Li, Wei and Liu, Peter J.},
+   journal={Journal of Machine Learning Research},
+   volume={21},
+   number={140},
+   pages={1--67},
+   year={2020}
}
```
## Dataset:
 