Update README.md
# T5-french-base Model
## Model Overview
The T5-French-Base model is a ~250M-parameter T5 model trained entirely from scratch, solely on French data from the RedPajama 2 dataset.
It was pre-trained for 85,000 steps, without any supervised training.
Therefore, this model has to be fine-tuned before it is usable on a downstream task.
It is intended to serve as a foundation for further fine-tuning and as a starting point for downstream tasks in the French language.
Since the training compute budget was very limited, the model is mainly useful for research.
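For context on what "pre-trained only" means here: T5 pre-training is self-supervised span corruption, where random spans of raw text are replaced by sentinel tokens and the model learns to emit the dropped spans. Below is a hand-built sketch of one such training pair; the French sentence and the chosen spans are illustrative, not taken from the actual training code.

```python
# Illustrative T5 span-corruption pair, built by hand.
# Real pre-training samples spans at random (roughly 15% of tokens).
raw = "Le renard brun saute par-dessus le chien paresseux."

# Encoder input: the text with each dropped span replaced by a sentinel token.
inputs = "Le renard <extra_id_0> saute par-dessus le chien <extra_id_1>."

# Decoder target: the dropped spans, each introduced by its sentinel
# and closed by a final sentinel.
targets = "<extra_id_0> brun <extra_id_1> paresseux <extra_id_2>"
```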
## Model Details
- Model Architecture: T5 Base, version 1.1 (GEGLU activation in the feed-forward hidden layer, rather than ReLU)
- Training Dataset: RedPajama 2 dataset (French-only)
- Training Steps: 85,000 (from scratch)
- Tokenizer: T5 Tokenizer
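Because the checkpoint is a standard T5 v1.1 model, it loads with the usual Hugging Face Transformers classes. A minimal sketch, assuming a placeholder Hub id `t5-french-base` (substitute the actual repository id); note that a span-corruption-only checkpoint will at best fill in sentinel-marked blanks:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "t5-french-base"  # placeholder; replace with the actual Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# The only behaviour a purely pre-trained T5 has is denoising:
# filling the spans marked by sentinel tokens.
text = "La capitale de la France est <extra_id_0>."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```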
## Intended Use
It may be used as a starting point for fine-tuning on tasks such as: …
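As a rough illustration of that workflow, here is a minimal fine-tuning sketch using the Hugging Face Seq2SeqTrainer. The dataset name (`my_french_summaries`), its `text`/`summary` columns, and all hyperparameters are placeholders, not recommendations from the model authors:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, DataCollatorForSeq2Seq,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments,
                          T5ForConditionalGeneration)

model_id = "t5-french-base"  # placeholder Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# Hypothetical French summarization dataset with "text" and "summary" columns.
dataset = load_dataset("my_french_summaries")

def preprocess(batch):
    model_inputs = tokenizer(batch["text"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="t5-french-finetuned",
        learning_rate=1e-4,          # placeholder hyperparameters
        per_device_train_batch_size=8,
        num_train_epochs=3,
    ),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```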
## Limitations
The T5-French-Base model may not be suitable for user-facing or production applications; it is mainly intended for researchers.
It was trained entirely from scratch on a very limited compute budget (only 85,000 steps and ~250M parameters, reaching a final loss of ~1.1).
The model is a base model that hasn't been fine-tuned yet. As such, it does NOT follow instructions.
Additionally, the model was trained solely on French data and won't work for tasks that require cross-lingual understanding or multilingual capabilities.
## Ethical Considerations
The T5-French-Base model was trained from scratch on publicly available data and does not contain any known biases or ethical concerns.
However, researchers should be aware of potential biases in the RedPajama 2 training data and should carefully evaluate the model's outputs for any unintended consequences.
## Citation