guillaumephd committed 2afb858 (verified) · Parent(s): 01a69c6

Update README.md

Files changed (1): README.md (+5 −4)
README.md CHANGED
@@ -10,8 +10,8 @@ library_name: transformers
 # T5-french-base Model
 
 ## Model Overview
-The T5-French-Base model is a ~250M params only T5 model trained solely on French data from the RedPajama 2 dataset.
-This model was trained for 85,000 steps and was only pre-trained without any supervised training.
+The T5-French-Base model is a ~250M-parameter T5 model trained entirely from scratch, solely on French data from the RedPajama 2 dataset.
+This model was pre-trained from scratch for 85,000 steps, without any supervised training.
 Therefore, this model has to be fine-tuned before it is usable on a downstream task.
 It is intended to serve as a foundation for further fine-tuning and as a starting point for downstream tasks in the French language.
 Since the training compute budget was very limited, the model is mainly useful for research.
@@ -19,7 +19,7 @@ Since the training compute budget was very limited, the model is mainly useful fo
 ## Model Details
 - Model Architecture: T5 Base, version 1.1 (GEGLU activation in the feed-forward hidden layer, rather than ReLU)
 - Training Dataset: RedPajama 2 dataset (French-only)
-- Training Steps: 85,000
+- Training Steps: 85,000 (from scratch)
 - Tokenizer: T5 Tokenizer
 
 ## Intended Use
@@ -33,12 +33,13 @@ It may be used as a starting point for fine-tuning on tasks such as:
 ## Limitations
 The T5-French-Base model may not be suitable for user-facing or production applications.
 It is mainly meant for research use.
+It was trained entirely from scratch.
 The training budget was very limited (only 85,000 steps and ~250M parameters, for a final loss of ~1.1).
 The model is a base model that has not been fine-tuned yet. As such, it does NOT follow instructions.
 Additionally, the model was trained solely on French data and will not work for tasks that require cross-lingual understanding or multilingual capabilities.
 
 ## Ethical Considerations
-The T5-French-Base model was trained on publicly available data and does not contain any known biases or ethical concerns.
+The T5-French-Base model was trained from scratch on publicly available data and has no known biases or ethical concerns.
 However, researchers should be aware of potential biases in the RedPajama 2 training data and should carefully evaluate the model's outputs for any unintended consequences.
 
 ## Citation
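The unsupervised pre-training described in the diff follows T5's span-corruption objective: contiguous spans of input tokens are replaced by sentinel tokens, and the decoder must reproduce the hidden spans. A minimal stdlib-only sketch of the input/target format is below; the sentinel names `<extra_id_0>`, `<extra_id_1>`, … follow the T5 tokenizer convention, but the whitespace tokenization and hand-picked spans are simplifications for illustration, not this model's actual preprocessing.

```python
def span_corrupt(tokens, spans):
    """Format a token list as a T5-style span-corruption (input, target) pair.

    `spans` is a sorted list of non-overlapping (start, length) index pairs.
    Each masked span is replaced in the input by one sentinel token; the
    target lists each sentinel followed by the tokens it hid, ending with a
    final closing sentinel.
    """
    inp, tgt = [], []
    cursor = 0
    for i, (start, length) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp.extend(tokens[cursor:start])   # keep unmasked tokens
        inp.append(sentinel)               # one sentinel per masked span
        tgt.append(sentinel)
        tgt.extend(tokens[start:start + length])
        cursor = start + length
    inp.extend(tokens[cursor:])
    tgt.append(f"<extra_id_{len(spans)}>")  # closing sentinel ends the target
    return " ".join(inp), " ".join(tgt)


tokens = "le chat dort sur le canapé".split()
inp, tgt = span_corrupt(tokens, [(1, 1), (4, 2)])
print(inp)  # le <extra_id_0> dort sur <extra_id_1>
print(tgt)  # <extra_id_0> chat <extra_id_1> le canapé <extra_id_2>
```

In the original T5 recipe, span positions and lengths are sampled randomly (roughly 15% of tokens corrupted); fixed spans are used here only to keep the example deterministic.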