theta/gpt2-reporter



theta committed on Dec 14, 2022

Commit a51201b · 1 Parent(s): f9b411a

update model card README.md

Files changed (1)

README.md +15 -5

@@ -1,5 +1,4 @@
 ---
-license: mit
 tags:
 - generated_from_trainer
 model-index:
@@ -12,7 +11,9 @@ should probably proofread and complete it, then remove this comment. -->
 # gpt2-reporter
-This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 ## Model description
@@ -32,16 +33,25 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 32
-- eval_batch_size: 32
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
-- num_epochs: 3
 ### Training results
 ### Framework versions

 ---
 tags:
 - generated_from_trainer
 model-index:
 # gpt2-reporter
+This model is a fine-tuned version of [uer/gpt2-chinese-cluecorpussmall](https://huggingface.co/uer/gpt2-chinese-cluecorpussmall) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.4819
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
+- num_epochs: 2
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 2.7694        | 0.28  | 400  | 2.5751          |
+| 2.6336        | 0.56  | 800  | 2.5318          |
+| 2.5564        | 0.84  | 1200 | 2.5071          |
+| 2.482         | 1.12  | 1600 | 2.4993          |
+| 2.4243        | 1.4   | 2000 | 2.4910          |
+| 2.4009        | 1.68  | 2400 | 2.4850          |
+| 2.3865        | 1.96  | 2800 | 2.4819          |
 ### Framework versions
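
The validation losses in the training results table added by this commit are easier to compare once converted to perplexity. The sketch below is not part of the model card. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models trained with the `transformers` Trainer but is not stated in the card. The names `ROWS`, `perplexity`, and `steps_per_epoch` are illustrative only.

```python
import math

# (training_loss, epoch, step, validation_loss), copied from the results table.
ROWS = [
    (2.7694, 0.28, 400, 2.5751),
    (2.6336, 0.56, 800, 2.5318),
    (2.5564, 0.84, 1200, 2.5071),
    (2.482, 1.12, 1600, 2.4993),
    (2.4243, 1.40, 2000, 2.4910),
    (2.4009, 1.68, 2400, 2.4850),
    (2.3865, 1.96, 2800, 2.4819),
]


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss, assuming the loss is in nats."""
    return math.exp(loss)


def steps_per_epoch(rows) -> float:
    """Rough estimate from the last logged (epoch, step) pair.

    The epoch column is rounded to two decimals in the card, so this is
    only an approximation.
    """
    _, epoch, step, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    for _, epoch, step, val_loss in ROWS:
        print(f"step {step:>4}  epoch {epoch:.2f}  "
              f"val_loss {val_loss:.4f}  ppl {perplexity(val_loss):.2f}")
    print(f"approx. optimizer steps per epoch: {steps_per_epoch(ROWS):.0f}")
```

Because perplexity is per token, the values are only comparable between models that share a tokenizer. Multiplying the steps-per-epoch estimate by `train_batch_size: 8` gives a rough training set size, but only if training ran on a single device without gradient accumulation, which the card does not confirm.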