Update README.md
README.md
CHANGED
@@ -10,12 +10,12 @@ tags:
 ---
 # Dante-Zero Fine-tuned Model
 
-This model was fine-tuned using Reinforcement Learning with
+This model was fine-tuned using Reinforcement Learning with Group Relative Policy Optimization (GRPO) to generate Dante-style poetry in endecasillabi (11-syllable lines).
 
 ## Model Details
 
 - **Base Model:** PleIAs/Pleias-350m-Preview
-- **Training Method:** GRPO (
+- **Training Method:** GRPO (Group Relative Policy Optimization)
 - **Training Data:** 1,000 chunks from Dante's Divine Comedy
 - **Epochs:** 10
 - **Trained By:** ruggsea
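
For readers unfamiliar with the training method named in this change, the core idea of GRPO can be sketched as follows: several completions are sampled for the same prompt, and each one's reward is normalized against the mean and spread of its own group. This is only a minimal illustration, not the training code behind this model. The function name `group_relative_advantages`, the `eps` term, the use of the population standard deviation, and the example reward values are all assumptions; the README does not say how rewards were computed.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the group it was sampled in.

    `rewards` holds one scalar reward per completion, all for the same prompt.
    `eps` (an assumption) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumed choice for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions of one prompt,
# e.g. scores for how closely each line fits an 11-syllable meter.
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average get a positive advantage and are reinforced; those below average get a negative one. No separate value model is needed to supply the baseline.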