---
language:
- en
license: apache-2.0
base_model: google/flan-t5-small
tags:
- text-simplification
- paraphrase
- natural-language-processing
datasets:
- agentlans/sentence-paraphrases
---
# FLAN-T5 Small Simplifier
A fine-tuned text simplification and paraphrasing model based on Google's FLAN-T5 Small, designed to enhance text readability while preserving core semantic meaning.
## Model Details
- **Base Model**: [google/flan-t5-small](https://huggingface.co/google/flan-t5-small)
- **Task**: Text Simplification and Paraphrasing
- **Languages**: English
## Capabilities
The model specializes in:
- Reducing text complexity
- Generating more readable paraphrases
- Maintaining original semantic content
## Intended Use
**Primary Use Cases**:
- Academic writing simplification
- Technical document readability enhancement
- Content adaptation for diverse audiences
**Limitations**:
- Optimized for English-language text
- Performs best on sentence-length inputs (see the sketch after this list for handling longer passages)
- May struggle with highly specialized or mixed-language texts
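
For passages longer than a sentence, one option is to split the text and simplify each sentence separately. The `simplify_paragraph` helper below is a minimal sketch (not part of the model's API) that uses a naive regex splitter:
```python
import re

from transformers import pipeline

simplifier = pipeline(
    "text2text-generation", model="agentlans/flan-t5-small-simplifier"
)

def simplify_paragraph(text: str) -> str:
    # Naive split on sentence-final punctuation; a proper sentence
    # tokenizer (e.g. nltk's sent_tokenize) is preferable in practice.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    simplified = [
        simplifier(s, max_length=128)[0]["generated_text"] for s in sentences
    ]
    return " ".join(simplified)
```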
## Usage Example
```python
from transformers import pipeline
simplifier = pipeline(
    "text2text-generation", model="agentlans/flan-t5-small-simplifier"
)
complex_text = "While navigating the labyrinthine corridors of epistemological uncertainty, the precocious philosopher paused to contemplate the intricate interplay between subjective perception and objective reality."
simplified_text = simplifier(complex_text, max_length=128)[0]["generated_text"]
print(simplified_text)
# The precocious philosopher paused to contemplate the complex interplay between subjective perception and objective reality while navigating the labyrinthine corridors of epistemological uncertainty.
```
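
The pipeline wraps the standard Transformers sequence-to-sequence classes, so the model can also be loaded and called directly. The generation settings below (beam search, `max_length=128`) are illustrative choices rather than documented defaults for this model:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "agentlans/flan-t5-small-simplifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "The ramifications of the legislation remain a subject of considerable scholarly debate."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```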
## Training Details
**Dataset**: [agentlans/sentence-paraphrases](https://huggingface.co/datasets/agentlans/sentence-paraphrases)
- Source: Curated paraphrase collections
- Readability scored with a fine-tuned [DeBERTa v3 XSmall](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2-readability) model
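
The dataset can be inspected directly with the `datasets` library; the example below assumes the usual `train` split and prints the record structure rather than hard-coding column names:
```python
from datasets import load_dataset

dataset = load_dataset("agentlans/sentence-paraphrases")
print(dataset)              # shows the available splits and column names
print(dataset["train"][0])  # first record; field names come from the dataset card
```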
**Training Hyperparameters**:
- Learning Rate: 5e-05
- Batch Size: 8
- Optimizer: Adam
- Epochs: 2.0
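
The card does not state which training framework was used. Assuming the standard Hugging Face `Seq2SeqTrainer`, the listed hyperparameters map onto `Seq2SeqTrainingArguments` roughly as follows (the output directory and generation-based evaluation flag are illustrative):
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-simplifier",  # illustrative path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=2.0,
    predict_with_generate=True,  # assumption: evaluate with generation
)
```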
**Performance Metrics**:
| Epoch | Training Loss | Validation Loss |
|:-----:|:-------------:|:---------------:|
| 0.22 | 1.4423 | 1.2431 |
| 0.89 | 1.3595 | 1.1787 |
| 1.78 | 1.2952 | 1.1518 |
## Framework
- Transformers 4.43.3
- PyTorch 2.3.0+cu121
- Datasets 3.2.0
## Ethical Considerations
Users should review generated text for accuracy and appropriateness, as the model may inherit biases from training data.