---
language:
- en
license: apache-2.0
base_model: google/flan-t5-small
tags:
- text-simplification
- paraphrase
- natural-language-processing
datasets:
- agentlans/sentence-paraphrases
---
# FLAN-T5 Small Simplifier
A fine-tuned text simplification and paraphrasing model based on Google's FLAN-T5 Small, designed to enhance text readability while preserving core semantic meaning.
## Model Details
- **Base Model**: [google/flan-t5-small](https://huggingface.co/google/flan-t5-small)
- **Task**: Text Simplification and Paraphrasing
- **Languages**: English
## Capabilities
The model specializes in:
- Reducing text complexity
- Generating more readable paraphrases
- Maintaining original semantic content
## Intended Use
**Primary Use Cases**:
- Academic writing simplification
- Technical document readability enhancement
- Content adaptation for diverse audiences
**Limitations**:
- Optimized for English language texts
- Best performance on sentence-length inputs
- May struggle with highly specialized or mixed-language texts
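Because the model performs best on sentence-length inputs, longer passages can be split into sentences before simplification. A minimal sketch of such a pre-processing step is shown below; the helper name `split_sentences` and the regex heuristic are illustrative assumptions, not part of the model's API:

```python
import re


def split_sentences(text: str) -> list[str]:
    # Naive heuristic: split after ., !, or ? followed by whitespace.
    # Assumed helper for illustration; a proper sentence tokenizer
    # (e.g. from nltk or spaCy) would handle abbreviations better.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


paragraph = (
    "The committee deliberated at length. Ultimately, no consensus emerged! "
    "Should the vote be postponed?"
)
print(split_sentences(paragraph))
# ['The committee deliberated at length.', 'Ultimately, no consensus emerged!', 'Should the vote be postponed?']
```

Each resulting sentence can then be passed to the simplifier individually and the outputs rejoined.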
## Usage Example
```python
from transformers import pipeline

# Load the fine-tuned model as a text-to-text generation pipeline
simplifier = pipeline(
    "text2text-generation", model="agentlans/flan-t5-small-simplifier"
)

complex_text = "While navigating the labyrinthine corridors of epistemological uncertainty, the precocious philosopher paused to contemplate the intricate interplay between subjective perception and objective reality."

simplified_text = simplifier(complex_text, max_length=128)[0]["generated_text"]
print(simplified_text)
# The precocious philosopher paused to contemplate the complex interplay between subjective perception and objective reality while navigating the labyrinthine corridors of epistemological uncertainty.
```
## Training Details
**Dataset**: [agentlans/sentence-paraphrases](https://huggingface.co/datasets/agentlans/sentence-paraphrases)
- Source: curated paraphrase collections
- Readability was assessed with a fine-tuned [DeBERTa v3 XSmall](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2-readability) classifier
**Training Hyperparameters**:
- Learning Rate: 5e-05
- Batch Size: 8
- Optimizer: Adam
- Epochs: 2.0
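The hyperparameters above could be expressed as a `Seq2SeqTrainingArguments` configuration roughly like the following. This is a hedged sketch, not the original training script: the output directory and the evaluation/save strategies are assumptions added for illustration.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported hyperparameters. output_dir and
# evaluation_strategy are illustrative assumptions, not values
# recovered from the original run.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-simplifier",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=2.0,
    evaluation_strategy="epoch",
)
```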
**Performance Metrics**:
| Epoch | Training Loss | Validation Loss |
|:-----:|:-------------:|:---------------:|
| 0.22 | 1.4423 | 1.2431 |
| 0.89 | 1.3595 | 1.1787 |
| 1.78 | 1.2952 | 1.1518 |
## Framework
- Transformers 4.43.3
- PyTorch 2.3.0+cu121
- Datasets 3.2.0
## Ethical Considerations
Users should review generated text for accuracy and appropriateness, as the model may inherit biases from training data.