---
language:
- en
license: apache-2.0
base_model: google/flan-t5-small
tags:
- text-simplification
- paraphrase
- natural-language-processing
datasets:
- agentlans/sentence-paraphrases
---
# FLAN-T5 Small Simplifier
A fine-tuned text simplification and paraphrasing model based on Google's FLAN-T5 Small, designed to enhance text readability while preserving core semantic meaning.
## Model Details
- **Base Model**: [google/flan-t5-small](https://huggingface.co/google/flan-t5-small)
- **Task**: Text Simplification and Paraphrasing
- **Languages**: English
## Capabilities
The model specializes in:
- Reducing text complexity
- Generating more readable paraphrases
- Maintaining original semantic content
## Intended Use
**Primary Use Cases**:
- Academic writing simplification
- Technical document readability enhancement
- Content adaptation for diverse audiences
**Limitations**:
- Optimized for English-language text
- Performs best on sentence-length inputs (see the sketch after this list for handling longer passages)
- May struggle with highly specialized or mixed-language texts
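
For passages longer than a sentence, one option is to split the text and simplify each sentence separately. The `simplify_paragraph` helper below is a minimal sketch (not part of the model's API) that uses a naive regex splitter:
```python
import re

from transformers import pipeline

simplifier = pipeline(
    "text2text-generation", model="agentlans/flan-t5-small-simplifier"
)

def simplify_paragraph(text: str) -> str:
    # Naive split on sentence-final punctuation; a proper sentence
    # tokenizer (e.g. nltk's sent_tokenize) is preferable in practice.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    simplified = [
        simplifier(s, max_length=128)[0]["generated_text"] for s in sentences
    ]
    return " ".join(simplified)
```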
## Usage Example
```python
from transformers import pipeline
simplifier = pipeline(
    "text2text-generation", model="agentlans/flan-t5-small-simplifier"
)
complex_text = "While navigating the labyrinthine corridors of epistemological uncertainty, the precocious philosopher paused to contemplate the intricate interplay between subjective perception and objective reality."
simplified_text = simplifier(complex_text, max_length=128)[0]["generated_text"]
print(simplified_text)
# The precocious philosopher paused to contemplate the complex interplay between subjective perception and objective reality while navigating the labyrinthine corridors of epistemological uncertainty.
```
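
The pipeline wraps the standard Transformers sequence-to-sequence classes, so the model can also be loaded and called directly. The generation settings below (beam search, `max_length=128`) are illustrative choices rather than documented defaults for this model:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "agentlans/flan-t5-small-simplifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "The ramifications of the legislation remain a subject of considerable scholarly debate."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```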
## Training Details
**Dataset**: [agentlans/sentence-paraphrases](https://huggingface.co/datasets/agentlans/sentence-paraphrases)
- Source: Curated paraphrase collections
- Readability scored with a fine-tuned [DeBERTa v3 XSmall](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2-readability) model
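
The dataset can be inspected directly with the `datasets` library; the example below assumes the usual `train` split and prints the record structure rather than hard-coding column names:
```python
from datasets import load_dataset

dataset = load_dataset("agentlans/sentence-paraphrases")
print(dataset)              # shows the available splits and column names
print(dataset["train"][0])  # first record; field names come from the dataset card
```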
**Training Hyperparameters**:
- Learning Rate: 5e-05
- Batch Size: 8
- Optimizer: Adam
- Epochs: 2.0
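
The card does not state which training framework was used. Assuming the standard Hugging Face `Seq2SeqTrainer`, the listed hyperparameters map onto `Seq2SeqTrainingArguments` roughly as follows (the output directory and generation-based evaluation flag are illustrative):
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-simplifier",  # illustrative path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=2.0,
    predict_with_generate=True,  # assumption: evaluate with generation
)
```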
**Performance Metrics**:
| Epoch | Training Loss | Validation Loss |
|:-----:|:-------------:|:---------------:|
| 0.22 | 1.4423 | 1.2431 |
| 0.89 | 1.3595 | 1.1787 |
| 1.78 | 1.2952 | 1.1518 |
## Framework
- Transformers 4.43.3
- PyTorch 2.3.0+cu121
- Datasets 3.2.0
## Ethical Considerations
Users should review generated text for accuracy and appropriateness, as the model may inherit biases from training data.