T5-Small for News Headline Generation
This is a T5-Small model fine-tuned for generating concise and informative news headlines from article summaries. It is intended for news agencies, content creators, and media professionals who need to draft headlines quickly.
Model Details
Model Type: Sequence-to-Sequence Transformer
Base Model: t5-small
Maximum Sequence Length: 128 tokens (input and output)
Output: News headlines based on input summaries
Task: Text Summarization (Headline Generation)
Model Sources
Documentation: T5 Model Documentation
Repository: Hugging Face Model Hub
Hugging Face Model: Available on Hugging Face
Full Model Architecture
T5ForConditionalGeneration(
  (shared): Embedding(32128, 512)
  (encoder): T5Stack(
    (embed_tokens): Embedding(32128, 512)
    (block): ModuleList(...)
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1)
  )
  (decoder): T5Stack(
    (embed_tokens): Embedding(32128, 512)
    (block): ModuleList(...)
    (final_layer_norm): T5LayerNorm()
    (dropout): Dropout(p=0.1)
  )
  (lm_head): Linear(in_features=512, out_features=32128, bias=False)
)
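As a rough sanity check on the printout above, the size of the embedding table follows directly from its shape. The sketch below uses only the two dimensions shown (vocabulary size 32128, hidden size 512). The comment about `lm_head` sharing weights with `shared` reflects the usual t5-small configuration and is an assumption for this particular checkpoint.

```python
# Dimensions taken from the architecture printout above.
VOCAB_SIZE = 32128
D_MODEL = 512

# (shared): Embedding(32128, 512) stores one 512-dim vector per vocabulary entry.
embedding_params = VOCAB_SIZE * D_MODEL

# (lm_head): Linear(512 -> 32128, bias=False) has the same number of weights.
# Assumption: as in the stock t5-small config, these weights are tied to the
# shared embedding, so they would not add to the total parameter count.
lm_head_params = D_MODEL * VOCAB_SIZE

print(f"Embedding parameters: {embedding_params:,}")
print(f"lm_head parameters:   {lm_head_params:,}")
```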
Installation and Setup
pip install -U transformers torch datasets sentencepiece
Load the Model and Run Inference
from transformers import T5ForConditionalGeneration, T5Tokenizer
import torch
# Replace with the Hub ID or local path of the fine-tuned model
model_name = "your_fine_tuned_model_id"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# Inference
news_summary = "Ministry of Education has announced a major reform in the national curriculum to enhance digital literacy among students."
inputs = tokenizer(news_summary, max_length=128, truncation=True, padding="max_length", return_tensors="pt").to(device)
outputs = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_length=20,
    num_beams=5,
    early_stopping=True
)
headline = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Generated Headline: {headline}")
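To generate headlines for several summaries at once, the same calls can be wrapped in a small helper. This is a minimal sketch, not part of the original model card: the function name `generate_headlines` and its parameter names are hypothetical, and it assumes `model` and `tokenizer` are objects loaded as in the snippet above.

```python
def generate_headlines(summaries, model, tokenizer, device="cpu",
                       max_input_len=128, max_headline_len=20, num_beams=5):
    """Return one generated headline per input summary (hypothetical helper)."""
    if not summaries:
        return []

    # Tokenize the whole batch; padding=True pads to the longest item in the batch.
    inputs = tokenizer(
        summaries,
        max_length=max_input_len,
        truncation=True,
        padding=True,
        return_tensors="pt",
    ).to(device)

    # Beam search with the same settings as the single-example snippet.
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_headline_len,
        num_beams=num_beams,
        early_stopping=True,
    )

    # Decode every sequence in the batch, dropping pad/eos tokens.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the `model`, `tokenizer`, and `device` defined above, a call such as `generate_headlines([news_summary], model, tokenizer, device)` would return a list containing one headline string.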
Training Details
Training Dataset
Dataset Name: News Headlines Dataset
Size: 30,000 rows
Columns: article_summary (input), headline (output)
Approximate Statistics
article_summary:
  Type: string
  Min length: ~20 tokens
  Mean length: ~50-60 tokens (estimated)
  Max length: ~128 tokens
headline:
  Type: string
  Min length: ~5 tokens
  Mean length: ~10-15 tokens
  Max length: ~20 tokens
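Given the two columns and the 128-token limit listed above, a preprocessing step for fine-tuning could look roughly like the following. This is a sketch under assumptions, not the author's actual training code: the function name `preprocess` is hypothetical, the 128-token target limit follows the "input and output" note under Model Details, and `text_target` is the keyword recent versions of `transformers` tokenizers accept for encoding labels.

```python
def preprocess(batch, tokenizer, max_input_len=128, max_target_len=128):
    """Tokenize a batch of rows with 'article_summary' and 'headline' columns."""
    # Encode the article summaries as model inputs, truncating long ones.
    model_inputs = tokenizer(
        batch["article_summary"],
        max_length=max_input_len,
        truncation=True,
    )

    # Encode the reference headlines as targets.
    labels = tokenizer(
        text_target=batch["headline"],
        max_length=max_target_len,
        truncation=True,
    )

    # Seq2seq models in transformers read the target token IDs from "labels".
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

With a `datasets.Dataset` holding these columns, this could be applied as `dataset.map(lambda b: preprocess(b, tokenizer), batched=True)`.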
Training Hyperparameters
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- gradient_accumulation_steps: 2
- num_train_epochs: 4
- learning_rate: 5e-5
- fp16: True
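The hyperparameters above imply an effective batch size and a rough number of optimizer steps. The sketch below works these out with plain arithmetic. It assumes a single device and that all 30,000 rows are used for training; neither is stated in this card, so treat the step counts as estimates. The dictionary keys match keyword arguments of `Seq2SeqTrainingArguments` in `transformers`.

```python
import math

# Hyperparameters as listed above.
hyperparams = {
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 4,
    "learning_rate": 5e-5,
    "fp16": True,
}

# Assumptions (not stated in the model card).
NUM_DEVICES = 1
NUM_TRAIN_ROWS = 30_000

# Each optimizer step sees batch_size * accumulation_steps * devices examples.
effective_batch_size = (
    hyperparams["per_device_train_batch_size"]
    * hyperparams["gradient_accumulation_steps"]
    * NUM_DEVICES
)

steps_per_epoch = math.ceil(NUM_TRAIN_ROWS / effective_batch_size)
total_steps = steps_per_epoch * hyperparams["num_train_epochs"]

print(f"Effective batch size: {effective_batch_size}")
print(f"Optimizer steps per epoch: {steps_per_epoch}")
print(f"Total optimizer steps: {total_steps}")
```

If part of the dataset was held out for evaluation, the actual step counts would be correspondingly lower.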
This model is fine-tuned specifically for news headline generation, with the goal of producing concise, accurate, and informative headlines. 🚀