Model Description

Model - fulltrain-xsum-bart

  • Architecture - BART (Bidirectional and Auto-Regressive Transformers)
  • Task - Abstractive Summarization
  • Dataset - XSum (Extreme Summarization)
  • Training Hardware - 2x NVIDIA T4 GPUs (on Kaggle)
  • Training Time - ~9 hours

This model is fine-tuned on the XSum dataset for abstractive summarization. It takes a long document as input and generates a concise summary.

Dataset Details

  • Train Dataset - 204,045 samples
  • Validation Dataset - 11,332 samples
  • Test Dataset - 11,334 samples

The XSum dataset consists of BBC articles and their corresponding single-sentence summaries. The model was trained to generate summaries that are concise and capture the essence of the input document.
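
For reference, the splits above can be loaded with the Hugging Face datasets library. The dataset ID and the note about trust_remote_code below are assumptions about the current Hub setup, not details taken from this card:

from datasets import load_dataset

# Load the XSum splits (train/validation/test) from the Hugging Face Hub.
# Depending on your datasets version, the script-based XSum loader may
# require trust_remote_code=True.
dataset = load_dataset("EdinburghNLP/xsum")

print(dataset)                                  # DatasetDict with train/validation/test
print(dataset["train"][0]["document"][:200])    # source BBC article
print(dataset["train"][0]["summary"])           # single-sentence reference summary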

Training Details

  • Training Epochs - 1
  • Batch Size - 8 (per device)
  • Learning Rate - 5e-5
  • Weight Decay - 0.01
  • Warmup Steps - 500
  • FP16 Training - Enabled
  • Evaluation Strategy - Per epoch
  • Best Model Selection - Based on validation loss (eval_loss)
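
For illustration, a minimal preprocessing sketch for this kind of fine-tuning run is shown below. The facebook/bart-large base checkpoint and the truncation lengths are assumptions rather than values stated in this card:

from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("EdinburghNLP/xsum")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")

def preprocess(batch):
    # Tokenize the source articles (truncation length is an assumption).
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    # Tokenize the single-sentence reference summaries as labels.
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=["document", "summary", "id"])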

Evaluation Metrics

The model was evaluated using the following metrics:

  • Training Loss - 0.3771
  • Validation Loss - 0.350379
  • ROUGE-1 - 0.401344019
  • ROUGE-2 - 0.188076798
  • ROUGE-L - 0.33460693

The ROUGE scores were computed using the rouge_scorer module from the rouge-score library.
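
A small sketch of that scoring setup, assuming the rouge-score package's RougeScorer with stemming enabled (the reference and prediction strings here are purely illustrative):

from rouge_score import rouge_scorer

# Compare a generated summary against a reference summary.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "Officials have warned residents to stay away from a brown bear seen in local woods."
prediction = "Authorities warn residents after a large brown bear was spotted in the woods."

scores = scorer.score(reference, prediction)
for name, result in scores.items():
    print(name, round(result.fmeasure, 4))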

Training Arguments

The model was trained using the following Hugging Face Seq2SeqTrainingArguments:

  • Save Strategy - Per epoch
  • Logging Steps - 1000
  • Dataloader Workers - 4
  • Predict with Generate - True
  • Load Best Model at End - True
  • Metric for Best Model - eval_loss
  • Greater is Better - False (lower validation loss is better)
  • Report To - Weights & Biases (WandB)
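
Taken together with the training details above, these settings map onto Seq2SeqTrainingArguments roughly as sketched below. output_dir, the evaluation batch size, and the exact argument names for your transformers version are assumptions:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="fulltrain-xsum-bart",     # placeholder, not taken from this card
    num_train_epochs=1,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,         # assumption; the card lists a single batch size
    learning_rate=5e-5,
    weight_decay=0.01,
    warmup_steps=500,
    fp16=True,
    eval_strategy="epoch",                # named evaluation_strategy in older transformers releases
    save_strategy="epoch",
    logging_steps=1000,
    dataloader_num_workers=4,
    predict_with_generate=True,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    report_to="wandb",
)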

Other Considerations

  • The model was fine-tuned on the XSum dataset, which consists of BBC articles, so its performance may vary on other domains or types of text, and it may inherit biases present in that data.
  • The model generates summaries based on patterns learned during training. It may occasionally produce inaccurate or misleading summaries, especially for complex or ambiguous input text.
  • The model may struggle with highly technical or domain-specific content, as it was not explicitly trained on such data.
  • The model generates summaries in English only.

Usage

Below is an example of how to load and use the model:

from transformers import pipeline

# Load the fine-tuned summarization model
summarizer = pipeline("summarization", model="bhargavis/fulltrain-xsum-bart")

# Provide input text
input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

# Generate summary
summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
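
Equivalently, the checkpoint can be used through the tokenizer and model classes directly. This sketch reuses input_text from the example above; the beam-search settings are illustrative assumptions rather than recommended values:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("bhargavis/fulltrain-xsum-bart")
model = AutoModelForSeq2SeqLM.from_pretrained("bhargavis/fulltrain-xsum-bart")

inputs = tokenizer(input_text, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    max_length=64,
    min_length=30,
    num_beams=4,          # assumption; not a documented generation setting for this model
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))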