# Abstractive Summarization with Hugging Face Transformers


**Description:** Training T5 using Hugging Face Transformers for Abstractive Summarization.

## Introduction

Automatic summarization is one of the central problems in
Natural Language Processing (NLP). It poses several challenges relating to language
understanding (e.g. identifying important content)
and generation (e.g. aggregating and rewording the identified content into a summary).

In this tutorial, we tackle the single-document summarization task
with an abstractive modeling approach. The primary idea here is to generate a short,
single-sentence news summary answering the question “What is the news article about?”.
This approach to summarization is also known as *Abstractive Summarization* and has
seen growing interest among researchers in various disciplines.

Following prior work, we aim to tackle this problem using a
sequence-to-sequence model. [Text-to-Text Transfer Transformer (`T5`)](https://arxiv.org/abs/1910.10683)
is a [Transformer-based](https://arxiv.org/abs/1706.03762) model built on the encoder-decoder
architecture, pretrained on a multi-task mixture of unsupervised and supervised tasks where each task
is converted into a text-to-text format. T5 shows impressive results on a variety of sequence-to-sequence
tasks (in this notebook, a sequence is a piece of text), such as summarization and translation.
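
To make the text-to-text idea concrete, the sketch below casts one summarization example into an input string and a target string. The `"summarize: "` prefix follows the convention from the T5 paper; the helper name `to_text_to_text`, the field names `document` and `summary`, and the example sentences are assumptions made up for illustration.

```python
# Hypothetical helper: cast a summarization example into T5's text-to-text form.
# The field names "document" and "summary" are assumed, and the sample text is invented.
def to_text_to_text(example, prefix="summarize: "):
    return {
        "input_text": prefix + example["document"],
        "target_text": example["summary"],
    }


example = {
    "document": "Heavy rain flooded several streets in the city centre on Monday.",
    "summary": "Rain caused flooding in the city centre.",
}
pair = to_text_to_text(example)
print(pair["input_text"])
print(pair["target_text"])
```

Both the input and the target are plain strings, which is what lets a single model handle summarization, translation, and other tasks by changing only the prefix.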

In this notebook, we will fine-tune the pretrained T5 on the Abstractive Summarization
task using Hugging Face Transformers on the `XSum` dataset loaded from Hugging Face Datasets.

## Setup

### Installing the requirements

In [1]:
!pip install transformers==4.20.0
!pip install keras_nlp==0.3.0
!pip install datasets
!pip install huggingface-hub
!pip install nltk
!pip install rouge-score


### Importing the necessary libraries

In [2]:
import os
import logging

import nltk
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Only log error messages
tf.get_logger().setLevel(logging.ERROR)

os.environ["TOKENIZERS_PARALLELISM"] = "false"

### Define certain variables

In [3]:
# Fraction of the dataset to use for each of the train and test splits
TRAIN_TEST_SPLIT = 0.1

MAX_INPUT_LENGTH = 1024 # Maximum length of the input to the model
MIN_TARGET_LENGTH = 5 # Minimum length of the output by the model
MAX_TARGET_LENGTH = 128 # Maximum length of the output by the model
BATCH_SIZE = 8 # Batch-size for training our model
LEARNING_RATE = 2e-5 # Learning-rate for training our model
MAX_EPOCHS = 1 # Maximum number of epochs we will train the model for

# This notebook is built on the t5-small checkpoint from the Hugging Face Model Hub
MODEL_CHECKPOINT = "t5-small"
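
The length limits above are typically enforced when the text is tokenized. The following is a minimal sketch of that step, not the notebook's own preprocessing code: `WhitespaceTokenizer` is a stand-in invented here so the example runs without downloading anything, and `preprocess_batch`, the `"summarize: "` prefix, and the `document` / `summary` field names are assumptions. With a real Hugging Face tokenizer, the same `max_length` and `truncation` keyword arguments would be passed.

```python
MAX_INPUT_LENGTH = 1024  # same values as the constants defined above
MAX_TARGET_LENGTH = 128


class WhitespaceTokenizer:
    """Stand-in for a real tokenizer (assumption): splits on whitespace and
    mimics the `max_length` / `truncation` keyword arguments."""

    def __call__(self, texts, max_length=None, truncation=False):
        ids = []
        for text in texts:
            tokens = text.split()
            if truncation and max_length is not None:
                tokens = tokens[:max_length]
            # Fake ids: token positions. A real tokenizer returns vocabulary ids.
            ids.append(list(range(len(tokens))))
        return {"input_ids": ids}


def preprocess_batch(examples, tokenizer, prefix="summarize: "):
    # Prepend the task prefix and truncate documents to MAX_INPUT_LENGTH tokens.
    inputs = [prefix + doc for doc in examples["document"]]
    model_inputs = tokenizer(inputs, max_length=MAX_INPUT_LENGTH, truncation=True)
    # Truncate reference summaries to MAX_TARGET_LENGTH tokens and store them as labels.
    labels = tokenizer(examples["summary"], max_length=MAX_TARGET_LENGTH, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


batch = {"document": ["word " * 2000], "summary": ["short summary " * 100]}
encoded = preprocess_batch(batch, WhitespaceTokenizer())
print(len(encoded["input_ids"][0]), len(encoded["labels"][0]))  # prints: 1024 128
```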

## Load the dataset
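
The loading code itself is not shown in this section. As a rough sketch only, the `XSum` data could be fetched with Hugging Face Datasets and subsampled as below. The function name `load_xsum_subset` and the dataset identifier `"xsum"` are assumptions (the identifier may differ between library versions), and the split logic is one plausible reading of `TRAIN_TEST_SPLIT`.

```python
TRAIN_TEST_SPLIT = 0.1  # same value as the constant defined earlier


def load_xsum_subset(fraction=TRAIN_TEST_SPLIT):
    # Imported lazily so that defining the function does not require the library.
    from datasets import load_dataset

    # "xsum" is the dataset identifier assumed here; downloading needs network access.
    raw = load_dataset("xsum", split="train")
    # Keep `fraction` of the examples for training and another `fraction` for testing.
    return raw.train_test_split(train_size=fraction, test_size=fraction)


# Usage (downloads the data on first call):
# raw_datasets = load_xsum_subset()
# print(raw_datasets["train"][0]["document"])
```

Working on a small fraction keeps a first run short; the full training set can be used once the pipeline works end to end.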


In [4]:
from transformers.keras_callbacks import KerasMetricCallback

metric_callback = KerasMetricCallback(
    metric_fn, eval_dataset=generation_dataset, predict_with_generate=True
)

callbacks = [metric_callback]

# For now we will use our test set as our validation_data
model.fit(
    train_dataset, validation_data=test_dataset, epochs=MAX_EPOCHS, callbacks=callbacks
)


For best results, we recommend training the model for at least 5 epochs on the entire
training dataset.
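
The `metric_fn` passed to `KerasMetricCallback` is not shown in this section. Summaries are commonly scored with ROUGE, and the setup cell installs `rouge-score`, so the following pure-Python sketch illustrates what a ROUGE-L F1 score measures. It is a simplified illustration with whitespace tokenization, not the notebook's metric and not a replacement for a tested library implementation; the function names are invented here.

```python
def lcs_length(a, b):
    # Dynamic-programming longest common subsequence, keeping one row at a time.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, prediction):
    # Simplified tokenization (assumption): lowercase and split on whitespace.
    ref_tokens = reference.lower().split()
    pred_tokens = prediction.lower().split()
    if not ref_tokens or not pred_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, pred_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)  # share of predicted tokens in the LCS
    recall = lcs / len(ref_tokens)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


print(rouge_l_f1("the cat sat on the mat", "the cat sat on the mat"))  # prints: 1.0
print(round(rouge_l_f1("the cat sat on the mat", "the cat lay on a mat"), 3))  # prints: 0.667
```

Because the score rewards the longest in-order overlap rather than exact matches, a summary that rewords part of the reference still receives partial credit.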

## Inference


In [18]:
from transformers import pipeline

summarizer = pipeline("summarization", model=model, tokenizer=tokenizer, framework="tf")

summarizer(
    raw_datasets["test"][0]["document"],
    min_length=MIN_TARGET_LENGTH,
    max_length=MAX_TARGET_LENGTH,
)

[{'summary_text': 'Scotland have lost their last nine games in the Six Nations competition in Rome, according to the Scottish forward.'}]

Now you can push this model to the Hugging Face Model Hub and share it with all your friends,
family, and favorite pets: they can all load it with the identifier
`"your-username/the-name-you-picked"`, for instance:

```python
model.push_to_hub("transformers-qa", organization="keras-io")
tokenizer.push_to_hub("transformers-qa", organization="keras-io")
```
After you push your model, this is how you can load it in the future:

```python
from transformers import TFAutoModelForSeq2SeqLM

model = TFAutoModelForSeq2SeqLM.from_pretrained("your-username/my-awesome-model")
```
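
Note that `from_pretrained` expects a repository identifier of the form `username/model-name`, not a browser URL copied from the Hub. The small helper below is hypothetical (it is not part of Transformers) and sketches one way to reduce such a URL to an identifier, assuming the first two path segments are the user name and the model name.

```python
from urllib.parse import urlparse


def hub_repo_id(url_or_id):
    """Hypothetical helper: reduce a huggingface.co model URL to 'username/model-name'."""
    if "://" not in url_or_id:
        return url_or_id  # already looks like a repo id
    # Drop empty path segments, then keep the user name and the model name.
    parts = [p for p in urlparse(url_or_id).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"Cannot find 'username/model-name' in {url_or_id!r}")
    return "/".join(parts[:2])


print(hub_repo_id("https://huggingface.co/your-username/my-awesome-model/tree/main"))
# prints: your-username/my-awesome-model
```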

In [22]:
model.push_to_hub("transformers-qa", organization="keras-io")
tokenizer.push_to_hub("transformers-qa", organization="keras-io")
from transformers import TFAutoModelForSeq2SeqLM

# from_pretrained expects a Hub repo id ("username/model-name"), not a browser URL
model = TFAutoModelForSeq2SeqLM.from_pretrained(
    "kunaltilaganji/Abstractive-Summarization-with-Transformers"
)

ValueError: ignored

In [23]:
# Shell commands need a leading "!" in a notebook cell; log in before pushing to the Hub
!huggingface-cli login