|
--- |
|
library_name: transformers |
|
tags: |
|
- lecture |
|
- college |
|
- university |
|
- summarization |
|
license: mit |
|
language: |
|
- en |
|
metrics: |
|
- rouge |
|
pipeline_tag: summarization |
|
--- |
|
|
|
# Model Card for Academ-0.5
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
Academ is a fine-tuned BART model for summarizing academic lectures. |
|
|
|
To find out how the model was fine-tuned, you can check the notebook on Kaggle: https://www.kaggle.com/code/yousefr/college-lectures-summarization-bart-unsupervised/ |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
<!-- Provide a longer summary of what this model is. --> |
|
|
|
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
|
|
|
- **Developed by:** Yousef Gamaleldin |
|
- **Model type:** Summarization |
|
- **Language(s) (NLP):** English |
|
- **License:** MIT |
|
- **Finetuned from model:** facebook/bart-large-cnn (BART Large CNN)
|
|
|
## How to Get Started with the Model |
|
|
|
Use the code below to get started with the model. |
|
|
|
```python
import torch
from transformers import BartForConditionalGeneration, AutoTokenizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = BartForConditionalGeneration.from_pretrained('yousefg/Academ-0.5').to(device)
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-cnn')


def get_summary(input_ids, attention_mask, context_length):
    # Summarize the transcript chunk by chunk, then decode the concatenated output.
    summary_ids = []
    for i in range(0, input_ids.shape[1], context_length):
        input_slice = input_ids[:, i:i + context_length]
        attention_mask_slice = attention_mask[:, i:i + context_length]

        summary = model.generate(
            input_slice,
            attention_mask=attention_mask_slice,
            max_new_tokens=1654,
            min_new_tokens=250,
            do_sample=True,
            renormalize_logits=True,
        )
        summary_ids.extend(summary[0].tolist())

    return tokenizer.decode(summary_ids, skip_special_tokens=True)


texts = "..."  # replace with the lecture transcript you want to summarize

batch = tokenizer(texts, truncation=False)

input_ids = torch.tensor(batch['input_ids']).unsqueeze(0).to(device)
attention_mask = torch.tensor(batch['attention_mask']).unsqueeze(0).to(device)

summary = get_summary(input_ids, attention_mask, 1654)
print(summary)
```
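
The helper above splits transcripts longer than the model's context window into chunks of `context_length` tokens, summarizes each chunk independently, and concatenates the decoded chunk summaries into a single output.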
|
|
|
## Training Details |
|
|
|
Training used a custom loss function that steers the model toward an optimal summary length, chosen as 35% of the input length.
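
The exact objective is in the Kaggle notebook linked above; the snippet below is only a minimal sketch of what such a length-aware loss could look like, with the function name, `penalty_weight`, and the quadratic penalty being illustrative assumptions rather than the notebook's code.

```python
import torch

def length_penalized_loss(lm_loss, summary_lengths, input_lengths,
                          target_ratio=0.35, penalty_weight=1.0):
    """Illustrative sketch: cross-entropy plus a penalty for drifting
    away from the target summary/input length ratio (35% here)."""
    ratios = summary_lengths.float() / input_lengths.float()
    length_penalty = ((ratios - target_ratio) ** 2).mean()
    return lm_loss + penalty_weight * length_penalty
```

Here `lm_loss` stands for the usual token-level cross-entropy returned by the model, and the penalty term pulls generated summaries toward roughly 35% of the input length.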
|
|
|
#### Training Hyperparameters |
|
|
|
- **Training regime:** bf16 non-mixed precision<!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision --> |
|
- **Learning Rate:** 0.001 |
|
- **Weight Decay:** 0.01 |
|
- **Epochs:** 4 |
|
- **Batch Size:** 16 |
|
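For reference, the hyperparameters above roughly correspond to a 🤗 `Seq2SeqTrainingArguments` configuration like the sketch below; the output directory and any argument not listed above are assumptions, and the bf16 handling may differ from the actual training setup.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="academ-0.5",        # assumed name
    learning_rate=1e-3,
    weight_decay=0.01,
    num_train_epochs=4,
    per_device_train_batch_size=16,
    bf16=True,                      # the card lists bf16 non-mixed precision
)
```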
|
## Evaluation |
|
|
|
<!-- This section describes the evaluation protocols and provides the results. --> |
|
The evaluation is based on ROUGE-1, modified to discount padding tokens.
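
As an illustration, a padding-aware ROUGE-1 F1 can be computed along the lines of the sketch below; this is not the exact evaluation code, and the function name and token-level comparison are assumptions.

```python
from collections import Counter

def rouge1_f1(pred_ids, ref_ids, pad_token_id):
    """ROUGE-1 F1 over unigrams, with padding tokens discounted."""
    pred = [t for t in pred_ids if t != pad_token_id]
    ref = [t for t in ref_ids if t != pad_token_id]
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```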
|
|
|
#### Testing Data |
|
|
|
The model's test dataset had 289 lectures, mainly from MIT OpenCourseWare. |
|
<!-- This should link to a Dataset Card if possible. --> |
|
|
|
### Results |
|
|
|
The model achieved a ROUGE-1 score of 96% on the test dataset and 93% on the evaluation dataset.
|
|
|
#### Summary |
|
Academ is a summarization model trained on 2307 lectures, mainly from MIT OpenCourseWare. The model has a maximum sequence length of 1654 tokens, an increase of 630 tokens over the original model.