---
library_name: transformers
tags:
- lecture
- college
- university
- summarization
license: mit
language:
- en
metrics:
- rouge
pipeline_tag: summarization
---
# Model Card for Academ
<!-- Provide a quick summary of what the model is/does. -->
Academ is a fine-tuned BART model for summarizing academic lectures.
To find out how the model was fine-tuned, you can check the notebook on Kaggle: https://www.kaggle.com/code/yousefr/college-lectures-summarization-bart-unsupervised/
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** Yousef Gamaleldin
- **Model type:** Summarization
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** facebook/bart-large-cnn (BART Large CNN)
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import BartForConditionalGeneration, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
model = BartForConditionalGeneration.from_pretrained('yousefg/Academ-0.5').to(device)
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-cnn')

def get_summary(input_ids, attention_mask, context_length):
    # Summarize the lecture in windows of `context_length` tokens and
    # concatenate the generated summaries.
    summaries = []
    for i in range(0, input_ids.shape[1], context_length):
        input_slice = input_ids[:, i:i + context_length]
        attention_mask_slice = attention_mask[:, i:i + context_length]
        summary = model.generate(input_slice,
                                 attention_mask=attention_mask_slice,
                                 max_new_tokens=1654,
                                 min_new_tokens=250,
                                 do_sample=True,
                                 renormalize_logits=True)
        summaries.extend(summary[0].tolist())
    return tokenizer.decode(summaries, skip_special_tokens=True)

texts = "..."  # make sure to get the transcript from the lecture
batch = tokenizer(texts, truncation=False)
input_ids = torch.tensor(batch['input_ids']).unsqueeze(0).to(device)
attention_mask = torch.tensor(batch['attention_mask']).unsqueeze(0).to(device)
summary = get_summary(input_ids, attention_mask, 1654)
print(summary)
```
## Training Details
Training used a custom loss function that steers the model toward an optimal summary length (35% was chosen as the optimal length).
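As a rough illustration of this idea, the sketch below adds a length penalty to the standard cross-entropy loss, pulling the generated summary length toward a target of 35% of the input length (interpreting the 35% figure as a length ratio, which is an assumption). The penalty form, the `alpha` weight, and the function name are illustrative, not the notebook's actual implementation.

```python
import torch
import torch.nn.functional as F

def length_aware_loss(logits, labels, input_lengths, pad_token_id, alpha=1.0):
    # Token-level cross-entropy, ignoring padding positions.
    ce = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=pad_token_id,
    )
    # Length term: relative deviation of the summary length from the
    # target of 35% of the input length. The exact form and the weight
    # `alpha` are assumptions, not the author's formulation.
    summary_lengths = (labels != pad_token_id).sum(dim=-1).float()
    target_lengths = 0.35 * input_lengths.float()
    length_penalty = ((summary_lengths - target_lengths) / target_lengths).abs().mean()
    return ce + alpha * length_penalty
```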
#### Training Hyperparameters
- **Training regime:** bf16 non-mixed precision
- **Learning Rate:** 0.001
- **Weight Decay:** 0.01
- **Epochs:** 4
- **Batch Size:** 16
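For reference, the hyperparameters listed above map onto a standard 🤗 `Seq2SeqTrainingArguments` setup roughly as follows. The output directory is a placeholder, and non-mixed bf16 is interpreted here as casting the whole model to bfloat16; treat this as a sketch rather than the exact training script.

```python
import torch
from transformers import BartForConditionalGeneration, Seq2SeqTrainingArguments

# Non-mixed bf16: cast the model itself to bfloat16 instead of using the
# Trainer's mixed-precision flag (an interpretation of the regime above).
model = BartForConditionalGeneration.from_pretrained(
    "facebook/bart-large-cnn", torch_dtype=torch.bfloat16
)

training_args = Seq2SeqTrainingArguments(
    output_dir="academ-0.5",         # placeholder path
    learning_rate=1e-3,              # Learning Rate: 0.001
    weight_decay=0.01,               # Weight Decay: 0.01
    num_train_epochs=4,              # Epochs: 4
    per_device_train_batch_size=16,  # Batch Size: 16
)
```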
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
The evaluation is based on ROUGE-1, modified to discount padding tokens.
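A minimal sketch of computing ROUGE-1 while discounting padding tokens is shown below. The function name is hypothetical, and the recall-style formulation is an assumption, since the card does not state which ROUGE-1 variant (precision, recall, or F1) was reported.

```python
from collections import Counter

def rouge1_ignore_padding(pred_ids, ref_ids, pad_token_id):
    # Drop padding tokens before counting unigram overlap.
    pred = [t for t in pred_ids if t != pad_token_id]
    ref = [t for t in ref_ids if t != pad_token_id]
    overlap = Counter(pred) & Counter(ref)
    # Recall-style ROUGE-1: overlapping unigrams over reference unigrams.
    return sum(overlap.values()) / max(len(ref), 1)
```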
#### Testing Data
The model's test dataset had 289 lectures, mainly from MIT OpenCourseWare.
<!-- This should link to a Dataset Card if possible. -->
### Results
The model achieved a ROUGE-1 score of 96% on the test dataset and 93% on the evaluation dataset.
#### Summary
Academ is a summarization model trained on 2307 lectures, mainly from MIT OpenCourseWare. The model has a maximum sequence length of 1654 tokens, an increase of 630 tokens over the original model.
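One way to obtain the larger 1654-token context is to enlarge BART's learned position embeddings and let the new positions be learned during fine-tuning. The sketch below illustrates that idea; it is an assumption about how the extension could be done, not the author's confirmed procedure.

```python
import torch
from transformers import BartForConditionalGeneration

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
new_max_len = 1654
d_model = model.config.d_model

for module in (model.model.encoder, model.model.decoder):
    old_emb = module.embed_positions                 # learned positions (+2 offset rows)
    new_weight = torch.zeros(new_max_len + 2, d_model)
    new_weight[: old_emb.weight.size(0)] = old_emb.weight.data  # keep pretrained rows
    old_emb.weight = torch.nn.Parameter(new_weight)  # new rows start at zero and must be trained
    old_emb.num_embeddings = new_max_len + 2
model.config.max_position_embeddings = new_max_len
```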