---
license: mit
language:
- ar
metrics:
- accuracy
pipeline_tag: summarization
library_name: PyTorch
tags:
- PyTorch
- Arabic
- Abstractive-Summarization
- 174M
- Scratch
- Base
---
# Arab Bart
An implementation of the [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461) paper from scratch in `PyTorch`, applied to an abstractive summarization task in Arabic.

>[!IMPORTANT]
> Model inference is not ready yet: you can't load the model directly from the `Transformers` library.
>
> An inference API and integration with the `Transformers` library will be added as soon as possible.
>
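In the meantime, a raw `PyTorch` checkpoint can be inspected with plain `torch.load`. This is a generic sketch, not this repository's documented usage; the file name below is hypothetical, so check the repository's file list for the actual one.

```python
import torch

# Hypothetical file name; replace with the actual checkpoint file in this repo.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# Peek at the tensors to see the layer layout.
print(f"{len(state_dict)} tensors in checkpoint")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```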

## Goal

Reproduce the BART model from scratch to understand its architecture in depth, using the minimum available resources.
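For orientation, here is a minimal, illustrative sketch of what a BART-style encoder-decoder looks like in `PyTorch`. The class name and hyperparameters are placeholders and do not reflect this checkpoint's actual configuration or training code.

```python
import torch
import torch.nn as nn

class MiniBart(nn.Module):
    """Illustrative BART-style encoder-decoder; hyperparameters are
    placeholders, not the configuration of this checkpoint."""
    def __init__(self, vocab_size=50_000, d_model=768, nhead=12,
                 num_layers=6, max_len=1024):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # BART uses learned positions
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)

    def _embed(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        return self.tok_emb(ids) + self.pos_emb(pos)

    def forward(self, src_ids, tgt_ids):
        # Causal mask: each decoder position attends only to earlier positions.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(
            tgt_ids.size(1)).to(tgt_ids.device)
        hidden = self.transformer(self._embed(src_ids), self._embed(tgt_ids),
                                  tgt_mask=tgt_mask)
        return self.lm_head(hidden)  # (batch, tgt_len, vocab_size)
```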

## Size

The model has `174M` parameters.
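Assuming a `model` instance (for example, the sketch in the Goal section), the count can be verified with a one-liner:

```python
# Should report roughly 174M for the real configuration.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")
```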

## Task
Abstractive Summarization in Arabic. 

## Data
The dataset used is the [XL-Sum (Arabic subset)](https://github.com/csebuetnlp/xl-sum?tab=readme-ov-file#:~:text=Arabic,Download), originally sourced from [BBC Arabic](https://www.bbc.com/arabic). I chose it because it's well-suited to the task and written entirely in Arabic. A minimal loading sketch follows the split sizes below.

- Features (columns):
  - `text`: the full article (source sequences).
  - `summary`: the summary of the text (target sequences).

- Size:
  - train: `32,473 rows`.
  - validation: `4,689 rows`.
  - test: `4,689 rows`.
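XL-Sum is also mirrored on the Hugging Face Hub; assuming the `datasets` library and the `csebuetnlp/xlsum` mirror, the same splits can be fetched like this:

```python
from datasets import load_dataset

# Arabic configuration of XL-Sum from the Hugging Face Hub.
ds = load_dataset("csebuetnlp/xlsum", "arabic")
print(ds)  # DatasetDict with train/validation/test splits

example = ds["train"][0]
print(example["text"][:200])   # source article
print(example["summary"])      # target summary
```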

## Results

| Epoch | Loss (train) | Loss (validation) | Epoch Time (hours) | Total Training Time (hours) |  Device  |
|:-----:|:------------:|:-----------------:|:------------------:|:---------------------------:|:--------:|
|   1   |    10.03     |       9.72        |        0.23        |             1.1             | 1 x L40S |
|   2   |     9.61     |       9.44        |        0.22        |             1.1             | 1 x L40S |
|   3   |     9.36     |       9.22        |        0.22        |             1.1             | 1 x L40S |
|   4   |     9.16     |       9.05        |        0.22        |             1.1             | 1 x L40S |
|   5   |     9.01     |       8.92        |        0.22        |             1.1             | 1 x L40S |
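The losses above are mean token-level cross-entropy per epoch. As a rough illustration of how such numbers are produced, here is a skeleton per-epoch loop; the batch keys, teacher forcing layout, and padding handling are assumptions, not the repository's actual training code.

```python
import contextlib
import torch
import torch.nn.functional as F

def run_epoch(model, loader, optimizer=None, device="cuda"):
    """One pass over the data, returning mean token-level cross-entropy.
    Trains if an optimizer is given; evaluates otherwise."""
    model.train(optimizer is not None)
    ctx = contextlib.nullcontext() if optimizer is not None else torch.no_grad()
    total, steps = 0.0, 0
    with ctx:
        for batch in loader:
            src = batch["input_ids"].to(device)   # tokenized article
            tgt = batch["labels"].to(device)      # tokenized summary
            logits = model(src, tgt[:, :-1])      # teacher forcing
            loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                   tgt[:, 1:].reshape(-1))
            if optimizer is not None:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total, steps = total + loss.item(), steps + 1
    return total / steps
```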


## License

This model is licensed under the `MIT` License.