---
license: mit
language:
- ar
metrics:
- accuracy
pipeline_tag: summarization
library_name: PyTorch
tags:
- PyTorch
- Arabic
- Abstractive-Summarization
- 174M
- Scratch
- Base
---
# Arab Bart

An implementation of the [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461) paper from scratch in `PyTorch`, for an abstractive summarization task in Arabic.

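BART pre-trains by corrupting text and training the model to reconstruct the original; its main corruption scheme is text infilling, where contiguous spans are replaced by a single mask token. A minimal sketch of that idea follows — the span lengths and masking ratio are simplified placeholders (the paper samples span lengths from a Poisson distribution), not this card's actual code:

```python
import random

def text_infilling(tokens, mask_ratio=0.3, mask_token="<mask>", seed=0):
    """Corrupt a token sequence BART-style: replace contiguous spans
    with a single <mask> token until ~mask_ratio of tokens are masked."""
    rng = random.Random(seed)
    tokens = list(tokens)
    to_mask = int(len(tokens) * mask_ratio)
    masked = 0
    while masked < to_mask and len(tokens) > 1:
        # short uniform spans for the sketch; BART uses Poisson(lambda=3)
        span = min(rng.randint(1, 3), to_mask - masked, len(tokens))
        start = rng.randrange(0, len(tokens) - span + 1)
        tokens[start:start + span] = [mask_token]
        masked += span
    return tokens

source = "the quick brown fox jumps over the lazy dog".split()
corrupted = text_infilling(source)
```

During pre-training, the decoder learns to regenerate the uncorrupted sequence from the corrupted one.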
## Goal

Reproduce the BART model from scratch to understand its architecture in depth, using the minimum available resources.

## Size

The model size is `174M` parameters.

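The card does not list the hyperparameters behind the `174M` figure, but a back-of-envelope count shows how totals like this arise for a BART-style encoder-decoder. The configuration below uses BART-base values from the paper purely as an illustration — it lands near 138M; a larger vocabulary or wider/deeper layers would push the count toward 174M:

```python
def bart_param_count(vocab, d_model, enc_layers, dec_layers, d_ffn, max_pos=1024):
    """Rough parameter count for a BART-style encoder-decoder
    (ignores biases and layer-norm parameters)."""
    embeddings = vocab * d_model + max_pos * d_model  # token + learned position embeddings
    attn = 4 * d_model * d_model                      # Q, K, V and output projections
    ffn = 2 * d_model * d_ffn                         # up- and down-projections
    enc = enc_layers * (attn + ffn)                   # self-attention + FFN per encoder layer
    dec = dec_layers * (2 * attn + ffn)               # self- + cross-attention + FFN per decoder layer
    return embeddings + enc + dec

# BART-base-like configuration: roughly 138M parameters
total = bart_param_count(vocab=50265, d_model=768, enc_layers=6, dec_layers=6, d_ffn=3072)
```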
## Task

Abstractive summarization in Arabic.

## Data

The dataset is the [XL-Sum (Arabic subset)](https://github.com/csebuetnlp/xl-sum?tab=readme-ov-file#:~:text=Arabic,Download) dataset, originally sourced from [BBC Arabic](https://www.bbc.com/arabic). I chose it because it fits the task well and is written in pure Arabic.

- Features (columns):
  - `text`: the full article (source sequences).
  - `summary`: the summary of the article (target sequences).

- Size:
  - train: `32,473` rows.
  - validation: `4,689` rows.
  - test: `4,689` rows.

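Each row pairs a full `text` with its `summary`; during training these become (source, target) sequence pairs. A minimal sketch of that mapping follows — the length limits and whitespace tokenization are hypothetical placeholders, not the card's actual preprocessing (a real run would use a subword tokenizer):

```python
def make_pair(text, summary, max_src_tokens=512, max_tgt_tokens=128):
    """Turn one dataset row into a (source, target) pair,
    truncating by whitespace tokens for illustration only."""
    source = " ".join(text.split()[:max_src_tokens])
    target = " ".join(summary.split()[:max_tgt_tokens])
    return {"source": source, "target": target}

pair = make_pair(
    text="النص الكامل للمقال من بي بي سي عربي",  # full article text (source)
    summary="ملخص قصير للمقال",                  # its summary (target)
)
```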
## Results

| Epoch | Loss (train) | Loss (validation) | Epoch Time (hours) | Training Time (hours) | Device   |
|:-----:|:------------:|:-----------------:|:------------------:|:---------------------:|:--------:|
| 1     | 10.03        | 9.72              | 0.23               | 1.1                   | 1 x L40S |
| 2     | 9.61         | 9.44              | 0.22               | 1.1                   | 1 x L40S |
| 3     | 9.36         | 9.22              | 0.22               | 1.1                   | 1 x L40S |
| 4     | 9.16         | 9.05              | 0.22               | 1.1                   | 1 x L40S |
| 5     | 9.01         | 8.92              | 0.22               | 1.1                   | 1 x L40S |

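For intuition, the validation losses can be converted to perplexities via `exp(loss)`, assuming they are average per-token cross-entropies in nats (the card does not state this). The perplexities fall steadily across epochs but remain high, consistent with a model trained for only five epochs on minimal resources:

```python
import math

# Reported validation losses per epoch (from the results table)
val_losses = {1: 9.72, 2: 9.44, 3: 9.22, 4: 9.05, 5: 8.92}

# Perplexity = exp(cross-entropy), assuming a per-token loss in nats
perplexities = {epoch: math.exp(loss) for epoch, loss in val_losses.items()}
```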
## Usage

```python

```

## License

This model is licensed under the `MIT` License.