Update README.md
README.md
@@ -32,16 +32,18 @@ This is a News Categorisation model for Setswana.
 
 ### News Categories
 
-
-
-
-
-
-
-
-
-
-
+We use the IPTC news codes [https://iptc.org/standards/newscodes/](https://iptc.org/standards/newscodes/)
+
+0. arts_culture_entertainment_and_media (Botsweretshi, setso, boitapoloso le bobegakgang)
+1. crime_law_and_justice (Bosenyi, molao le bosiamisi)
+2. disaster_accident_and_emergency_incident (Masetlapelo, kotsi le tiragalo ya maemo a tshoganyetso)
+3. economy_business_and_finance (Ikonomi, tsa kgwebo le tsa ditšhelete)
+4. education (Thuto)
+5. environment (Tikologo)
+6. health (Boitekanelo)
+7. politics (Dipolotiki)
+8. religion_and_belief (Bodumedi le tumelo)
+9. society (Setšhaba)
 
 ### Model Performance
 
@@ -51,8 +53,8 @@ Performance of models on Daily News Dikgang dataset
 |-----------------------------|--------------------------------------|-------------------|
 | Logistic Regression + TFIDF | 60.1 | 56.2 |
 | NCHLT TSN RoBERTa | 64.7 | 60.3 |
-| PuoBERTa | 63.8
-| PuoBERTaJW300 |
+| PuoBERTa | **63.8** | **62.9** |
+| PuoBERTaJW300 | 66.2 | *65.4* |
 
 ### Usage
 
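The numbered category list added in this change pairs each class index (0 to 9) with an IPTC-style label. A minimal Python sketch of that mapping follows, assuming the classifier's output indices follow the list order shown in the diff. The names `ID2LABEL`, `LABEL2ID`, and `label_for` are hypothetical and do not come from the repository.

```python
# Index-to-label mapping copied from the category list in this diff.
# Assumption: the model's predicted class indices follow this order.
ID2LABEL = {
    0: "arts_culture_entertainment_and_media",
    1: "crime_law_and_justice",
    2: "disaster_accident_and_emergency_incident",
    3: "economy_business_and_finance",
    4: "education",
    5: "environment",
    6: "health",
    7: "politics",
    8: "religion_and_belief",
    9: "society",
}

# Reverse lookup, useful when encoding gold labels for evaluation.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the category label for a predicted class index (hypothetical helper)."""
    try:
        return ID2LABEL[class_id]
    except KeyError:
        raise ValueError(f"unknown class id: {class_id}") from None
```

A lookup such as `label_for(4)` would give `education` under this assumed ordering; an index outside 0 to 9 raises `ValueError` rather than returning a default.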