---
license: apache-2.0
datasets:
- AyoubChLin/CNN_News_Articles_2011-2022
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
---
## DistilBertForSequenceClassification on CNN News Dataset

This repository contains a DistilBERT base model fine-tuned for sequence classification on the CNN News dataset. The model classifies news articles into one of six categories: business, entertainment, health, news, politics, and sport.

The model was fine-tuned for four epochs, achieving a training loss of 0.052900, a validation loss of 0.257164, and a validation accuracy of 0.960415.
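
The exact training script is not published in this repository. As a purely illustrative sketch, a run like the one described above could be set up with the Transformers `Trainer` API roughly as follows; apart from the dataset name and the four epochs, every setting here (including the dataset's column and split names) is an assumption:

```python
# Illustrative sketch only: apart from the dataset name and the
# four epochs reported above, every setting is an assumption.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DistilBertForSequenceClassification,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("AyoubChLin/CNN_News_Articles_2011-2022")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Assumes the dataset exposes a "text" column; adjust to the actual schema.
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=6,  # business, entertainment, health, news, politics, sport
)

args = TrainingArguments(
    output_dir="distilbert_cnn_news",
    num_train_epochs=4,              # epoch count from the results above
    per_device_train_batch_size=16,  # assumed
    evaluation_strategy="epoch",     # assumed
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],  # assumes a "test" split exists
    tokenizer=tokenizer,             # enables dynamic padding via the default collator
)
trainer.train()
```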

### Model Description

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) for topic classification of CNN news articles published between 2011 and 2022.

- **Developed by:** [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/) & [BOUBEKRI Faycal](https://www.linkedin.com/in/faycal-boubekri-832848199/)
- **Shared by:** HuggingFace
- **Model type:** Language model
- **Language(s) (NLP):** en
- **Finetuned from model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)

## Usage

You can use this model with the Hugging Face Transformers library to classify news articles into the six categories listed above.

Here is an example of how to use this model for text classification in Python:

```python
import torch
from transformers import AutoTokenizer, DistilBertForSequenceClassification

model_name = "AyoubChLin/distilbert_cnn_news"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = DistilBertForSequenceClassification.from_pretrained(model_name)

# Tokenize the input text and return PyTorch tensors
text = "This is a news article about politics."
inputs = tokenizer(text, padding=True, truncation=True, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring class and map it to its category name
predicted_class_id = logits.argmax().item()
predicted_label = model.config.id2label[predicted_class_id]
print(predicted_label)
```
In this example, we first load the tokenizer and the model using their respective `from_pretrained` methods. We then tokenize a news article, run it through the model without gradient tracking, take the `argmax` of the logits to get the predicted class id, and finally map that id to its category name via the model's `id2label` configuration.
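
For quick experiments, you can also load the model through the Transformers `pipeline` API, which bundles the tokenization, inference, and label-mapping steps above into a single call. A minimal sketch (the exact label string in the output depends on the model's configured `id2label` mapping):

```python
from transformers import pipeline

# The text-classification pipeline wraps tokenizer, model, and label mapping.
classifier = pipeline("text-classification", model="AyoubChLin/distilbert_cnn_news")

result = classifier("This is a news article about politics.")
print(result)  # e.g. [{'label': 'politics', 'score': 0.99}]; label names depend on id2label
```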

### Contributors

This model was fine-tuned by CHERGUELAINE Ayoub and BOUBEKRI Faycal.