carbonnnnn/T2L1DISTILBERT

carbonnnnn committed on Jun 10, 2023

Commit a6250b2 · 1 Parent(s): 9d81a58

Create README.md

Files changed (1)

README.md +64 -0


	@@ -0,0 +1,64 @@

+---
+language: en
+tags:
+- exbert
+license: apache-2.0
+datasets:
+- bookcorpus
+- wikipedia
+---
+# Finetuned DistilBERT
+This model is a distilled version of the [BERT base model](https://huggingface.co/bert-base-uncased). It was
+introduced in [this paper](https://arxiv.org/abs/1910.01108). The code for the distillation process can be found
+[here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is uncased: it does
+not distinguish between english and English.
+## Model description
+DistilBERT is a transformers model, smaller and faster than BERT, which was pretrained on the same corpus in a
+self-supervised fashion, using the BERT base model as a teacher. This model is further fine-tuned on the DB_PEDIA dataset, which can be found
+[here](https://huggingface.co/datasets/DeveloperOats/DBPedia_Classes). This dataset consists of 342,782 Wikipedia articles that have been cleaned and classified into hierarchical classes.
+The classification system spans three levels, with 9 classes at the first level, 70 classes at the second level,
+and 219 classes at the third level.
+## Intended uses & limitations
+You can use the model to extract structured content and organize it into taxonomic categories.
+### How to use
+You can use this model directly with a pipeline:
+```python
+from transformers import pipeline
+import numpy as np
+
+text = "This was a masterpiece. Not completely faithful to the books, but enthralling from beginning to end. Might be my favorite of the three."
+classifier = pipeline("text-classification", model="carbonnnnn/T2L1DISTILBERT")
+
+# Level-1 class names, indexed to match LABEL_0 ... LABEL_8
+labeltxt = np.loadtxt("TASK2/label_vals/l1.txt", dtype="str")
+labelint = ['LABEL_0', 'LABEL_1', 'LABEL_2', 'LABEL_3', 'LABEL_4', 'LABEL_5', 'LABEL_6', 'LABEL_7', 'LABEL_8']
+
+# Take the top prediction and print the class name for its label ID
+output = classifier(text)[0]['label']
+for i in range(len(labelint)):
+    if output == labelint[i]:
+        print("Output is : " + str(labeltxt[i]))
+```
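The label lookup in the snippet above could also be done by parsing the index out of the label ID. This is a minimal sketch, not the author's method: it assumes the pipeline returns IDs of the form `LABEL_<n>` (as the snippet implies) and that class names are ordered by that index. The helper `label_id_to_name` and the placeholder class names are hypothetical; the real names would come from the label file. No model is loaded here, and the `prediction` dict only imitates the shape of one pipeline result with made-up values.

```python
def label_id_to_name(label_id, class_names):
    """Map a label ID such as 'LABEL_3' to class_names[3]."""
    prefix = "LABEL_"
    if not label_id.startswith(prefix):
        raise ValueError(f"unexpected label format: {label_id!r}")
    index = int(label_id[len(prefix):])
    if not 0 <= index < len(class_names):
        raise IndexError(f"label index {index} is outside the class list")
    return class_names[index]


# Placeholder names (assumption): replace with the nine level-1 class names
# loaded from the label file.
class_names = [f"class_{i}" for i in range(9)]

# Imitates one element of the pipeline output; the values are made up.
prediction = {"label": "LABEL_3", "score": 0.97}

print("Output is : " + label_id_to_name(prediction["label"], class_names))
```

Parsing the index avoids keeping a second hand-written list of label IDs in sync with the class names, and the explicit checks fail loudly if the model ever returns an ID outside the expected range.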
+### Limitations and bias
+Even if the training data used for this model could be characterized as fairly neutral, this model can have biased
+predictions. It also inherits some of
+[the bias of its teacher model](https://huggingface.co/bert-base-uncased#limitations-and-bias).
+## Evaluation results
+### BibTeX entry and citation info
+<a href="https://huggingface.co/exbert/?model=distilbert-base-uncased">
+	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
+</a>