Update README.md
README.md
CHANGED

---
license: apache-2.0
datasets:
- MoritzLaurer/synthetic_zeroshot_mixtral_v0.1
- knowledgator/gliclass-v1.0
- fancyzhx/amazon_polarity
- cnmoro/QuestionClassification
- Arsive/toxicity_classification_jigsaw
- shishir-dwi/News-Article-Categorization_IAB
- SetFit/qnli
- nyu-mll/multi_nli
- SetFit/student-question-categories
- SetFit/tweet_sentiment_extraction
- SetFit/hate_speech18
- saattrupdan/doc-nli

language:
- en
- fr
- ge
metrics:
- f1
pipeline_tag: zero-shot-classification
tags:
- text classification
- zero-shot
- small language models
- RAG
- sentiment analysis
---

# ⭐ GLiClass: Generalist and Lightweight Model for Sequence Classification

This is an efficient zero-shot classifier inspired by the [GLiNER](https://github.com/urchade/GLiNER/tree/main) work. It demonstrates the same performance as a cross-encoder while being more compute-efficient, because classification is done in a single forward pass.

It can be used for `topic classification`, `sentiment analysis`, and as a reranker in `RAG` pipelines.

The model was trained on synthetic and licensed data that allow commercial use, so it can be used in commercial applications.

This version of the model uses a layer-wise selection of features that enables a better understanding of different levels of language. The backbone model is [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large), which effectively processes long sequences.

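
For intuition, here is a generic sketch of what layer-wise feature selection can look like: an ELMo-style learned scalar mix over the hidden states of all encoder layers. The `ScalarMix` module below is purely illustrative and is not this model's actual implementation.

```python
import torch
import torch.nn as nn

class ScalarMix(nn.Module):
    """Illustrative learned weighted sum over per-layer hidden states."""
    def __init__(self, num_layers: int):
        super().__init__()
        # One learnable weight per encoder layer, normalized with softmax at forward time.
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states):
        # hidden_states: tuple/list of [batch, seq_len, dim] tensors, one per layer.
        weights = torch.softmax(self.layer_weights, dim=0)
        stacked = torch.stack(tuple(hidden_states), dim=0)    # [layers, batch, seq, dim]
        return (weights.view(-1, 1, 1, 1) * stacked).sum(0)   # [batch, seq, dim]
```
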
### How to use:
First of all, you need to install the GLiClass library:
```bash
pip install gliclass
pip install -U "transformers>=4.48.0"
```

Then you need to initialize a model and a pipeline:
```python
from gliclass import GLiClassModel, ZeroShotClassificationPipeline
from transformers import AutoTokenizer

model = GLiClassModel.from_pretrained("knowledgator/gliclass-modern-large-v2.0-init")
tokenizer = AutoTokenizer.from_pretrained("knowledgator/gliclass-modern-large-v2.0-init")

pipeline = ZeroShotClassificationPipeline(model, tokenizer, classification_type='multi-label', device='cuda:0')

text = "One day I will see the world!"
labels = ["travel", "dreams", "sport", "science", "politics"]
results = pipeline(text, labels, threshold=0.5)[0]  # because we have one text
for result in results:
    print(result["label"], "=>", result["score"])
```
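
The model can also act as a lightweight reranker in RAG pipelines. Below is a minimal sketch of that use, reusing the `pipeline` object created above; the query, passages, and label template are made-up examples, not a prescribed recipe.

```python
# Reuses the `pipeline` object from the previous snippet.
query = "What are the health benefits of green tea?"
passages = [
    "Green tea is rich in antioxidants that may reduce inflammation.",
    "The championship final was decided in extra time.",
    "Regular green tea consumption has been linked to improved heart health.",
]

# Score each passage against the query expressed as a single zero-shot label.
label = f"relevant to the question: {query}"
ranked = []
for passage in passages:
    score = pipeline(passage, [label], threshold=0.0)[0][0]["score"]
    ranked.append((score, passage))

# Highest-scoring passages go first in the RAG context.
for score, passage in sorted(ranked, reverse=True):
    print(f"{score:.3f}  {passage}")
```
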
### Benchmarks:
Below, you can see the F1 score on several text classification datasets. All tested models were not fine-tuned on those datasets and were tested in a zero-shot setting.

| Model | IMDB | AG_NEWS | Emotions |
|-----------------------------|------|---------|----------|
| [gliclass-modern-large-v2.0-init (399 M)](https://huggingface.co/knowledgator/gliclass-modern-large-v2.0-init) | 0.9137 | 0.7357 | 0.4140 |
| [gliclass-modern-base-v2.0-init (151 M)](https://huggingface.co/knowledgator/gliclass-modern-base-v2.0-init) | 0.8264 | 0.6637 | 0.2985 |
| [gliclass-large-v1.0 (438 M)](https://huggingface.co/knowledgator/gliclass-large-v1.0) | 0.9404 | 0.7516 | 0.4874 |
| [gliclass-base-v1.0 (186 M)](https://huggingface.co/knowledgator/gliclass-base-v1.0) | 0.8650 | 0.6837 | 0.4749 |
| [gliclass-small-v1.0 (144 M)](https://huggingface.co/knowledgator/gliclass-small-v1.0) | 0.8650 | 0.6805 | 0.4664 |
| [Bart-large-mnli (407 M)](https://huggingface.co/facebook/bart-large-mnli) | 0.89 | 0.6887 | 0.3765 |
| [Deberta-base-v3 (184 M)](https://huggingface.co/cross-encoder/nli-deberta-v3-base) | 0.85 | 0.6455 | 0.5095 |
| [Comprehendo (184 M)](https://huggingface.co/knowledgator/comprehend_it-base) | 0.90 | 0.7982 | 0.5660 |
| SetFit [BAAI/bge-small-en-v1.5 (33.4 M)](https://huggingface.co/BAAI/bge-small-en-v1.5) | 0.86 | 0.5636 | 0.5754 |

Below you can find a comparison with other GLiClass models:

| Dataset | gliclass-small-v1.0-lw | gliclass-base-v1.0-lw | gliclass-large-v1.0-lw | gliclass-small-v1.0 | gliclass-base-v1.0 | gliclass-large-v1.0 | gliclass-modern-base-v2.0-init | gliclass-modern-large-v2.0-init |
|----------------------|-----------------------|-----------------------|-----------------------|---------------------|---------------------|---------------------|---------------------|---------------------|
| CR | 0.8886 | 0.9097 | 0.9226 | 0.8824 | 0.8942 | 0.9219 | 0.9041 | 0.8980 |
| sst2 | 0.8392 | 0.8987 | 0.9247 | 0.8518 | 0.8979 | 0.9269 | 0.9011 | 0.9434 |
| sst5 | 0.2865 | 0.3779 | 0.2891 | 0.2424 | 0.2789 | 0.3900 | 0.1972 | 0.1123 |
| 20_news_groups | 0.4572 | 0.3953 | 0.4083 | 0.3366 | 0.3576 | 0.3863 | 0.2448 | 0.2792 |
| spam | 0.5118 | 0.5126 | 0.3642 | 0.4089 | 0.4938 | 0.3661 | 0.5074 | 0.6364 |
| rotten_tomatoes | 0.8015 | 0.8429 | 0.8807 | 0.7987 | 0.8508 | 0.8808 | 0.6630 | 0.5928 |
| financial_phrasebank | 0.8665 | 0.8880 | 0.9044 | 0.8901 | 0.8955 | 0.8735 | 0.2537 | 0.2562 |
| imdb | 0.9048 | 0.9351 | 0.9429 | 0.8982 | 0.9238 | 0.9333 | 0.8255 | 0.9137 |
| ag_news | 0.7252 | 0.6985 | 0.7559 | 0.7242 | 0.6848 | 0.7503 | 0.6050 | 0.6933 |
| dair_emotion | 0.4012 | 0.3516 | 0.3951 | 0.3450 | 0.2357 | 0.4013 | 0.2474 | 0.3746 |
| capsotu | 0.3794 | 0.4643 | 0.4749 | 0.3432 | 0.4375 | 0.4644 | 0.2929 | 0.2919 |
| **Average:** | 0.5732 | 0.6183 | 0.6165 | 0.5401 | 0.5571 | 0.6078 | 0.5129 | 0.5447 |

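
To give a rough idea of how zero-shot F1 scores like those above can be computed, here is a minimal evaluation sketch on a small IMDB sample, reusing the `pipeline` from the usage example. The label verbalizations, sample size, and split are illustrative assumptions rather than the exact protocol behind the reported numbers.

```python
# Minimal zero-shot evaluation sketch (assumed labels/sample, reuses `pipeline` from above).
from datasets import load_dataset
from sklearn.metrics import f1_score

dataset = load_dataset("imdb", split="test").shuffle(seed=42).select(range(200))  # small sample for speed
labels = ["negative", "positive"]  # assumed verbalizations; index matches the dataset's 0/1 labels

predictions = []
for example in dataset:
    # Take the highest-scoring label for each text.
    results = pipeline(example["text"], labels, threshold=0.0)[0]
    best = max(results, key=lambda r: r["score"])
    predictions.append(labels.index(best["label"]))

print("F1:", f1_score(dataset["label"], predictions, average="weighted"))
```
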