---
base_model: mini1013/master_domain
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 이브로쉐 모링가 리프레시 헤어 식초 400ml 1개 옵션없음 주식회사 다올연구소
- text: Hair Identifier Spray for Face Shaving 2024 Skin Dermaplaning Moisturizing
    and Care Dermaplaner 2 PC 옵션없음 젠틀스토어
- text: 수앤 오리진 블랙 단백질샴푸700ml,4개 옵션없음 다부자
- text: 클로란 퀴닌 에델바이스 두피 세럼 100ml 옵션없음 스루치로 유한책임회사
- text: 이브로쉐 리프레쉬 헤어식초(모링가) 400ml 옵션없음 스루치로 유한책임회사
inference: true
model-index:
- name: SetFit with mini1013/master_domain
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.6042402826855123
      name: Accuracy
---
# SetFit with mini1013/master_domain
This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [mini1013/master_domain](https://huggingface.co/mini1013/master_domain) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
The model has been trained using an efficient few-shot learning technique that involves:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
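
Step 2 amounts to fitting an ordinary scikit-learn classifier on sentence embeddings. A minimal sketch, assuming scikit-learn and NumPy are installed; the embeddings are stubbed with a toy 2-D array so the example runs without downloading the model (in practice the features come from encoding sentences with the fine-tuned Sentence Transformer body):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for sentence embeddings; in the real pipeline these are the
# fine-tuned Sentence Transformer's encodings of the training sentences.
X = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
              [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]])
y = [0, 0, 0, 1, 1, 1]

# The classification head: a plain LogisticRegression over the embeddings.
head = LogisticRegression().fit(X, y)
print(head.predict([[0.85, 0.15]])[0])  # 0
```

Because the contrastive stage pulls same-class sentences together in embedding space, even this simple linear head separates the classes well.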
## Model Details
### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [mini1013/master_domain](https://huggingface.co/mini1013/master_domain)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 8 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
### Model Labels
| Label | Examples |
|:------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 6.0 | <ul><li>'CHI 실크 인퓨전 12 Fl oz (관부가세포함) 옵션없음 제이글로벌컴퍼니'</li><li>'아모스 리페어 샤인 모이스트 에센스 100ml 옵션없음 티비'</li><li>'BAO H LAB Hair Loss Care Ampoule 바오에이치랩 탈모케어앰플 옵션없음 주식회사 바오젠'</li></ul> |
| 7.0 | <ul><li>'커리쉴 프레스티지 실키 3종 옵션없음 (주)커리쉴'</li><li>'미쟝센 퍼펙트 매직 스트레이트 샴푸&트리트먼트&세럼 3종 세트+트리트먼트 30ml 아모레퍼시픽'</li><li>'[르도암 공식]르도암 카멜리아 헤어 2종 세트(샴푸+트리트먼트) LEDOAM1935'</li></ul> |
| 0.0 | <ul><li>'실키드 검은콩 코팅 탈모펜슬™ / 머리숱앰플 두피앰플 산후탈모 서리태 비건 에센스 홈 1개 (1개월) 탈모펜슬™ 주식회사 팀오브라만차(Team of la mancha Corp.)'</li><li>'에버미라클 200ml EM 풀라무 토너 스칼프 토닉 8W98E7F225 옵션없음 파워몰'</li><li>'포티샤 모발강화 두피세럼 100ml/르네휘테르 옵션없음 롯데쇼핑(주)'</li></ul> |
| 4.0 | <ul><li>'[클렌징대전(클렌징밤 )] 로픈 바오밥 세라마이드LPP 프리미엄 헤어트리트먼트 베이비파우더향 1000g 옵션없음 (주)우신뷰티'</li><li>'허벌리스테 헤어 리페어세럼 150ml 1개 + 헤어 마스크 500ml - 1개 옵션없음 복슬강아지'</li><li>'[백화점 정품] 모로칸오일 오리지널 오일 트리트먼트 100ml 제3자 배송관련 개인정보활용에 동의함 버니버즈'</li></ul> |
| 2.0 | <ul><li>'헤드앤숄더 시트러스 레몬 샴푸 750ml 옵션없음 포에이치제이'</li><li>'아렌 일진 산성샴푸펌컬러 1000ml 옵션없음 해문인터내셔널'</li><li>'물없이쓰는샴푸 물없이머리감는 입원준비물 노워시 옵션없음 해피2데이'</li></ul> |
| 5.0 | <ul><li>'바이오테닉스 홈케어 매직헬프 바이-페이즈 리컨디셔너 60ml 비너스 클리닉 옵션없음 주식회사 위즈온컴퍼니'</li><li>'[바이레도] 블랑쉬 헤어퍼퓸 75ml 화이트_F 푸치코리아 유한책임회사'</li><li>'바이레도 집시 워터 헤어퍼퓸 75ml 백화점 상품 옵션없음 코코스팜'</li></ul> |
| 1.0 | <ul><li>'케라시스린스 퍼퓸 체리블라썸 1000ml 옵션없음 땡그리나'</li><li>'[갤러리아] [비건 NEW] 진저 스캘프 케어 대용량 컨디셔너 400ML(한화갤러리아㈜ 광교점) 옵션없음 한화갤러리아(주)'</li><li>'케라시스 스위트 앤 플라워리 퍼퓸 린스 1L 옵션없음 해피쭈몰'</li></ul> |
| 3.0 | <ul><li>'모비88 아데노신 특허등록 탈모토닉 볼륨업 비듬 제거 옵션없음 달이커머스'</li><li>'힐텀 어성초 맥주효모 토닉 120ml 옵션없음 현스 마켓'</li><li>'닥터포헤어 폴리젠 토닉 120ml x 2개 두피 영양공급 탈모증상완화 영양제 코스트코 옵션없음 또또상회'</li></ul> |
## Evaluation
### Metrics
| Label | Accuracy |
|:--------|:---------|
| **all** | 0.6042 |
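
Accuracy here is the fraction of test examples whose predicted label exactly matches the gold label. A self-contained sketch with toy labels (not the real test data):

```python
# Exact-match accuracy over predicted vs. gold labels.
def accuracy(gold, pred):
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

print(accuracy([6.0, 7.0, 0.0, 4.0], [6.0, 7.0, 2.0, 4.0]))  # 0.75
```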
## Uses
### Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_bt12_test")
# Run inference
preds = model("수앤 오리진 블랙 단백질샴푸700ml,4개 옵션없음 다부자")
```
<!--
### Downstream Use
*List how someone could finetune this model on their own dataset.*
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Set Metrics
| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count | 4 | 9.25 | 21 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0.0 | 12 |
| 1.0 | 23 |
| 2.0 | 19 |
| 3.0 | 14 |
| 4.0 | 18 |
| 5.0 | 20 |
| 6.0 | 28 |
| 7.0 | 18 |
### Training Hyperparameters
- batch_size: (512, 512)
- num_epochs: (50, 50)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 60
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
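
The settings above map onto `setfit.TrainingArguments`. A hedged sketch of reconstructing them (argument names follow SetFit 1.x; `distance_metric` is left at its cosine-distance default, and exact import paths may differ across versions):

```python
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(512, 512),              # (embedding phase, classifier phase)
    num_epochs=(50, 50),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=60,
    body_learning_rate=(2e-05, 1e-05),  # (contrastive phase, classifier phase)
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```

The tuple-valued arguments configure the two training phases separately: the first element applies while fine-tuning the embedding body, the second while fitting the classification head.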
### Training Results
| Epoch | Step | Training Loss | Validation Loss |
|:-------:|:----:|:-------------:|:---------------:|
| 0.0556 | 1 | 0.4865 | - |
| 2.7778 | 50 | 0.3392 | - |
| 5.5556 | 100 | 0.0584 | - |
| 8.3333 | 150 | 0.0087 | - |
| 11.1111 | 200 | 0.003 | - |
| 13.8889 | 250 | 0.0002 | - |
| 16.6667 | 300 | 0.0001 | - |
| 19.4444 | 350 | 0.0001 | - |
| 22.2222 | 400 | 0.0001 | - |
| 25.0 | 450 | 0.0001 | - |
| 27.7778 | 500 | 0.0001 | - |
| 30.5556 | 550 | 0.0 | - |
| 33.3333 | 600 | 0.0 | - |
| 36.1111 | 650 | 0.0 | - |
| 38.8889 | 700 | 0.0 | - |
| 41.6667 | 750 | 0.0 | - |
| 44.4444 | 800 | 0.0 | - |
| 47.2222 | 850 | 0.0 | - |
| 50.0 | 900 | 0.0 | - |
### Framework Versions
- Python: 3.10.12
- SetFit: 1.1.0
- Sentence Transformers: 3.3.1
- Transformers: 4.44.2
- PyTorch: 2.2.0a0+81ea7a4
- Datasets: 3.2.0
- Tokenizers: 0.19.1
## Citation
### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->