---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# bert_key

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("VegetaSama/bert_key")

topic_model.get_topic_info()
```
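
Beyond inspecting the topic table, a loaded model can assign topics to unseen text through `transform`. The following is a minimal sketch rather than part of the original card: the helper names `label_documents` and `demo` and the sample reviews are hypothetical, and it assumes the embedding model can be resolved when the repository is loaded (if not, one may need to be supplied through the `embedding_model` argument of `BERTopic.load`).

```python
def label_documents(topic_model, docs):
    """Pair each document with the topic ID that the model assigns to it."""
    # transform() returns (topics, probabilities); only the topic IDs are used here.
    topics, _probs = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


def demo():
    # Imported lazily so that label_documents can be reused without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load("VegetaSama/bert_key")
    # Made-up sample reviews, for illustration only.
    docs = [
        "The carne asada tacos and the salsa were excellent.",
        "Great trail with views of the valley, go early to avoid the heat.",
    ]
    for doc, topic in label_documents(topic_model, docs):
        print(topic, doc)
```

Calling `demo()` downloads the model from the Hub. A topic ID of -1 means the document was treated as an outlier.
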

## Topic overview

* Number of topics: 17
* Number of training documents: 10000

<details>
<summary>Click here for an overview of all topics.</summary>

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | restaurant - meal - sandwich - food - lunch | 86 | -1_restaurant_meal_sandwich_food |
| 0 | restaurant - drinks - dinner - bar - steak | 2059 | 0_restaurant_drinks_dinner_bar |
| 1 | mexican food - tacos - taco - chips salsa - salsa | 2789 | 1_mexican food_tacos_taco_chips salsa |
| 2 | shop - shopping - nordstrom - store - customer service | 731 | 2_shop_shopping_nordstrom_store |
| 3 | thai food - chinese food - pad thai - thai - fried rice | 701 | 3_thai food_chinese food_pad thai_thai |
| 4 | best pizza - pizza good - good pizza - pizza - pizzeria | 594 | 4_best pizza_pizza good_good pizza_pizza |
| 5 | scottsdale - phoenix - restaurant - bbq - arizona | 586 | 5_scottsdale_phoenix_restaurant_bbq |
| 6 | burger - good burger - burgers - burger fries - restaurant | 443 | 6_burger_good burger_burgers_burger fries |
| 7 | restaurant - hostess - dinner - waiter - waitress | 354 | 7_restaurant_hostess_dinner_waiter |
| 8 | best sushi - sushi - sushi place - sushi bar - spicy tuna | 321 | 8_best sushi_sushi_sushi place_sushi bar |
| 9 | manicure - massage - pedicure - salon - nail | 294 | 9_manicure_massage_pedicure_salon |
| 10 | hotels - hotel - resort - marriott - amenities | 288 | 10_hotels_hotel_resort_marriott |
| 11 | coffee shop - coffee - starbucks - coffee shops - good coffee | 215 | 11_coffee shop_coffee_starbucks_coffee shops |
| 12 | breakfast - pancakes - protein pancakes - bakery - lunch | 211 | 12_breakfast_pancakes_protein pancakes_bakery |
| 13 | hike - hiking - trails - trail - south mountain | 135 | 13_hike_hiking_trails_trail |
| 14 | downtown phoenix - central phoenix - restaurants - phoenix area - phoenix | 105 | 14_downtown phoenix_central phoenix_restaurants_phoenix area |
| 15 | vets - vet - veterinary - pets - petsmart | 88 | 15_vets_vet_veterinary_pets |

</details>
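
The frequencies in the table account for every training document, with topic -1 acting as BERTopic's outlier bucket. As a small worked example, the sketch below copies the counts from the table and computes each topic's share. The names `TOPIC_FREQUENCY` and `share` are hypothetical and introduced only for illustration.

```python
# Topic frequencies copied from the table above (topic ID -> document count).
TOPIC_FREQUENCY = {
    -1: 86, 0: 2059, 1: 2789, 2: 731, 3: 701, 4: 594, 5: 586, 6: 443,
    7: 354, 8: 321, 9: 294, 10: 288, 11: 215, 12: 211, 13: 135, 14: 105, 15: 88,
}

total = sum(TOPIC_FREQUENCY.values())


def share(topic_id):
    """Fraction of training documents assigned to a topic."""
    return TOPIC_FREQUENCY[topic_id] / total


print(total)                # 10000, the number of training documents
print(f"{share(-1):.2%}")   # 0.86% of documents are outliers
print(f"{share(1):.2%}")    # 27.89% fall into the largest topic
```
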

## Training hyperparameters

* calculate_probabilities: True
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 5
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None
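
For readers who want to configure a comparable model, the hyperparameters above map onto keyword arguments of the `BERTopic` constructor. This is a sketch under assumptions, not the author's training script: `HYPERPARAMS` and `build_model` are hypothetical names, the `zeroshot_*` arguments assume a BERTopic release that supports zero-shot topics, and the embedding, UMAP, HDBSCAN, and vectorizer sub-models used for this card are not documented here, so library defaults would apply.

```python
# Hyperparameters copied from the list above.
# "language: None" is deliberately left out: it likely indicates that a custom
# embedding model was supplied, and that model is not recorded in this card.
HYPERPARAMS = {
    "calculate_probabilities": True,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 5,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model():
    """Create an unfitted BERTopic model with the settings listed above."""
    # Imported lazily so the configuration can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS)
```

A model built this way still has to be fitted with `fit_transform` on a corpus. Without the original documents and sub-models, the resulting topics should not be expected to match the table above.
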

## Framework versions

* Numpy: 1.24.3
* HDBSCAN: 0.8.33
* UMAP: 0.5.5
* Pandas: 2.0.3
* Scikit-Learn: 1.3.0
* Sentence-transformers: 2.2.2
* Transformers: 4.32.1
* Numba: 0.58.1
* Plotly: 5.9.0
* Python: 3.11.5
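
Loading a saved model in a different environment can be sensitive to library versions. As a hedged convenience, the sketch below compares the installed packages with the versions listed above using only the standard library. `EXPECTED` and `compare_versions` are hypothetical names, the keys are the PyPI distribution names for the listed libraries, and the Python version itself is not checked.

```python
from importlib import metadata

# Versions copied from the list above, keyed by PyPI distribution name.
EXPECTED = {
    "numpy": "1.24.3",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "2.0.3",
    "scikit-learn": "1.3.0",
    "sentence-transformers": "2.2.2",
    "transformers": "4.32.1",
    "numba": "0.58.1",
    "plotly": "5.9.0",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in compare_versions().items():
    print(f"{name}: expected {wanted}, found {installed}")
```

A mismatch does not necessarily prevent the model from loading; the report is only a starting point when results differ from those described in this card.
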