putazon committed
Commit 09813cd · verified · 1 Parent(s): a1a4713

Update README.md

Files changed (1): README.md (+78 −69)
README.md CHANGED
@@ -1,69 +1,78 @@
- ---
- library_name: transformers
- license: apache-2.0
- base_model: bert-base-cased
- tags:
- - generated_from_trainer
- metrics:
- - precision
- - recall
- - f1
- - accuracy
- model-index:
- - name: bert-finetuned-ner
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # bert-finetuned-ner
-
- This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0005
- - Precision: 0.9999
- - Recall: 0.9999
- - F1: 0.9999
- - Accuracy: 0.9999
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: linear
- - num_epochs: 3
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
- |:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
- | 0.0011 | 1.0 | 12867 | 0.0009 | 0.9999 | 0.9999 | 0.9999 | 0.9999 |
- | 0.002 | 2.0 | 25734 | 0.0004 | 0.9999 | 0.9999 | 0.9999 | 0.9999 |
- | 0.0005 | 3.0 | 38601 | 0.0005 | 0.9999 | 0.9999 | 0.9999 | 0.9999 |
-
-
- ### Framework versions
-
- - Transformers 4.48.1
- - Pytorch 2.5.1+cu124
- - Datasets 3.2.0
- - Tokenizers 0.21.0
+ ---
+ library_name: transformers
+ license: mit
+ base_model: bert-base-cased
+ tags:
+ - generated_from_trainer
+ metrics:
+ - precision
+ - recall
+ - f1
+ - accuracy
+ model-index:
+ - name: searchqueryner-be
+   results: []
+ datasets:
+ - putazon/searchqueryner-100k
+ language:
+ - en
+ - es
+ pipeline_tag: token-classification
+ ---
+
+ # bert-finetuned-ner
+
+ This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the [SearchQueryNER-100k](https://huggingface.co/datasets/putazon/searchqueryner-100k) dataset. It achieves the following results on the evaluation set:
+ - Loss: 0.0005
+ - Precision: 0.9999
+ - Recall: 0.9999
+ - F1: 0.9999
+ - Accuracy: 0.9999
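+
+ These span-level metrics are typically produced with `seqeval` (an assumption; the card does not name the metric implementation, and the labels below are illustrative):
+
+ ```python
+ import evaluate
+
+ # seqeval scores entities at the span level (precision/recall/F1) and tokens for accuracy.
+ seqeval = evaluate.load("seqeval")
+ predictions = [["B-LOC", "I-LOC", "O"]]  # illustrative label sequences
+ references = [["B-LOC", "I-LOC", "O"]]
+ print(seqeval.compute(predictions=predictions, references=references))
+ ```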
+
+ ## Model description
+
+ This model is fine-tuned for Named Entity Recognition (NER) on search queries, making it well suited to extracting structured entities and user intent from short texts. Training used the SearchQueryNER-100k dataset, which covers 13 entity types.
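+
+ A minimal inference sketch; the repo id `putazon/searchqueryner-be` is assumed from the `model-index` name above, and the query is illustrative:
+
+ ```python
+ from transformers import pipeline
+
+ # Token-classification pipeline; aggregation_strategy="simple" merges
+ # B-/I- word pieces back into whole entity spans.
+ ner = pipeline(
+     "token-classification",
+     model="putazon/searchqueryner-be",  # assumed repo id
+     aggregation_strategy="simple",
+ )
+
+ print(ner("cheap hotels in barcelona with pool"))
+ ```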
35
+
36
+ ## Intended uses & limitations
37
+
38
+ ### Intended uses:
39
+ - Extracting named entities such as locations, professions, and attributes from user search queries.
40
+ - Optimizing search engines by improving query understanding.
41
+
42
+ ### Limitations:
43
+ - The model may not generalize well to domains outside of search queries.
44
+
45
+ ## Training and evaluation data
46
+
47
+ The training and evaluation data were sourced from the [SearchQueryNER-100k](https://huggingface.co/putazon/searchqueryner-100k) dataset. The dataset includes tokenized search queries annotated with 13 entity types, divided into training, validation, and test sets:
48
+ - **Training set:** 102,931 examples
49
+ - **Validation set:** 20,420 examples
50
+ - **Test set:** 20,301 examples
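+
+ A brief loading sketch; the split names `train`/`validation`/`test` are assumed from the counts above:
+
+ ```python
+ from datasets import load_dataset
+
+ ds = load_dataset("putazon/searchqueryner-100k")
+ print(ds)  # expected: ~102,931 train / ~20,420 validation / ~20,301 test examples
+ ```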
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (a sketch mapping them onto `TrainingArguments` follows the list):
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: ADAMW_TORCH with betas=(0.9,0.999), epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 3
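+
+ A minimal sketch mapping these values onto `transformers.TrainingArguments`; `output_dir` is a placeholder, and the betas/epsilon listed above are the `adamw_torch` defaults:
+
+ ```python
+ from transformers import TrainingArguments
+
+ args = TrainingArguments(
+     output_dir="searchqueryner-be",  # placeholder
+     learning_rate=2e-5,
+     per_device_train_batch_size=8,
+     per_device_eval_batch_size=8,
+     seed=42,
+     optim="adamw_torch",  # AdamW with betas=(0.9, 0.999), eps=1e-8 by default
+     lr_scheduler_type="linear",
+     num_train_epochs=3,
+ )
+ ```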
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
+ |:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
+ | 0.0011 | 1.0 | 12867 | 0.0009 | 0.9999 | 0.9999 | 0.9999 | 0.9999 |
+ | 0.0020 | 2.0 | 25734 | 0.0004 | 0.9999 | 0.9999 | 0.9999 | 0.9999 |
+ | 0.0005 | 3.0 | 38601 | 0.0005 | 0.9999 | 0.9999 | 0.9999 | 0.9999 |
+
+ ### Framework versions
+
+ - Transformers 4.48.1
+ - Pytorch 2.5.1+cu124
+ - Datasets 3.2.0
+ - Tokenizers 0.21.0