---
license: apache-2.0
language:
- en
library_name: gliner
pipeline_tag: token-classification
---
# GLiNER-Large (Reproduction)
This model is a reproduction of GLiNER-large; the training hyperparameters differ from those of the original model.
# Hyperparameters
The training hyperparameters are detailed in `deberta.yaml`.
In addition to the configuration in `deberta.yaml`, I manually set `lr_scheduler_type` to `cosine_with_min_lr` and `lr_scheduler_kwargs` to `{"min_lr_rate": 0.01}` in `train.py`:
```python
training_args = TrainingArguments(
    ...
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr_rate": 0.01},
    ...
)
```
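With this setting, the cosine schedule decays the learning rate to 1% of its peak value (`min_lr_rate * learning_rate`) instead of decaying all the way to zero.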
NOTE: The results are not perfectly stable; I suspect the random shuffling of the dataset is the cause.
# Weights
Two checkpoints are provided: one after 4k iterations, which has the best zero-shot evaluation performance, and one after full training (10k iterations).
| Model | Link | AI | Literature | Music | Politics | Science | Movie | Restaurant | Average |
| :--------: | :-------------------------------------------------------------------: | :---: | :--------: | :---: | :------: | :-----: | :---: | :--------: | :-----: |
| iter_4000 | [πŸ€—](https://huggingface.co/liuyanyi/gliner_large_reproduce_iter_4000) | 56.7 | 65.1 | 69.6 | 74.2 | 60.9 | 60.6 | 39.7 | 61.0 |
| iter_10000 | [πŸ€—](https://huggingface.co/liuyanyi/gliner_large_reproduce) | 55.1 | 62.9 | 68.3 | 71.6 | 57.3 | 58.4 | 40.5 | 59.2 |
| Paper | [πŸ€—](https://huggingface.co/urchade) | 57.2 | 64.4 | 69.6 | 72.6 | 62.6 | 57.2 | 42.9 | 60.9 |
# Usage
See https://github.com/urchade/GLiNER for installation and usage instructions.
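A minimal inference sketch using the `gliner` library; the example text and label set below are illustrative, not from the training data:
```python
from gliner import GLiNER

# Load the reproduced checkpoint from the Hugging Face Hub
model = GLiNER.from_pretrained("liuyanyi/gliner_large_reproduce")

# Zero-shot NER: any set of entity labels can be supplied at inference time
text = "Cristiano Ronaldo joined Al Nassr, a club based in Riyadh, in January 2023."
labels = ["person", "organization", "location", "date"]

entities = model.predict_entities(text, labels)
for entity in entities:
    print(entity["text"], "=>", entity["label"])
```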