---
license: apache-2.0
language:
- en
library_name: gliner
pipeline_tag: token-classification
---

# GLiNER-Large (Reproduction) Model

This model is a reproduction of GLiNER-large; the training hyperparameters differ from those of the original model.

# Hyperparameters

The training hyperparameters are detailed in `deberta.yaml`.

In addition to the settings in `deberta.yaml`, I manually set `lr_scheduler_type` to `cosine_with_min_lr` and `lr_scheduler_kwargs` to `{"min_lr_rate": 0.01}` in `train.py`:

```python
from transformers import TrainingArguments  # train.py in the GLiNER repo may import an equivalent class instead

training_args = TrainingArguments(
    ...
    # cosine decay with a non-zero floor instead of the default schedule
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr_rate": 0.01},
    ...
)
```

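For context, `min_lr_rate` sets the floor of the cosine schedule as a fraction of the peak learning rate, so `0.01` means the learning rate decays to about 1% of its peak instead of to zero. Below is a minimal, self-contained sketch of that schedule shape (not the exact `transformers` implementation); the warmup handling and the `peak_lr` value are illustrative only.

```python
import math

def cosine_with_min_lr_factor(step, total_steps, warmup_steps=0, min_lr_rate=0.01):
    """Learning-rate multiplier: linear warmup, then cosine decay to a floor of min_lr_rate."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    # rescale the cosine so the multiplier ends at min_lr_rate instead of 0
    return (1.0 - min_lr_rate) * cosine + min_lr_rate

peak_lr = 1e-5  # placeholder value, not taken from deberta.yaml
print(cosine_with_min_lr_factor(10_000, 10_000) * peak_lr)  # ~= 0.01 * peak_lr at the end of training
```
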
NOTE: The results are not perfectly stable across runs; I suspect the random shuffling of the dataset is the reason.

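If the variance does come from data shuffling, pinning the seeds is one way to check. The snippet below only illustrates the relevant `TrainingArguments` fields (`seed` also covers weight initialization and dropout, `data_seed` controls the sampler); it is not a setting used for the released weights.

```python
from transformers import TrainingArguments

# hypothetical reproducibility settings, not part of the released training configuration
training_args = TrainingArguments(
    output_dir="out",
    seed=42,       # global seed (model init, dropout, ...)
    data_seed=42,  # seed used for data sampling / shuffling
)
```
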
# Weights

Two sets of weights are provided: the checkpoint after 4k iterations, which has the best zero-shot evaluation performance, and the model after full training (10k iterations).

| Model | link | AI | literature | music | politics | science | movie | restaurant | Average |
| :--------: | :-------------------------------------------------------------------: | :---: | :--------: | :---: | :------: | :-----: | :---: | :--------: | :-----: |
| iter_4000 | [🤗](https://huggingface.co/liuyanyi/gliner_large_reproduce_iter_4000) | 56.7 | 65.1 | 69.6 | 74.2 | 60.9 | 60.6 | 39.7 | 61.0 |
| iter_10000 | [🤗](https://huggingface.co/liuyanyi/gliner_large_reproduce) | 55.1 | 62.9 | 68.3 | 71.6 | 57.3 | 58.4 | 40.5 | 59.2 |
| Paper | [🤗](https://huggingface.co/urchade) | 57.2 | 64.4 | 69.6 | 72.6 | 62.6 | 57.2 | 42.9 | 60.9 |

# Usage

See https://github.com/urchade/GLiNER for installation and usage instructions.
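
A minimal loading-and-prediction sketch following the upstream GLiNER README (the text and label set below are just examples):

```python
from gliner import GLiNER

# load the reproduced checkpoint from the Hub
model = GLiNER.from_pretrained("liuyanyi/gliner_large_reproduce")

text = "Steve Jobs co-founded Apple in Cupertino in 1976."
labels = ["person", "organization", "location", "date"]

# zero-shot NER over the provided label set
entities = model.predict_entities(text, labels, threshold=0.5)
for entity in entities:
    print(entity["text"], "=>", entity["label"])
```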