# ✨ NeoBERT for NER
This repository hosts a NeoBERT model that was fine-tuned on the CoNLL-2003 NER dataset.
Please note the following caveats:
* Work in progress: hyperparameters, and even the token classification implementation (`NeoBERTForTokenClassification`), may still change!
* Don't expect BERT (Base)-level performance at the moment.
## 📝 Implementation Details
A custom `NeoBERTForTokenClassification` class was implemented so that fine-tuning with Transformers is possible.
All experiments were conducted with Transformers `4.50.0.dev0`, using the [token classification example](https://github.com/huggingface/transformers/tree/main/examples/pytorch) for PyTorch.
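The custom class itself is not reproduced here, but token classification models in Transformers generally follow the same pattern: a linear layer over the encoder's per-token hidden states, with a cross-entropy loss that ignores positions labeled `-100`. A minimal sketch of that head (names and dimensions are illustrative, not the actual NeoBERT implementation):

```python
import torch
import torch.nn as nn

class TokenClassificationHead(nn.Module):
    """Sketch of the standard token classification pattern:
    dropout + linear layer over per-token encoder hidden states."""

    def __init__(self, hidden_size: int, num_labels: int, dropout: float = 0.1):
        super().__init__()
        self.num_labels = num_labels
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states, labels=None):
        # hidden_states: (batch, seq_len, hidden_size) from the encoder
        logits = self.classifier(self.dropout(hidden_states))
        loss = None
        if labels is not None:
            # -100 marks subword continuations/special tokens
            # that are excluded from the loss
            loss_fct = nn.CrossEntropyLoss(ignore_index=-100)
            loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
        return loss, logits
```

A real `NeoBERTForTokenClassification` additionally wraps the encoder itself and returns a `TokenClassifierOutput`; the sketch above only illustrates the classification layer on top.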
Example fine-tuning command:
```bash
python3 run_ner.py \
--model_name_or_path /home/stefan/Repositories/NeoBERT \
--dataset_name conll2003 \
--output_dir ./neobert-conll2003-lr1e-05-e10-bs16-1 \
--seed 1 \
--do_train \
--do_eval \
--per_device_train_batch_size 16 \
--num_train_epochs 10 \
--learning_rate 1e-05 \
--eval_strategy epoch \
--save_strategy epoch \
--overwrite_output_dir \
--trust_remote_code True \
--load_best_model_at_end \
--metric_for_best_model "eval_f1" \
--greater_is_better True
```
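The combination of `--load_best_model_at_end`, `--metric_for_best_model "eval_f1"`, and `--greater_is_better True` makes the Trainer restore the per-epoch checkpoint with the highest development F1 at the end of training. The selection logic amounts to the following (a simplified sketch; the scores are made up):

```python
# Hypothetical eval_f1 values logged after each epoch.
epoch_scores = {1: 0.9412, 2: 0.9503, 3: 0.9548, 4: 0.9531}

# greater_is_better=True: keep the epoch whose eval_f1 is maximal.
best_epoch = max(epoch_scores, key=epoch_scores.get)
print(best_epoch)  # → 3
```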
Note: NeoBERT requires xFormers. Flash Attention is currently not used for these experiments.
## 📊 Performance
A very basic hyperparameter search was performed over five different learning rates. Performance is reported on the development set of CoNLL-2003 only; each configuration was run with five different seeds, and the table below shows the micro F1-score of each run together with the average:
| Configuration | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg. |
|:--------------------- | -----:| -----:| -----:| -----:| -----:| -----:|
| `bs=16,e=10,lr=1e-05` | 95.71 | 95.42 | 95.53 | 95.56 | 95.43 | 95.53 |
| `bs=16,e=10,lr=2e-05` | 95.25 | 95.33 | 95.28 | 95.35 | 95.26 | 95.29 |
| `bs=16,e=10,lr=3e-05` | 94.98 | 95.22 | 94.86 | 94.72 | 94.93 | 94.94 |
| `bs=16,e=10,lr=4e-05` | 94.61 | 94.39 | 94.57 | 94.65 | 94.87 | 94.61 |
| `bs=16,e=10,lr=5e-05` | 93.82 | 93.94 | 94.36 | 91.14 | 94.38 | 94.15 |
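The averages in the table can be reproduced from the per-seed scores, e.g. for the best configuration (`lr=1e-05`):

```python
from statistics import mean

# Micro F1 scores of the five seeds for bs=16, e=10, lr=1e-05 (first table row).
runs = [95.71, 95.42, 95.53, 95.56, 95.43]

avg = round(mean(runs), 2)
print(avg)  # → 95.53
```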