stefan-it committed · Commit 5324d02 · verified · 1 Parent(s): 31a1d2f

Update README.md

# ✨ NeoBERT for NER

This repository hosts a NeoBERT model that was fine-tuned on the CoNLL-2003 NER dataset.

Please note the following caveats:

* Work in progress: the hyperparameters and even the token classification implementation (`NeoBERTForTokenClassification`) may still change!
* Don't expect BERT (Base)-like performance at the moment.

## 📝 Implementation Details

A custom `NeoBERTForTokenClassification` class was implemented so that fine-tuning with Transformers is possible.

All experiments were conducted with Transformers `4.50.0.dev0`, using the [token classification example](https://github.com/huggingface/transformers/tree/main/examples/pytorch) for PyTorch.

Example fine-tuning command:

```bash
python3 run_ner.py \
--model_name_or_path /home/stefan/Repositories/NeoBERT \
--dataset_name conll2003 \
--output_dir ./neobert-conll2003-lr1e-05-e10-bs16-1 \
--seed 1 \
--do_train \
--do_eval \
--per_device_train_batch_size 16 \
--num_train_epochs 10 \
--learning_rate 1e-05 \
--eval_strategy epoch \
--save_strategy epoch \
--overwrite_output_dir \
--trust_remote_code True \
--load_best_model_at_end \
--metric_for_best_model "eval_f1" \
--greater_is_better True
```
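
The command above covers a single learning rate and seed. A minimal dry-run sketch of how the whole search could be enumerated follows. It only prints the commands and does not start any training. The seed range `1..5` and the helper names `output_dir`, `print_cmd` and `print_all` are assumptions made for illustration; the output directory name follows the pattern of the example command.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print one run_ner.py command per (learning rate, seed) pair.
# The seed range 1..5 is an assumption; the README only mentions "different seeds".
MODEL_PATH="${MODEL_PATH:-/home/stefan/Repositories/NeoBERT}"

# Build the output directory name, following the pattern of the example command.
output_dir() {
  local lr="$1" seed="$2"
  echo "./neobert-conll2003-lr${lr}-e10-bs16-${seed}"
}

# Print (do not execute) the fine-tuning command for one configuration.
print_cmd() {
  local lr="$1" seed="$2"
  echo "python3 run_ner.py" \
    "--model_name_or_path ${MODEL_PATH}" \
    "--dataset_name conll2003" \
    "--output_dir $(output_dir "$lr" "$seed")" \
    "--seed ${seed}" \
    "--do_train --do_eval" \
    "--per_device_train_batch_size 16" \
    "--num_train_epochs 10" \
    "--learning_rate ${lr}" \
    "--eval_strategy epoch --save_strategy epoch" \
    "--overwrite_output_dir" \
    "--trust_remote_code True" \
    "--load_best_model_at_end" \
    "--metric_for_best_model eval_f1" \
    "--greater_is_better True"
}

# Enumerate all five learning rates and five seeds (25 commands in total).
print_all() {
  local lr seed
  for lr in 1e-05 2e-05 3e-05 4e-05 5e-05; do
    for seed in 1 2 3 4 5; do
      print_cmd "$lr" "$seed"
    done
  done
}

print_all
```

Piping the output into `bash` would execute the runs one after another; inspecting the printed commands first is the safer choice.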

Note: NeoBERT requires xFormers. Flash Attention is currently not used for the experiments.

## 📊 Performance

A very basic hyperparameter search was performed over five different learning rates, with each configuration run using different seeds. Performance is reported on the development set of CoNLL-2003 only. The following table lists the micro F1-score of each run together with the average over all runs:

| Configuration | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg. |
|:--------------------- | -----:| -----:| -----:| -----:| -----:| -----:|
| `bs=16,e=10,lr=1e-05` | 95.71 | 95.42 | 95.53 | 95.56 | 95.43 | 95.53 |
| `bs=16,e=10,lr=2e-05` | 95.25 | 95.33 | 95.28 | 95.35 | 95.26 | 95.29 |
| `bs=16,e=10,lr=3e-05` | 94.98 | 95.22 | 94.86 | 94.72 | 94.93 | 94.94 |
| `bs=16,e=10,lr=4e-05` | 94.61 | 94.39 | 94.57 | 94.65 | 94.87 | 94.61 |
| `bs=16,e=10,lr=5e-05` | 93.82 | 93.94 | 94.36 | 91.14 | 94.38 | 94.15 |
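
The "Avg." column is the arithmetic mean of the per-run scores. A small sketch of that calculation, using the first row of the table as input; the helper name `mean_f1` is hypothetical and not part of the repository:

```bash
#!/usr/bin/env bash
# Average a list of F1 scores and print the mean with two decimals.
mean_f1() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { if (NR > 0) printf "%.2f\n", sum / NR }'
}

# Scores of the five runs for bs=16,e=10,lr=1e-05.
mean_f1 95.71 95.42 95.53 95.56 95.43   # prints 95.53
```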

Files changed (1)
  1. README.md +11 -3
README.md CHANGED
@@ -1,3 +1,11 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ datasets:
+ - eriktks/conll2003
+ language:
+ - en
+ base_model:
+ - chandar-lab/NeoBERT
+ tags:
+ - ner
+ ---