---
license: cc-by-nc-3.0
---
# Danish medical word embeddings

MeDa-We was trained on a Danish medical corpus of 123M tokens. The word embeddings are 300-dimensional and were trained using [FastText](https://fasttext.cc/).

The embeddings were trained for 10 epochs using a window size of 5 and 10 negative samples.
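
As a rough illustration, the settings above map onto the official `fasttext` Python package as sketched below. This is not our exact training script: the architecture (`skipgram` here, `cbow` being the alternative), the corpus path, and the helper names are assumptions made for the example, and all other fastText options are left at their defaults.

```python
# Hyperparameters stated in this model card.
CARD_HYPERPARAMS = {
    "dim": 300,   # embedding dimensionality
    "epoch": 10,  # number of training epochs
    "ws": 5,      # context window size
    "neg": 10,    # negative samples per positive example
}


def build_training_kwargs(corpus_path, model="skipgram"):
    """Combine the card's hyperparameters with assumed settings.

    `corpus_path` and `model` are assumptions: the card names neither the
    corpus file nor the fastText architecture (skipgram or cbow).
    """
    return {"input": corpus_path, "model": model, **CARD_HYPERPARAMS}


def train(corpus_path, model="skipgram"):
    """Train unsupervised fastText embeddings on a plain-text corpus."""
    # Imported lazily so the helpers above work without fastText installed.
    import fasttext  # pip install fasttext

    return fasttext.train_unsupervised(**build_training_kwargs(corpus_path, model))
```

With a tokenized plain-text file at a hypothetical path such as `corpus.txt`, `train("corpus.txt")` returns a model whose `get_word_vector(word)` method yields a 300-dimensional vector.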

The development of the corpus and word embeddings is described further in our [paper](https://openreview.net/forum?id=cc9USd2ec-). 

We also trained a transformer model on the same corpus; it is available [here](https://huggingface.co/jannikskytt/MeDa-Bert).

### Citing
If you find our model helpful, please consider citing our paper :)
```
@article{
}
```