Update README.md
README.md
CHANGED
@@ -6,4 +6,69 @@ language:
 - en
 metrics:
 - f1
----
+---
+
+
+# Model Card for BioNExt
+
+BioNExt is an end-to-end biomedical relation extraction and classification system. It comprises three modules: a Tagger (named entity recognition), a Linker (entity linking), and an Extractor (relation extraction and classification).
+
+This repository contains two models:
+
+1. Tagger: the named entity recognition module, which performs six-class biomedical NER: **Genes, Diseases, Chemicals, Variants (mutations), Species, and Cell Lines**.
+2. Extractor: performs relation extraction and classification. The relation classes are: **Positive Correlation, Negative Correlation, Association, Binding, Drug Interaction, Cotreatment, Comparison, and Conversion**.
+
+For a full description of how to use our end-to-end pipeline, see our [GitHub](https://github.com/ieeta-pt/BioNExt) repository.
+
+## Model Details
+
+### Model Description
+
+- **Developed by:** IEETA
+- **Model type:** BERT Base
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model:** BioLinkBERT-Large
+
+### Model Sources
+
+- **Repository:** [IEETA BioNExt GitHub](https://github.com/ieeta-pt/BioNExt)
+- **Paper:** Towards Discovery: An End-to-End System for Uncovering Novel Biomedical Relations [Awaiting Publication]
+
+**Authors:**
+- Tiago Almeida ([ORCID: 0000-0002-4258-3350](https://orcid.org/0000-0002-4258-3350))
+- Richard A A Jonker ([ORCID: 0000-0002-3806-6940](https://orcid.org/0000-0002-3806-6940))
+- Rui Antunes ([ORCID: 0000-0003-3533-8872](https://orcid.org/0000-0003-3533-8872))
+- João R Almeida ([ORCID: 0000-0003-0729-2264](https://orcid.org/0000-0003-0729-2264))
+- Sérgio Matos ([ORCID: 0000-0003-1941-3983](https://orcid.org/0000-0003-1941-3983))
+
+
+## Uses
+
+Note: we accept no liability for use of the model in any professional or medical setting. The model is intended for academic purposes only.
+
+## How to Get Started with the Model
+
+Please refer to our GitHub repository for more information on our end-to-end inference pipeline: [IEETA BioNExt GitHub](https://github.com/ieeta-pt/BioNExt)
+
+## Training Details
+
+### Training Data
+
+The training data was the BioRED corpus, used within the scope of the BioCreative-VIII challenge.
+
+Ling Luo, Po-Ting Lai, Chih-Hsuan Wei, Cecilia N Arighi, Zhiyong Lu, BioRED: a rich biomedical relation extraction dataset, Briefings in Bioinformatics, Volume 23, Issue 5, September 2022, bbac282, https://doi.org/10.1093/bib/bbac282
+
+
+### Results
+
+Evaluated as an end-to-end system, our results are as follows:
+- **Tagger**: 43.10
+- **Linker**: 32.46
+- **Extractor**: 24.59
+
+## Citation
+
+**BibTeX:**
+
+[Awaiting Publication]
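
The Results section in the diff above reports one score per module, and the card's metadata lists `f1` as the metric, but no formula is given. As a minimal sketch, assuming exact-match F1 over sets of gold and predicted items (the official BioCreative-VIII evaluation may apply different matching rules), the computation could look like this. The function name `set_f1` and the example triples are hypothetical and not taken from the BioNExt code.

```python
def set_f1(gold, predicted):
    """Exact-match F1 between two collections of hashable items."""
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    if true_pos == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical relation triples: (entity_1, entity_2, relation_class).
gold = [
    ("geneA", "diseaseB", "Association"),
    ("chemC", "diseaseB", "Negative Correlation"),
]
pred = [
    ("geneA", "diseaseB", "Association"),
    ("chemC", "geneA", "Binding"),
]

# One of two predictions is correct and one of two gold items is found.
score = set_f1(gold, pred)
```

In an end-to-end setting, a relation can only match if its entities were tagged and linked correctly upstream, which is consistent with the scores dropping from Tagger to Linker to Extractor.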