---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: fill-mask
tags:
- climate
- biology
---
# Model Card for KAILAS
This domain-adapted, encoder-only transformer model, based on [RoBERTa](https://huggingface.co/roberta-base), is fine-tuned on selected scientific journals and articles relevant to the NASA Science Mission Directorate (SMD). Its intended purpose is to aid NLP efforts within NASA, e.g., information retrieval and intelligent search and discovery.
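A minimal usage sketch with the `transformers` fill-mask pipeline is shown below; the Hub repository ID used here is a placeholder and should be replaced with the actual ID of this model.

```python
from transformers import pipeline

# Placeholder Hub ID -- replace with the actual repository ID for this model.
fill_mask = pipeline("fill-mask", model="nasa-impact/KAILAS")

# The model follows RoBERTa-style masked language modeling, so the mask
# placeholder is <mask>.
for prediction in fill_mask("The Science Mission Directorate funds <mask> science research."):
    print(prediction["token_str"], round(prediction["score"], 4))
```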
## Model Details
- RoBERTa as base model
- Custom tokenizer
- 125M parameters
- Masked Language Modeling (MLM) pretraining strategy
### Model Description
- **Developed by:** NASA IMPACT and IBM Research
- **Model type:** Encoder-only transformer (RoBERTa architecture)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** [roberta-base](https://huggingface.co/roberta-base)
## Uses
- Named Entity Recognition (NER), information retrieval, sentence-transformers (see the embedding sketch below).
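As a sketch of how the encoder could serve retrieval or sentence-embedding use cases, the snippet below mean-pools the last hidden states into one vector per sentence. The Hub ID is a placeholder, and mean pooling is just one common pooling choice, not one prescribed by this card.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder Hub ID -- replace with the actual repository ID for this model.
model_name = "nasa-impact/KAILAS"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

sentences = [
    "Sea surface temperature anomalies in the Pacific Ocean.",
    "Exoplanet transit photometry from the TESS mission.",
]

# Encode both sentences and mean-pool the token embeddings, ignoring padding.
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

# Cosine similarity between the two sentence embeddings.
print(torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0).item())
```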
## Training Details
### Training Data
The model was trained on the following datasets:
1. Wikipedia English dump of February 1, 2020
2. NASA's own data
3. NASA papers
4. NASA Earth Science papers
5. NASA Astrophysics Data System
6. PubMed abstracts
7. PubMed Central (PMC): the commercially licensed subset

The sizes of these datasets are shown in the following chart.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/CTNkn0WHS268hvidFmoqj.png)
### Training Procedure
The model was trained with fairseq 0.12.1, PyTorch 1.9.1, and transformers 4.2.0. Masked Language Modeling (MLM) was the pretraining strategy used.
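For illustration only, the snippet below shows what RoBERTa-style MLM masking looks like when reproduced with the `transformers` data collator; the actual pretraining of this model was run in fairseq, the Hub ID is a placeholder, and the 15% masking probability is the standard RoBERTa default rather than a documented setting of this model.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

# Illustrative only: the original pretraining used fairseq, not this collator.
# Placeholder Hub ID; 15% masking is the standard RoBERTa default.
tokenizer = AutoTokenizer.from_pretrained("nasa-impact/KAILAS")
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

encoded = tokenizer("Aerosol optical depth retrieved from MODIS observations.")
batch = collator([encoded])

# 'input_ids' now contains randomly masked tokens; 'labels' holds the original
# token IDs at masked positions and -100 everywhere else.
print(tokenizer.decode(batch["input_ids"][0]))
```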
## Evaluation
### BLURB Benchmark
![image/png](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/K0IpQnTQmrfQJ1JXxn1B6.png)
### Pruned SQuAD2.0 (SQ2) Benchmark
![image/png](https://cdn-uploads.huggingface.co/production/uploads/61099e5d86580d4580767226/R4oMJquUz4puah3lvd5Ve.png)
### NASA SMD Experts Benchmark
Work in progress.
## Citation
Please use the DOI provided by Hugging Face to cite this model.
## Model Card Authors
Bishwaranjan Bhattacharjee, IBM Research
Muthukumaran Ramasubramanian, NASA-IMPACT ([email protected])
## Model Card Contact
Muthukumaran Ramasubramanian ([email protected])