---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: fill-mask
tags:
- climate
- biology
---
# Model Card for nasa-smd-ibm-v0.1
nasa-smd-ibm-v0.1 is a RoBERTa-based, encoder-only transformer model that has been domain-adapted for NASA Science Mission Directorate (SMD) applications. It is fine-tuned on scientific journals and articles relevant to NASA SMD, with the aim of improving natural language technologies such as information retrieval and intelligent search.
## Model Details
- **Base Model**: RoBERTa
- **Tokenizer**: Custom
- **Parameters**: 125M
- **Pretraining Strategy**: Masked Language Modeling (MLM)
## Training Data
- Wikipedia English (Feb 1, 2020)
- AGU Publications
- AMS Publications
- Scientific papers from the Astrophysics Data System (ADS)
- PubMed abstracts
- PMC (commercial license subset)

## Training Procedure
- **Framework**: fairseq 0.12.1 with PyTorch 1.9.1
- **transformers Version**: 4.2.0
- **Strategy**: Masked Language Modeling (MLM)
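
As a rough illustration of the MLM objective named above, the sketch below corrupts a token sequence and records which positions the model would be asked to recover. The card does not state the masking rate or the replacement split, so the 15% rate and the 80/10/10 split used here are assumptions borrowed from common BERT-style pretraining recipes, and `mask_tokens`, `vocab`, and the `<mask>` string are hypothetical names for this example only. It is not the fairseq code used to train this model.

```python
import random


def mask_tokens(tokens, vocab, mask_token="<mask>", mask_prob=0.15, seed=None):
    """Corrupt `tokens` for masked language modeling.

    Returns (corrupted, labels). labels[i] holds the original token at
    positions selected for prediction and None everywhere else.
    The rate and the 80/10/10 split are assumed defaults, not values
    documented for nasa-smd-ibm-v0.1.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() >= mask_prob:
            # Position not selected: keep the token, nothing to predict.
            corrupted.append(tok)
            labels.append(None)
            continue
        labels.append(tok)
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(mask_token)         # replace with the mask token
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # replace with a random token
        else:
            corrupted.append(tok)                # leave unchanged
    return corrupted, labels


if __name__ == "__main__":
    sentence = "solar wind interacts with the magnetosphere".split()
    print(mask_tokens(sentence, vocab=sentence, seed=0))
```

During training, the loss would be computed only at positions where the label is not `None`.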
## Evaluation
- BLURB Benchmark
- Pruned SQuAD2.0 (SQ2) Benchmark
- NASA SMD Experts Benchmark (WIP)


## Uses
- Named Entity Recognition (NER)
- Information Retrieval
- Sentence Transformers
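
Because the card declares `pipeline_tag: fill-mask`, a quick way to try the model is the `transformers` fill-mask pipeline. The snippet below is a minimal sketch, assuming a recent `transformers` release and network access to the repository named in the citation URL. The helper names, the `[BLANK]` placeholder, and the example sentence are hypothetical and chosen only for illustration.

```python
MODEL_ID = "nasa-impact/nasa-smd-ibm-v0.1"  # repository name from the citation URL


def build_masked_sentence(template, mask_token, placeholder="[BLANK]"):
    """Replace the single placeholder in `template` with the mask token."""
    if template.count(placeholder) != 1:
        raise ValueError(f"template must contain exactly one {placeholder!r}")
    return template.replace(placeholder, mask_token)


def top_predictions(template, top_k=5):
    """Return (token, score) pairs for the blank in `template`.

    transformers is imported lazily so the helper above works without it.
    """
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    # The card lists a custom tokenizer, so read the mask token from it
    # rather than hard-coding a value.
    text = build_masked_sentence(template, fill_mask.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill_mask(text, top_k=top_k)]


if __name__ == "__main__":
    # Illustrative sentence, not taken from the model's evaluation data.
    for token, score in top_predictions("The Hubble Space [BLANK] observes distant galaxies."):
        print(f"{token!r}: {score:.3f}")
```

For the downstream uses listed above (NER, retrieval, sentence embeddings), the encoder would typically be fine-tuned with a task-specific head; this snippet only exercises the pretrained MLM head.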
## Citation
If you find this work useful, please cite it using the following BibTeX entry:
```bibtex
@misc {nasa-impact_2023,
author = {Masayasu Maraoka and Bishwaranjan Bhattacharjee and Muthukumaran Ramasubramanian and Iksha Gurung and Rahul Ramachandran and Manil Maskey and Kaylin Bugbee and Rong Zhang and Yousuf El Kurdis and Mike Little and Elizabeth Fancher and Lauren Sanders and Sylvain Costes and Sergi Blanco-Cuaresma and Kelly Lockhart and Thomas Allen and Felix Grazes and Megan Ansdell and Alberto Accomazzi and Sanaz Vahidinia and Ryan McGranaghan and Armin Mehrabian and Tsendgar Lee},
title = { nasa-smd-ibm-v0.1 (Revision f01d42f) },
year = 2023,
url = { https://huggingface.co/nasa-impact/nasa-smd-ibm-v0.1 },
doi = { 10.57967/hf/1429 },
publisher = { Hugging Face }
}
```
## Attribution
IBM Research
- Masayasu Maraoka
- Bishwaranjan Bhattacharjee
- Rong Zhang
- Yousuf El Kurdis

NASA SMD
- Muthukumaran Ramasubramanian
- Iksha Gurung
- Rahul Ramachandran
- Manil Maskey
- Kaylin Bugbee
- Mike Little
- Elizabeth Fancher
- Lauren Sanders
- Sylvain Costes
- Sergi Blanco-Cuaresma
- Kelly Lockhart
- Thomas Allen
- Felix Grazes
- Megan Ansdell
- Alberto Accomazzi
- Sanaz Vahidinia
- Ryan McGranaghan
- Armin Mehrabian
- Tsendgar Lee