INDUS - NER-DEAL

INDUS-NER-DEAL (nasa-smd-ibm-v0.1_NER_DEAL) is a RoBERTa-based, encoder-only transformer model, domain-adapted for NASA Science Mission Directorate (SMD) applications. It is fine-tuned on scientific journals and articles relevant to NASA SMD, with the aim of improving natural language technologies such as information retrieval and intelligent search.
This specific fork was fine-tuned on proprietary data from the SciX Digital Library (https://scixplorer.org/, formerly NASA-ADS) to tag text with DEAL (Detecting Entities in the Astrophysics Literature) labels (https://ui.adsabs.harvard.edu/WIESP/2022/LabelDefinitions).

Usage

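from transformers import AutoModelForTokenClassification, AutoTokenizer

# Load the fine-tuned token-classification model from the Hugging Face Hub.
INDUS_NER_DEAL = AutoModelForTokenClassification.from_pretrained(
    pretrained_model_name_or_path='adsabs/nasa-smd-ibm-v0.1_NER_DEAL',
    revision=None,  # None selects the default branch; set a commit hash or tag to pin a version
)

# Load the matching tokenizer; do_lower_case=False asks the tokenizer to keep the input casing.
INDUS_tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path='adsabs/nasa-smd-ibm-v0.1_NER_DEAL',
    do_lower_case=False,
)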
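
Loading the model and tokenizer is only the first step. To label text, each word has to be mapped to a predicted tag, and the tags then grouped into entity spans. The sketch below is one possible way to do that, not the authors' own pipeline. It assumes the label set follows the IOB2 scheme (B-/I- prefixes plus O), which should be checked against the model's config.id2label, and that the tokenizer is a fast RoBERTa-style tokenizer that accepts add_prefix_space=True. The helper names predict_word_tags and merge_iob_tags are hypothetical.

```python
from typing import List, Optional, Tuple

MODEL_ID = "adsabs/nasa-smd-ibm-v0.1_NER_DEAL"


def merge_iob_tags(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group per-word IOB2 tags into (entity_text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == current_label
        # Close the open entity unless this word continues it.
        if current_label is not None and not continues:
            entities.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
        if continues:
            current_tokens.append(token)
        elif prefix in ("B", "I"):
            # A stray I- tag with no open entity is treated as a start.
            current_tokens, current_label = [token], label
    if current_label is not None:
        entities.append((" ".join(current_tokens), current_label))
    return entities


def predict_word_tags(words: List[str]) -> List[str]:
    """Tag pre-split words with the model (downloads weights on first call)."""
    # Imported here so merge_iob_tags stays usable without torch/transformers.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    # Assumption: a RoBERTa-style fast tokenizer, which needs
    # add_prefix_space=True when the input is already split into words.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, add_prefix_space=True)
    model = AutoModelForTokenClassification.from_pretrained(MODEL_ID)
    model.eval()

    encoding = tokenizer(
        words, is_split_into_words=True, return_tensors="pt", truncation=True
    )
    with torch.no_grad():
        logits = model(**encoding).logits[0]
    predicted_ids = logits.argmax(dim=-1).tolist()

    # Use the prediction of each word's first sub-token as the word's tag;
    # words lost to truncation keep the default "O".
    tags = ["O"] * len(words)
    seen = set()
    for position, word_index in enumerate(encoding.word_ids(batch_index=0)):
        if word_index is None or word_index in seen:
            continue
        seen.add(word_index)
        tags[word_index] = model.config.id2label[predicted_ids[position]]
    return tags
```

A typical call would split a sentence into words, pass them to predict_word_tags, and hand the words and resulting tags to merge_iob_tags to obtain a list of (entity text, label) pairs. Taking the first sub-token's prediction for each word is a common convention for token classification, but it is a design choice here and may differ from how the model was evaluated.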

Model Details

  • Base Model: RoBERTa
  • Tokenizer: Custom
  • Parameters: 125M

Training Data

  • 5K acknowledgements and full-text fragments from astronomy papers provided by NASA-SciX with manually tagged astronomical facilities and other entities of interest (e.g., celestial objects).
  • Approximately 1.6M words