---
language:
- en
tags:
- Machine Learning
- Research Papers
- Scientific Language Model
license: apache-2.0
---
|

## MLRoBERTa (RoBERTa pretrained on ML Papers)

## How to use:

```python
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained('shrutisingh/MLRoBERTa')
model = AutoModel.from_pretrained('shrutisingh/MLRoBERTa')
```
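
The loaded model returns per-token hidden states rather than a single sentence vector; a common way to get a sentence embedding is to mean-pool the last hidden states over non-padding tokens. Below is a minimal sketch of that pooling step using NumPy, with a dummy tensor standing in for `model(**inputs).last_hidden_state` (the `mean_pool` helper and the example tensors are illustrative, not part of the model's API):

```python
import numpy as np

def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings, ignoring padding positions."""
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    counts = mask.sum(axis=1).clip(min=1e-9)  # avoid division by zero
    return summed / counts

# Dummy stand-in for last_hidden_state: (batch=1, seq_len=4, hidden=3)
hidden = np.array([[[1.0, 2.0, 3.0],
                    [3.0, 2.0, 1.0],
                    [5.0, 5.0, 5.0],
                    [9.0, 9.0, 9.0]]])
mask = np.array([[1, 1, 1, 0]])  # last token is padding and is excluded

print(mean_pool(hidden, mask))  # -> [[3. 3. 3.]]
```

With the real model, `hidden` would come from `model(**tok(text, return_tensors='pt')).last_hidden_state` (converted to NumPy), and `mask` from the tokenizer's `attention_mask`.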
|

## Pretraining Details:

This is a RoBERTa model pretrained on scientific documents. The pretraining corpus consists of paper titles and abstracts from NeurIPS (1987-2019), CVPR (2013-2020), ICLR (2016-2020), and the ACL Anthology (up to 2019), along with ICLR paper reviews.
|
## Citation:

```
@inproceedings{singh2021compare,
  title={COMPARE: a taxonomy and dataset of comparison discussions in peer reviews},
  author={Singh, Shruti and Singh, Mayank and Goyal, Pawan},
  booktitle={2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL)},
  pages={238--241},
  year={2021},
  organization={IEEE}
}
```