---
license: cc-by-nc-4.0
---

# NPM

NPM is a nonparametric masked language model pretrained on English text data.
It was introduced in ["Nonparametric Masked Language Modeling"][paper]
and first released in [facebookresearch/NPM][repo].

### Model description

NPM consists of an encoder and a reference corpus, and models a nonparametric distribution over that corpus.
The key idea is to map all the phrases in the corpus into a dense vector space using the encoder and,
when given a query with a [MASK] at inference time, use the same encoder to locate the nearest
phrase in the corpus and fill in the [MASK].
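
To illustrate the idea, here is a minimal, self-contained sketch of nonparametric mask filling over a toy datastore. For brevity it uses mean-pooled encoder states as phrase vectors (NPM itself represents a phrase by its start and end token vectors), and the checkpoint id `facebook/npm` is an assumption; the actual retrieval pipeline is in the [original repo][repo].

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint id for this repo's encoder weights.
tokenizer = AutoTokenizer.from_pretrained("facebook/npm")
model = AutoModel.from_pretrained("facebook/npm").eval()

@torch.no_grad()
def phrase_vector(text: str) -> torch.Tensor:
    # Mean-pooled hidden states as a toy phrase encoding; NPM itself
    # encodes a phrase via its start and end token representations.
    hidden = model(**tokenizer(text, return_tensors="pt")).last_hidden_state
    return hidden.mean(dim=1).squeeze(0)

# Toy datastore; in NPM this covers every phrase in the reference corpus.
phrases = ["Seattle", "Thessaloniki", "the Space Needle"]
datastore = torch.stack([phrase_vector(p) for p in phrases])

# Encode the masked query and take the hidden state at the mask position.
query = f"The Space Needle is located in {tokenizer.mask_token}."
inputs = tokenizer(query, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state.squeeze(0)
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()

# Fill in the mask with the nearest phrase under cosine similarity.
scores = torch.nn.functional.cosine_similarity(datastore, hidden[mask_pos].unsqueeze(0))
print(phrases[scores.argmax().item()])  # expected: "Seattle"
```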

### Intended uses & limitations

While this repo includes the encoder weights, NPM has to be used together with a datastore.
For more details on how to use NPM, please refer to the [original repo][repo].

Note that this model is primarily for filling in a [MASK] token. Future work can investigate how to use NPM for text generation.
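
As a hypothetical sketch of what "using NPM with a datastore" involves at inference, the snippet below indexes phrase vectors with FAISS and retrieves neighbors for a masked query. The checkpoint id `facebook/npm` and the mean-pooled encoding are illustrative assumptions, not NPM's actual indexing code; see the [original repo][repo] for that.

```python
import faiss  # pip install faiss-cpu
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/npm")  # assumed checkpoint id
model = AutoModel.from_pretrained("facebook/npm").eval()

@torch.no_grad()
def encode(texts):
    # Mean-pooled, L2-normalized vectors as stand-ins for NPM's phrase encodings.
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1).float()
    vecs = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return torch.nn.functional.normalize(vecs, dim=-1).numpy()

# Build the datastore: one vector per phrase in the reference corpus.
phrases = ["Seattle", "Thessaloniki", "the Space Needle"]
index = faiss.IndexFlatIP(model.config.hidden_size)  # inner product = cosine on unit vectors
index.add(encode(phrases))

# Retrieve the nearest phrases for a masked query.
query = f"The Space Needle is located in {tokenizer.mask_token}."
scores, ids = index.search(encode([query]), 2)
print([phrases[i] for i in ids[0]])
```

At corpus scale, the exact flat index above would be replaced with an approximate-nearest-neighbor index.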

### Training procedure

NPM was trained on English Wikipedia (August 2019) and an English portion of CC-News (Mackenzie et al., 2020; February 2019), totaling 13B tokens.
It uses the model architecture and initial weights of RoBERTa large (Liu et al., 2019), consisting of 354M parameters.
Training was run for 100,000 steps on thirty-two 32GB GPUs.

More details about training can be found in the [paper][paper].
Code for training NPM can be found in the [original repo][repo].

### Evaluation results

NPM is evaluated on nine closed-set tasks (tasks with a small set of answer options given)
and seven open-set tasks (tasks whose answers can be of arbitrary length).
NPM consistently outperforms significantly larger parametric models such as GPT-3, OPT, and T5.
Detailed results can be found in the [paper][paper].

### BibTeX entry and citation info

```
@article{min2022nonparametric,
    title={Nonparametric Masked Language Modeling},
    author={Min, Sewon and Shi, Weijia and Lewis, Mike and Chen, Xilun and Yih, Wen-tau and Hajishirzi, Hannaneh and Zettlemoyer, Luke},
    journal={arXiv preprint arXiv:2212.01349},
    year={2022}
}
```

[paper]: https://arxiv.org/abs/2212.01349
[repo]: https://github.com/facebookresearch/NPM