arxiv:2310.19420
Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings
Published on Oct 30, 2023
Abstract
This paper explores the use of latent bootstrapping, an alternative self-supervision technique, for pretraining language models. Unlike the typical practice of using self-supervision on discrete subwords, latent bootstrapping leverages contextualized embeddings for a richer supervision signal. We conduct experiments to assess how effective this approach is for acquiring linguistic knowledge from limited resources. Specifically, our experiments are based on the BabyLM shared task, which includes pretraining on two small curated corpora and an evaluation on four linguistic benchmarks.
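The abstract does not spell out the training loop, but the title's "mean BERTs" suggests an exponential-moving-average (EMA) teacher whose contextualized embeddings serve as regression targets for a masked student. The sketch below is a minimal, hypothetical illustration of that style of latent bootstrapping, not the paper's implementation: the encoder, mask token id, loss choice, and EMA decay are all illustrative assumptions.

```python
# Hypothetical sketch of latent bootstrapping with an EMA ("mean") teacher.
# Assumptions not confirmed by the abstract: the teacher is an EMA copy of the
# student, and the student regresses the teacher's contextualized embeddings at
# masked positions instead of predicting discrete subword ids.
import copy
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Stand-in for a small BERT-style encoder (illustrative only)."""

    def __init__(self, vocab_size=1000, dim=64, layers=2, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        block = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)

    def forward(self, ids):
        return self.encoder(self.embed(ids))  # (batch, seq, dim)


def ema_update(teacher, student, decay=0.999):
    """Keep the teacher as an exponential moving average of the student."""
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1.0 - decay)


student = TinyEncoder()
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
mask_id = 0  # hypothetical [MASK] token id

ids = torch.randint(1, 1000, (8, 32))        # toy batch of token ids
mask = torch.rand(ids.shape) < 0.15          # positions to corrupt
corrupted = ids.masked_fill(mask, mask_id)

with torch.no_grad():
    targets = teacher(ids)                   # latent targets from the clean input

preds = student(corrupted)
# Regress the teacher's embeddings at masked positions: a continuous target
# rather than a discrete subword id, which is the "richer supervision signal".
loss = nn.functional.smooth_l1_loss(preds[mask], targets[mask])
loss.backward()
optimizer.step()
optimizer.zero_grad()
ema_update(teacher, student)
```

The key design point is that the supervision signal is a dense vector per masked token rather than a one-hot subword label; whether this helps or hurts when pretraining data is scarce is exactly what the paper's BabyLM experiments evaluate.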