Lil-Bevo

Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the strict-small track.

Link to GitHub Repo

TLDR:

  • A Unigram tokenizer trained on the 10M BabyLM tokens plus the MAESTRO dataset, with a vocabulary size of 16k (sketched below).

  • DeBERTa-v3-small trained on a mixture of MAESTRO and the 10M BabyLM tokens for 5 epochs (see the pretraining sketch below).

  • The model then continues training for 50 epochs on the 10M BabyLM tokens with a sequence length of 128.

  • Finally, the model is trained for 2 epochs with targeted linguistic masking at a sequence length of 512 (a masking sketch follows below).

This README will be updated with more details soon.
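
As a rough illustration of the tokenizer step, here is a minimal sketch using the Hugging Face `tokenizers` library. The file names and the special-token set are illustrative assumptions, not the actual training configuration.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Train a Unigram tokenizer from scratch on the combined corpus.
tokenizer = Tokenizer(models.Unigram())
tokenizer.pre_tokenizer = pre_tokenizers.Metaspace()

trainer = trainers.UnigramTrainer(
    vocab_size=16_000,  # 16k vocabulary, as stated above
    special_tokens=["[CLS]", "[SEP]", "[UNK]", "[PAD]", "[MASK]"],  # assumed
    unk_token="[UNK]",
)

# babylm_10m.txt and maestro.txt are hypothetical file names for the two corpora.
tokenizer.train(["babylm_10m.txt", "maestro.txt"], trainer)
tokenizer.save("lil-bevo-tokenizer.json")
```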
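
The pretraining stages could look roughly like the following with `transformers`. Only the epoch counts and sequence lengths come from the TLDR above; the masking probability, file names, and all other hyperparameters are assumptions.

```python
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, DebertaV2Config,
                          DebertaV2ForMaskedLM, PreTrainedTokenizerFast,
                          Trainer, TrainingArguments)

# Load the tokenizer trained above; the special tokens are assumptions.
tokenizer = PreTrainedTokenizerFast(
    tokenizer_file="lil-bevo-tokenizer.json",
    cls_token="[CLS]", sep_token="[SEP]", unk_token="[UNK]",
    pad_token="[PAD]", mask_token="[MASK]",
)

# Hypothetical file names; stage 1 mixes MAESTRO with the BabyLM text.
raw = load_dataset("text", data_files=["babylm_10m.txt", "maestro.txt"])["train"]
dataset = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"],
)

# DeBERTa-v3 reuses the DebertaV2 classes in transformers; trained from scratch.
config = DebertaV2Config.from_pretrained("microsoft/deberta-v3-small",
                                         vocab_size=tokenizer.vocab_size)
model = DebertaV2ForMaskedLM(config)

collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)  # assumed rate
args = TrainingArguments(output_dir="lil-bevo-stage1", num_train_epochs=5)
Trainer(model=model, args=args, data_collator=collator,
        train_dataset=dataset).train()
```

The 50-epoch continuation on the BabyLM-only text and the final 512-length stage would rerun the same loop with the data files, `max_length`, and `num_train_epochs` changed accordingly.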
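
The targeted-linguistic-masking stage biases masking toward a chosen set of tokens. The selection criterion and probabilities below are placeholder assumptions; only the idea of up-weighting a target set comes from the description above, and the actual scheme may differ.

```python
import torch

def targeted_mask(input_ids, target_ids, tokenizer, p_target=0.5, p_other=0.15):
    """Mask tokens in `target_ids` (a 1-D LongTensor of token ids) with higher
    probability than the rest. `target_ids`, `p_target`, and `p_other` are
    illustrative assumptions."""
    labels = input_ids.clone()

    # Higher masking probability for the targeted token ids.
    prob = torch.full(input_ids.shape, p_other)
    prob[torch.isin(input_ids, target_ids)] = p_target

    # Never mask special tokens such as [CLS]/[SEP]/[PAD].
    special = torch.tensor(
        [tokenizer.get_special_tokens_mask(ids, already_has_special_tokens=True)
         for ids in input_ids.tolist()],
        dtype=torch.bool,
    )
    prob[special] = 0.0

    mask = torch.bernoulli(prob).bool()
    labels[~mask] = -100  # compute loss only on masked positions
    masked = input_ids.clone()
    # Simplified: always substitute [MASK] rather than the usual 80/10/10 scheme.
    masked[mask] = tokenizer.mask_token_id
    return masked, labels
```

In practice this would be wrapped in a data collator so the `Trainer` setup above can be reused for the 2-epoch, 512-length stage.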
