Lil-Bevo

Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the strict-small track.

Link to GitHub Repo

TLDR:

  • A unigram tokenizer with a 16k vocabulary is trained on the 10M BabyLM tokens plus the MAESTRO dataset.

  • deberta-small-v3 is trained on a mixture of MAESTRO and the 10M tokens for 5 epochs.

  • The model then continues training for 50 epochs on the 10M tokens with a sequence length of 128.

  • Finally, the model is trained for 2 epochs with targeted linguistic masking at a sequence length of 512.

This README will be updated with more details soon.
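To make the last training stage more concrete, here is a minimal sketch of what targeted masking could look like. This README does not say which tokens are targeted or how often they are masked, so everything here is an assumption for illustration: the function name `targeted_mask`, the `targets` word set, and the probabilities `p_target` and `p_other` are hypothetical and not taken from the Lil-Bevo code. The idea shown is only that tokens in a chosen set are masked more often than other tokens.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol; the real tokenizer's mask token may differ


def targeted_mask(tokens, targets, p_target=0.5, p_other=0.15, seed=None):
    """Mask tokens in `targets` with probability p_target and all others with p_other.

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere. Hypothetical sketch, not the
    authors' implementation.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        # Targeted tokens get the higher masking probability.
        p = p_target if tok.lower() in targets else p_other
        if rng.random() < p:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Usage: with p_target=1.0 and p_other=0.0 only the targeted words are masked.
tokens = "the dogs that bark are not mine".split()
masked, labels = targeted_mask(tokens, {"are", "not"}, p_target=1.0, p_other=0.0)
print(masked)
```

In a real masked-language-modeling setup the same choice would be made over token IDs inside a data collator, and the loss would be computed only at the masked positions.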
