---
license: mit
language:
  - en
tags:
  - babylm
---

# Lil-Bevo-X

Lil-Bevo-X is UT Austin's submission to the BabyLM challenge, specifically the strict track.

Link to GitHub Repo

TL;DR:

- A unigram tokenizer with a vocabulary size of 32k, trained on 10M BabyLM tokens plus the MAESTRO dataset.
- deberta-base-v3 trained on a mixture of MAESTRO and the 100M BabyLM tokens for 3 epochs.
- Training continues for 100,000 steps at sequence length 128.
- Training continues for 65,000 steps at sequence length 512.
- The model is then trained with targeted linguistic masking for 1 epoch.
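The card does not say which toolkit trained the tokenizer. A minimal sketch of fitting a unigram tokenizer with a 32k target vocabulary, using the Hugging Face `tokenizers` library as an assumed implementation (the toy corpus and special-token list are placeholders, not the team's actual setup):

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Toy corpus standing in for the 10M BabyLM tokens plus MAESTRO text.
corpus = [
    "the child looked at the small dog",
    "music tokens can share a vocabulary with text",
    "a unigram model prunes an initial seed vocabulary",
]

tokenizer = Tokenizer(models.Unigram())
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

# vocab_size=32000 matches the 32k figure above; on a corpus this small
# the trainer stops well short of that target.
trainer = trainers.UnigramTrainer(
    vocab_size=32000,
    special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],
    unk_token="[UNK]",
)
tokenizer.train_from_iterator(corpus, trainer=trainer)

encoding = tokenizer.encode("the small dog")
print(encoding.tokens)
```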

This README will be updated with more details soon.
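The card does not spell out how "targeted linguistic masking" chooses positions; a plausible reading is that tokens flagged by some linguistic criterion are masked at a higher rate than the uniform MLM rate. A pure-Python sketch under that assumption (the `target_ids` set, the probabilities, and the function name are hypothetical):

```python
import random

MASK = "[MASK]"

def targeted_mask(tokens, target_ids, p_target=0.5, p_other=0.15, seed=None):
    """Mask tokens for MLM, preferring positions in `target_ids`
    (e.g. tokens picked out by a linguistic tagger or word list).

    Returns the corrupted sequence and per-position labels: the original
    token where masked, None elsewhere (ignored in the loss).
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        p = p_target if tok in target_ids else p_other
        if rng.random() < p:
            masked.append(MASK)
            labels.append(tok)   # model must reconstruct this token
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels
```

Setting `p_target=1.0, p_other=0.0` masks exactly the targeted tokens, which makes the behavior easy to inspect on a small example.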