rdiehlmartinez committed
Commit fb9d472 • 1 Parent(s): bcecc02
Update README.md

README.md CHANGED
@@ -1,10 +1,19 @@
 ---
 title: README
-emoji:
+emoji: π
 colorFrom: indigo
 colorTo: blue
 sdk: static
 pinned: false
+license: mit
 ---
 
-
+This repository is Cambridge University NLP's submission to the 2023 BabyLM Challenge (CoNLL workshop).
+
+Our approach experiments with the following three variants of cognitively motivated curriculum learning and analyzes their effect on the model's performance on linguistic evaluation tasks:
+
+1. **vocabulary curriculum**: we analyze methods for constraining the vocabulary in the early stages of training to simulate cognitively more plausible learning curves.
+2. **data curriculum**: we vary the order of the training instances based on (i) infant-inspired expectations and (ii) the learning behaviour of the model.
+3. **objective curriculum**: we explore different ways of combining the conventional masked language modelling task with a coarser-grained word-class prediction task to reinforce linguistic generalization capabilities.
+
+Overall, we find that various curriculum learning settings outperform our baseline on linguistic tasks. We moreover find that careful selection of the model architecture and training hyper-parameters yields substantial improvements over the default baselines provided by the BabyLM Challenge.
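
To make the first variant above more concrete, here is a minimal Python sketch of one way a vocabulary curriculum could work: token ids outside a gradually growing vocabulary budget are mapped to an unknown token. The function names, the linear pacing schedule, and the frequency-based ordering are illustrative assumptions, not code from this repository.

```python
# Sketch of a vocabulary curriculum: early in training, tokens outside a
# growing "allowed" vocabulary are replaced with <unk>. Names and the linear
# schedule are hypothetical.

def vocab_budget(step, total_steps, start_size=2_000, full_size=32_000):
    """Linearly grow the usable vocabulary from start_size to full_size."""
    frac = min(step / total_steps, 1.0)
    return int(start_size + frac * (full_size - start_size))

def apply_vocab_curriculum(token_ids, ids_by_frequency, step, total_steps, unk_id=0):
    """Map token ids outside the current budget to unk_id.

    ids_by_frequency: token ids sorted from most to least frequent, so the
    most common words become available first.
    """
    allowed = set(ids_by_frequency[:vocab_budget(step, total_steps)])
    return [t if t in allowed else unk_id for t in token_ids]
```

Replacing out-of-curriculum tokens rather than dropping whole sentences keeps the data distribution intact while limiting the lexical signal available early on.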
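The second variant can be illustrated in the same spirit: rank the training instances by a difficulty proxy and expose only the easiest fraction at first, widening the pool as training progresses. The length-based proxy and the pacing function below are illustrative assumptions, not the repository's actual criteria.

```python
# Sketch of a data curriculum: examples are sorted by a difficulty proxy and
# the visible portion of the dataset grows with the training step.

def difficulty(example):
    """Treat shorter utterances as easier (one infant-inspired proxy)."""
    return len(example["text"].split())

def visible_subset(dataset, step, total_steps, min_frac=0.2):
    """Return the slice of the difficulty-ranked dataset available at this step."""
    ranked = sorted(dataset, key=difficulty)
    frac = min_frac + (1.0 - min_frac) * min(step / total_steps, 1.0)
    return ranked[:max(1, int(frac * len(ranked)))]
```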
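For the third variant, one possible layout is a model with two output heads: a coarse word-class (POS) prediction head trained first, and the standard masked-language-modelling head taking over after a switch point. The two-head design and the hard switch are assumptions for illustration; the actual objective schedule may mix the losses differently.

```python
# Sketch of an objective curriculum: word-class prediction early, MLM later.
import torch.nn as nn

class ObjectiveCurriculumHeads(nn.Module):
    def __init__(self, hidden_size, vocab_size, num_word_classes, switch_step):
        super().__init__()
        self.mlm_head = nn.Linear(hidden_size, vocab_size)
        self.pos_head = nn.Linear(hidden_size, num_word_classes)
        self.switch_step = switch_step
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=-100)  # -100 marks ignored positions

    def forward(self, hidden_states, mlm_labels, pos_labels, step):
        # Pick the active objective based on the current training step.
        use_pos = step < self.switch_step
        head = self.pos_head if use_pos else self.mlm_head
        labels = pos_labels if use_pos else mlm_labels
        logits = head(hidden_states)
        return self.loss_fn(logits.view(-1, logits.size(-1)), labels.view(-1))
```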