rdiehlmartinez committed
Commit fb9d472 • 1 Parent(s): bcecc02
Update README.md

README.md CHANGED
@@ -1,10 +1,19 @@
 ---
 title: README
-emoji:
+emoji: π
 colorFrom: indigo
 colorTo: blue
 sdk: static
 pinned: false
+license: mit
 ---
 
-
+This repository is Cambridge University NLP's submission to the 2023 BabyLM Challenge (CoNLL workshop).
+
+Our approach experiments with the following three variants of cognitively motivated curriculum learning and analyzes their effect on the model's performance on linguistic evaluation tasks:
+
+1. **vocabulary curriculum**: we analyze methods for constraining the vocabulary in the early stages of training to simulate cognitively more plausible learning curves.
+2. **data curriculum**: we vary the order of the training instances based on (i) infant-inspired expectations and (ii) the learning behaviour of the model.
+3. **objective curriculum**: we explore different ways of combining the conventional masked language modelling task with a coarser-grained word-class prediction task to reinforce linguistic generalization capabilities.
+
+Overall, we find that various curriculum learning settings outperform our baseline on linguistic tasks. We moreover find that careful selection of the model architecture and training hyper-parameters yields substantial improvements over the default baselines provided by the BabyLM Challenge.
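
To make the first variant above more concrete, here is a minimal Python sketch of one way a vocabulary curriculum could work: token ids outside a gradually growing vocabulary budget are mapped to an unknown token. The function names, the linear pacing schedule, and the frequency-based ordering are illustrative assumptions, not code from this repository.

```python
# Sketch of a vocabulary curriculum: early in training, tokens outside a
# growing "allowed" vocabulary are replaced with <unk>. Names and the linear
# schedule are hypothetical.

def vocab_budget(step, total_steps, start_size=2_000, full_size=32_000):
    """Linearly grow the usable vocabulary from start_size to full_size."""
    frac = min(step / total_steps, 1.0)
    return int(start_size + frac * (full_size - start_size))

def apply_vocab_curriculum(token_ids, ids_by_frequency, step, total_steps, unk_id=0):
    """Map token ids outside the current budget to unk_id.

    ids_by_frequency: token ids sorted from most to least frequent, so the
    most common words become available first.
    """
    allowed = set(ids_by_frequency[:vocab_budget(step, total_steps)])
    return [t if t in allowed else unk_id for t in token_ids]
```

Replacing out-of-curriculum tokens rather than dropping whole sentences keeps the data distribution intact while limiting the lexical signal available early on.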
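The second variant can be illustrated in the same spirit: rank the training instances by a difficulty proxy and expose only the easiest fraction at first, widening the pool as training progresses. The length-based proxy and the pacing function below are illustrative assumptions, not the repository's actual criteria.

```python
# Sketch of a data curriculum: examples are sorted by a difficulty proxy and
# the visible portion of the dataset grows with the training step.

def difficulty(example):
    """Treat shorter utterances as easier (one infant-inspired proxy)."""
    return len(example["text"].split())

def visible_subset(dataset, step, total_steps, min_frac=0.2):
    """Return the slice of the difficulty-ranked dataset available at this step."""
    ranked = sorted(dataset, key=difficulty)
    frac = min_frac + (1.0 - min_frac) * min(step / total_steps, 1.0)
    return ranked[:max(1, int(frac * len(ranked)))]
```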
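For the third variant, one possible layout is a model with two output heads: a coarse word-class (POS) prediction head trained first, and the standard masked-language-modelling head taking over after a switch point. The two-head design and the hard switch are assumptions for illustration; the actual objective schedule may mix the losses differently.

```python
# Sketch of an objective curriculum: word-class prediction early, MLM later.
import torch.nn as nn

class ObjectiveCurriculumHeads(nn.Module):
    def __init__(self, hidden_size, vocab_size, num_word_classes, switch_step):
        super().__init__()
        self.mlm_head = nn.Linear(hidden_size, vocab_size)
        self.pos_head = nn.Linear(hidden_size, num_word_classes)
        self.switch_step = switch_step
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=-100)  # -100 marks ignored positions

    def forward(self, hidden_states, mlm_labels, pos_labels, step):
        # Pick the active objective based on the current training step.
        use_pos = step < self.switch_step
        head = self.pos_head if use_pos else self.mlm_head
        labels = pos_labels if use_pos else mlm_labels
        logits = head(hidden_states)
        return self.loss_fn(logits.view(-1, logits.size(-1)), labels.view(-1))
```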