Catherine Arnett

catherinearnett

AI & ML interests

multilingual NLP, tokenization

Recent Activity

liked a dataset 4 days ago
jumelet/multiblimp
updated a model about 1 month ago
catherinearnett/B-GPT_pl_en_sequential
updated a model about 1 month ago
catherinearnett/B-GPT_en_pl_sequential

Organizations

Blog-explorers, Language and Cognition Lab (UCSD), PleIAs

Articles 4

Article
84

They Said It Couldn’t Be Done

Article
101

Releasing the largest multilingual open pretraining dataset

Datasets

None public yet