Tollef J

tollefj

AI & ML interests

Coreference resolution, span prediction, summarization, topic modeling

Recent Activity

liked a model 4 days ago
google/gemma-3-27b-it
liked a model 4 days ago
google/gemma-3-12b-it
liked a model 4 days ago
google/gemma-3-4b-it

Organizations

Hugging Face Discord Community

tollefj's activity

Why are there so few languages involved in the training of these models? You argue that this data mix was selected "to create a corpus of European and most widely spoken languages, representing a broad range of alphabets and cultures."
But what is the relevance of other alphabets when, for example, you do not include any Nordic languages with large and high-quality datasets?

Prefixing it with "Euro" seems odd in this context. You have selected a tiny fraction of languages, so name it accordingly :-)
It would also make sense to refer to EuroEval: https://euroeval.com/leaderboards/

New activity in sentence-transformers/all-MiniLM-L6-v2 4 months ago

ignore this

#90 opened 4 months ago by
tollefj