- Native Multimodality - Process text and images in a unified architecture
- Mixture-of-Experts - The first Llama models to use MoE for greater compute efficiency (a minimal routing sketch follows this list)
- Super Long Context - Up to 10M tokens
- Multilingual Power - Trained on 200 languages, with 10x more multilingual tokens than Llama 3 (including over 100 languages with more than 1 billion tokens each)
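To make the MoE point concrete, here is a minimal, illustrative top-k routing layer in PyTorch. This is not Meta's implementation (Llama 4 also uses a shared expert and heavily optimized routing); the dimensions and `top_k` value are made up purely to show why only a small slice of the total parameters is active for any given token.

```python
# Minimal sketch of top-k mixture-of-experts routing (illustrative only;
# sizes and top_k are arbitrary, not Llama 4's actual configuration).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int, top_k: int = 1):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). The router scores every expert for every token.
        logits = self.router(x)                        # (tokens, n_experts)
        weights, idx = torch.topk(logits, self.top_k)  # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token, so per-token compute
        # tracks the "active" parameter count, not the total parameter count.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask][:, k:k + 1] * expert(x[mask])
        return out

layer = TopKMoE(d_model=64, d_ff=256, n_experts=16, top_k=1)
tokens = torch.randn(8, 64)
print(layer(tokens).shape)  # torch.Size([8, 64])
```

Because each token is processed only by its selected expert(s), per-token compute and memory traffic scale with the 17B active parameters rather than the 109B or 400B totals.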
- Llama 4 Scout
  - 17B active parameters (109B total)
  - 16-expert architecture
  - 10M context window
  - Fits on a single H100 GPU (a hedged loading example follows this list)
  - Beats Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1
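A hedged sketch of running Scout through Hugging Face transformers. It assumes a recent transformers release with Llama 4 support, accepted license terms on the Hub, and enough GPU memory; the repo id follows the public naming pattern and should be checked against the actual model card.

```python
# Illustrative only: load Llama 4 Scout via transformers and run one chat turn.
# The repo id below is an assumption based on the public naming pattern;
# verify it on the Hugging Face model card before use.
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed repo id

chat = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto",           # shard/offload across available devices
    torch_dtype=torch.bfloat16,  # bf16 weights; quantize further if memory-bound
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
print(chat(messages, max_new_tokens=64)[0]["generated_text"])
```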
- Llama 4 Maverick
  - 17B active parameters (400B total)
  - 128-expert architecture
  - Fits on a single DGX H100 node (8x H100); see the back-of-the-envelope check below
  - 1M context window
  - Outperforms GPT-4o and Gemini 2.0 Flash
  - ELO score of 1417 on LMArena, currently the second-best model on the arena
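The single-node claim is easy to sanity-check with rough arithmetic. The sketch below assumes FP8 weights (1 byte per parameter) and ignores KV cache, activations, and framework overhead; both simplifications are my assumptions, not a statement of how Meta serves the model.

```python
# Back-of-the-envelope memory check for Maverick on one 8x H100 (80 GB) node.
# Assumes 1 byte/parameter (FP8 weights) and ignores KV cache and activations.
total_params = 400e9      # Maverick's total parameter count
bytes_per_param = 1       # FP8 quantization (assumption)
node_memory_gb = 8 * 80   # eight H100 GPUs with 80 GB each

weights_gb = total_params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB of weights vs {node_memory_gb} GB on the node")
# -> ~400 GB of weights vs 640 GB on the node
```

At bf16 (2 bytes per parameter) the weights alone would be roughly 800 GB, which is why lower-precision weights matter for single-node serving.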
- Llama 4 Behemoth (Coming Soon)
  - 288B active parameters (2T total)
  - 16-expert architecture
  - Teacher model for Scout and Maverick
  - Outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM benchmarks