Nathan Simons

JoeySalmons

AI & ML interests

I like AI

Recent Activity

upvoted a collection about 1 hour ago

liked a model about 5 hours ago

microsoft/Phi-4-multimodal-instruct

liked a model about 5 hours ago

microsoft/Phi-4-mini-instruct

Organizations

None yet

JoeySalmons's activity

upvoted a collection about 1 hour ago

Phi-4

Phi-4 small language model. • 7 items • Updated about 4 hours ago • 58

upvoted 2 collections about 10 hours ago

Granite Vision Models

2 items • Updated 2 days ago • 3

Granite 3.2 Language Models

3 items • Updated about 16 hours ago • 5

upvoted a collection 3 days ago

Foundation Text-Generation Models Below 360M Parameters

Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters. • 31 items • Updated 2 days ago • 24

upvoted a collection 5 days ago

Ovis2

Our latest advancement in multi-modal large language models (MLLMs) • 8 items • Updated 10 days ago • 51

upvoted a paper 6 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 7 days ago • 91

upvoted a collection 7 days ago

PaliGemma 2 Mix

13 items • Updated 8 days ago • 59

upvoted a collection 10 days ago

Step-Audio

Step-Audio model family, including Audio-Tokenizer, Audio-Chat and TTS • 3 items • Updated 10 days ago • 28

upvoted a collection 12 days ago

Hamanasu

A brand new series of Models from yours truly, Designed for Intelligence, Creativity and Roleplay. • 9 items • Updated 12 days ago • 4

upvoted a collection 15 days ago

OLMoE (January 2025)

Improved OLMoE for iOS app. Read more: https://allenai.org/blog/olmoe-app • 10 items • Updated 16 days ago • 9

upvoted an article 16 days ago

Article

Open R1: Update #2

7 authors • 17 days ago • 190

upvoted an article 22 days ago

Article

Open-source DeepResearch – Freeing our search agents

23 days ago • 1.1k

upvoted a collection 22 days ago

SFTvsRL Models & Data

This collection contains 4 initial checkpoints for https://github.com/LeslieTrue/SFTvsRL and necessary data for V-IRL training. • 5 items • Updated 22 days ago • 8

upvoted a paper 22 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 29 days ago • 108

upvoted an article 28 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

about 1 month ago • 776

upvoted a collection 29 days ago

DeepSeek R1 (All Versions)

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated about 6 hours ago • 201

upvoted 3 collections about 1 month ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 8 items • Updated 3 days ago • 370

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated about 22 hours ago • 102

InternLM3

6 items • Updated 16 days ago • 23

upvoted a collection about 2 months ago

Dolphin 3.0

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated 20 days ago • 95