Kaan Bera Güner

tomar753

AI & ML interests

Large Language Models, Small Language Models, Mid-sized Language Models, Tiny Language Models, Ginormous Language Models

Recent Activity

liked a model 3 days ago

SicariusSicariiStuff/Oni_Mitsubishi_12B

liked a model 3 days ago

sesame/csm-1b

liked a model 3 days ago

CohereForAI/c4ai-command-a-03-2025

tomar753's activity

upvoted a collection 4 days ago

Gemma 3 Release

Collection

9 items • Updated 3 days ago • 252

upvoted a collection 12 days ago

C4AI Aya Vision

Collection

Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 12 days ago • 64

upvoted 2 collections about 1 month ago

Tulu 3 Models

Collection

All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 3 days ago • 94

R1 Multilingual

Collection

5 items • Updated Jan 31 • 10

upvoted 3 collections about 2 months ago

upvoted a collection 2 months ago

Dolphin 3.0

Collection

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated Feb 7 • 108

upvoted a paper 3 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 140

upvoted a collection 5 months ago

SmolLM2

Collection

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 24 days ago • 246

upvoted an article 5 months ago

Article

Fixing Gradient Accumulation

Oct 16, 2024 • 51

upvoted 3 collections 6 months ago

Emu3

Collection

Emu3: Next-Token Prediction is All You Need • 7 items • Updated Feb 13 • 69

Llama 3.2

Collection

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 576

Molmo

Collection

Artifacts for open multimodal language models. • 5 items • Updated 3 days ago • 299

upvoted an article 6 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024 • 225

upvoted a paper 7 months ago

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 59

upvoted 2 collections 7 months ago

Hermes 3

Collection

The Hermes 3 Series of Models • 12 items • Updated about 1 month ago • 112

Minitron

Collection

A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated Jan 17 • 60

upvoted a collection 8 months ago

Lumimaid 0.2

Collection

4 items • Updated Jul 26, 2024 • 18

upvoted a paper 8 months ago

Adam-mini: Use Fewer Learning Rates To Gain More

Paper • 2406.16793 • Published Jun 24, 2024 • 68