merve's Collections
Jan 24 Releases
Updated 2 days ago
ostris/Flex.1-alpha • Text-to-Image • Updated 8 days ago • 10.9k • 291
Qwen/Qwen2.5-Math-PRM-72B • Text Classification • Updated 10 days ago • 920 • 65
HuggingFaceTB/SmolVLM-500M-Instruct • Image-Text-to-Text • Updated 4 days ago • 2.97k • 65
deepseek-ai/DeepSeek-R1 • Text Generation • Updated about 20 hours ago • 131k • 2.96k
yale-nlp/MMVU • Viewer • Updated about 9 hours ago • 1k • 2.3k • 51
cais/hle • Viewer • Updated 4 days ago • 3k • 810 • 98
nvidia/AceMath-7B-Instruct • Text Generation • Updated 10 days ago • 443 • 9
tencent/Hunyuan3D-2 • Image-to-3D • Updated 3 days ago • 7.62k • 391
nvidia/AceMath-Instruct-Training-Data • Viewer • Updated 10 days ago • 5.56M • 1.58k • 26
bytedance-research/UI-TARS-72B-DPO • Image-Text-to-Text • Updated 1 day ago • 4.1k • 55
declare-lab/TangoFlux • Text-to-Audio • Updated 5 days ago • 2.89k • 72
Hunyuan3D-2.0 • Text-to-3D and Image-to-3D Generation • Running on Zero • 534
nvidia/AceMath-72B-Instruct • Text Generation • Updated 10 days ago • 70 • 6
vidore/colSmol-256M • Updated 4 days ago • 263 • 5
nvidia/AceMath-72B-RM • Text Generation • Updated 10 days ago • 19 • 6
MiniMaxAI/MiniMax-VL-01 • Image-Text-to-Text • Updated 1 day ago • 1.85k • 222
DAMO-NLP-SG/VideoLLaMA3-2B-Image • Visual Question Answering • Updated 2 days ago • 102 • 6
DAMO-NLP-SG/VideoLLaMA3-2B • Visual Question Answering • Updated about 20 hours ago • 287 • 4
DAMO-NLP-SG/VideoLLaMA3-7B-Image • Visual Question Answering • Updated 2 days ago • 231 • 8
DAMO-NLP-SG/VideoLLaMA3-7B • Visual Question Answering • Updated about 20 hours ago • 877 • 18
bytedance-research/UI-TARS-72B-SFT • Image-Text-to-Text • Updated 1 day ago • 143 • 8
bytedance-research/UI-TARS-7B-SFT • Image-Text-to-Text • Updated 1 day ago • 1.44k • 112
bytedance-research/UI-TARS-7B-DPO • Image-Text-to-Text • Updated 1 day ago • 6.11k • 75
HuggingFaceTB/SmolVLM-256M-Instruct • Image-Text-to-Text • Updated 4 days ago • 3.86k • 75
HuggingFaceTB/SmolVLM-256M-Base • Image-Text-to-Text • Updated 7 days ago • 515 • 6
HuggingFaceTB/SmolVLM-500M-Base • Image-Text-to-Text • Updated 7 days ago • 175 • 4
SmolVLM • Running on Zero • 34
MiniMaxVL01 • Running • 45
vidore/colSmol-500M • Updated 4 days ago • 197 • 8
SmolVLM 256M Instruct WebGPU • Running • 29
SmolVLM 500M Instruct WebGPU • Running • 21
HKUSTAudio/Llasa-3B • Text-to-Speech • Updated about 12 hours ago • 1.34k • 312
HKUSTAudio/xcodec2 • Audio-to-Audio • Updated 3 days ago • 3.21k • 20
Qwen/Qwen2.5-Math-PRM-7B • Text Classification • Updated 10 days ago • 6.58k • 47
nvidia/AceMath-7B-RM • Text Generation • Updated 10 days ago • 27 • 5
vidore/colSmol-500M-base • Updated 4 days ago • 2 • 1
vidore/colSmol-256M-base • Updated 4 days ago • 1
TangoFlux • Text to Audio (Sound SFX) Generator • Running on Zero • 273
Flex.1-alpha • Running on Zero • 47