s0225

community

AI & ML interests

None defined yet.

Recent Activity

s0225's activity

merve
posted an update 2 days ago
sooo many open AI releases past week, let's summarize! 🤗
merve/april-11-releases-67fcd78be33d241c0977b9d2

multimodal
> Moonshot AI released Kimi VL Thinking, first working open-source multimodal reasoning model and Kimi VL Instruct, both 16B MoEs with 3B active params (OS)
> InternVL3 released based on Qwen2.5VL, 7 ckpts with various sizes (1B to 78B)

LLMs
> NVIDIA released Llama-3_1-Nemotron-Ultra-253B-v1 an LLM built on Llama 405B for reasoning, chat and tool use
> Agentica released DeepCoder-14B-Preview, fine-tuned version of DeepSeek-R1-Distill-Qwen-14B on problem-test pairs, along with the compiled dataset
> Zyphra/ZR1-1.5B is a new small reasoning LLM built on R1-Distill-1.5B (OS)
> Skywork-OR1-32B-Preview is a new reasoning model by Skywork

Image Generation
> HiDream released three new models, HiDream I1 Dev, I1 Full, and I1 Fast, for image generation (OS)

*OS ones have Apache 2.0 or MIT licenses
merve
posted an update 25 days ago
So many open releases at Hugging Face past week 🤯 recapping all here ⤵️ merve/march-21-releases-67dbe10e185f199e656140ae

👀 Multimodal
> Mistral AI released a 24B vision LM, both base and instruction FT versions, sota 🔥 (OS)
> with IBM we released SmolDocling, a sota 256M document parser with Apache 2.0 license (OS)
> SpatialLM is a new vision LM that outputs 3D bounding boxes, comes with 0.5B (QwenVL based) and 1B (Llama based) variants
> SkyWork released SkyWork-R1V-38B, new vision reasoning model (OS)

💬 LLMs
> NVIDIA released new Nemotron models in 49B and 8B with their post-training dataset
> LG released EXAONE, new reasoning models in 2.4B, 7.8B and 32B
> Dataset: Glaive AI released a new reasoning dataset of 22M+ examples
> Dataset: NVIDIA released new helpfulness dataset HelpSteer3
> Dataset: OpenManusRL is a new agent dataset based on ReAct framework (OS)
> Open-R1 team released OlympicCoder, new competitive coder model in 7B and 32B
> Dataset: GeneralThought-430K is a new reasoning dataset (OS)

🖼️ Image Generation/Computer Vision
> Roboflow released RF-DETR, new real-time sota object detector (OS) 🔥
> YOLOE is a new real-time zero-shot object detector with text and visual prompts 🥹
> Stability AI released Stable Virtual Camera, a new novel view synthesis model
> Tencent released Hunyuan3D-2mini, new small and fast 3D asset generation model
> ByteDance released InfiniteYou, new realistic photo generation model
> StarVector is a new 8B model that generates svg from images
> FlexWorld is a new model that expands 3D views (OS)

🎤 Audio
> Sesame released CSM-1B new speech generation model (OS)

🤖 Robotics
> NVIDIA released GR00T, new robotics model for generalized reasoning and skills, along with the dataset

*OS ones have Apache 2.0 or MIT license
ariG23498
updated 17 models about 2 months ago