merve
posted an update 2 days ago
sooo many open AI releases this past week, let's summarize! 🤗
merve/april-11-releases-67fcd78be33d241c0977b9d2

multimodal
> Moonshot AI released Kimi VL Thinking, the first working open-source multimodal reasoning model, and Kimi VL Instruct, both 16B MoEs with 3B active params (OS)
> InternVL3 released, built on Qwen2.5 LLMs, 7 ckpts with various sizes (1B to 78B)

LLMs
> NVIDIA released Llama-3_1-Nemotron-Ultra-253B-v1, an LLM built on Llama 3.1 405B for reasoning, chat and tool use
> Agentica released DeepCoder-14B-Preview, a fine-tuned version of DeepSeek-R1-Distill-Qwen-14B trained on problem-test pairs, along with the compiled dataset
> Zyphra/ZR1-1.5B is a new small reasoning LLM built on R1-Distill-1.5B (OS)
> Skywork-OR1-32B-Preview is a new reasoning model by Skywork

Image Generation
> HiDream released three new models for image generation: HiDream I1 Dev, I1 Full, and I1 Fast (OS)

*OS ones have Apache 2.0 or MIT licenses

Multimodal reasoning is the way to go, especially with open source, congrats to the Moonshot team

Nifty50GPT missing from the list! 100 percent open source for Indian capital markets