liudekai

ShadowWolf1999

AI & ML interests

None yet

Recent Activity

Organizations

None yet

ShadowWolf1999's activity

reacted to AdinaY's post with 🔥🤗 10 days ago
Post
1623
A new OPEN Omni model just dropped by @Alibaba_Qwen on the hub🔥🤯

Qwen2.5-Omni: a 7B end-to-end multimodal model
Qwen/Qwen2.5-Omni-7B

✨ Thinker-Talker architecture
✨ Real-time voice & video chat
✨ Natural speech generation
✨ Handles text, image, audio & video
  • 1 reply
New activity in deepseek-ai/DeepSeek-V3-0324 13 days ago

Is V3 available on the official website yet?

12
#2 opened 14 days ago by
leo009
reacted to onekq's post with 🤗 about 1 month ago
Post
2768
Necessity is the mother of invention. To understand ⚡FlashMLA⚡ by 🐋DeepSeek🐋, the first question to ask is why.

The keyword here is H800, a lower-end product tailored to comply with export controls. The goal is to squeeze as much performance out of it as possible.

But here is the most important takeaway: this invention benefits EVERYONE.
  • 2 replies
New activity in deepseek-ai/DeepSeek-R1 2 months ago

this is the killer

5
#1 opened 3 months ago by
blackcat1402
reacted to mitkox's post with 🔥 3 months ago
Post
2500
'Can it run DeepSeek V3 671B' is the new 'can it run Doom'.

How minimalistic can I go with on-device AI and behemoth models? Here I'm running DeepSeek V3 MoE on a single A6000 GPU.

Not great, not terrible, for this minimalistic setup. I love the Mixture of Experts architectures. Typically I'm running my core LLM distributed over the 4 GPUs.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.