
liudekai

ShadowWolf1999

letmego2022

AI & ML interests

None yet

Recent Activity

reacted to AdinaY's post with 🔥 10 days ago

A new OPEN Omni model just dropped by @Alibaba_Qwen on the hub🔥🤯 Qwen2.5-Omni: a 7B end-to-end multimodal model https://huggingface.co/Qwen/Qwen2.5-Omni-7B ✨ Thinker-Talker architecture ✨ Real-time voice & video chat ✨ Natural speech generation ✨ Handles text, image, audio & video

reacted to AdinaY's post with 🤗 10 days ago

new activity 13 days ago

deepseek-ai/DeepSeek-V3-0324: Can V3 be used on the official website yet?


Organizations

None yet

ShadowWolf1999's activity

reacted to AdinaY's post with 🔥🤗 10 days ago

Post

1623

A new OPEN Omni model just dropped by @Alibaba_Qwen on the hub🔥🤯

Qwen2.5-Omni: a 7B end-to-end multimodal model
Qwen/Qwen2.5-Omni-7B

✨ Thinker-Talker architecture
✨ Real-time voice & video chat
✨ Natural speech generation
✨ Handles text, image, audio & video

1 reply

New activity in deepseek-ai/DeepSeek-V3-0324 13 days ago

Can V3 be used on the official website yet?

#2 opened 14 days ago by

leo009

reacted to onekq's post with 🤗 about 1 month ago

Post

2768

Necessity is the mother of invention. To understand ⚡FlashMLA⚡ by 🐋DeepSeek🐋, the first question to ask is why.

The keyword here is H800, a lower-end product tailored for export control. The purpose here is to squeeze out as much performance as possible.

But here is the most important takeaway: this invention benefits EVERYONE.

2 replies

liked a Space about 1 month ago

192

The Ultimate Guide to LLM Training | The Ultra-Scale Playbook

🔥

Learn about every aspect of LLM training

liked 2 models about 1 month ago

moonshotai/Moonlight-16B-A3B-Instruct

Text Generation • Updated Mar 3 • 3.49k • 138

Wan-AI/Wan2.1-T2V-1.3B

Text-to-Video • Updated Mar 1 • 27.7k • 300

New activity in deepseek-ai/DeepSeek-R1 2 months ago

this is the killer

#1 opened 3 months ago by

blackcat1402

liked 2 models 3 months ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 11 days ago • 1.42M • 11.8k

openbmb/MiniCPM-o-2_6

Any-to-Any • Updated 10 days ago • 845k • 1.08k

reacted to mitkox's post with 🔥 3 months ago

Post

2500

'Can it run DeepSeek V3 671B' is the new 'can it run Doom'.

How minimalistic can I go with on-device AI with behemoth models? Here I'm running DeepSeek V3 MoE on a single A6000 GPU.

Not great, not terrible, for this minimalistic setup. I love the Mixture of Experts architectures. Typically I'm running my core LLM distributed over the 4 GPUs.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.

5 replies

liked 2 models 3 months ago

xey/sldr_flux_nsfw_v2-studio

Text-to-Image • Updated Jan 14 • 281k • 245

cognitivecomputations/Dolphin3.0-Llama3.1-8B

Updated Jan 5 • 2.77k • 157

liked a model 4 months ago

Datou1111/shou_xin

Text-to-Image • Updated 22 days ago • 391 • 870