Sam Flin PRO

sflindrs

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

liked a Space 4 days ago

InstantX/InstantCharacter

upvoted a paper 4 days ago

InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework

Organizations

None yet

sflindrs's activity

liked a Space 4 days ago

172

InstantCharacter

🐢

Customize characters with prompts and styles

liked 2 Spaces 10 days ago

Chat with Kimi-VL (Image, Agent, Video, PDF)

🚀

Chat with Kimi-VL-A3B-Instruct using text, images, and videos

Chat with Kimi-VL-A3B-Thinking

🤔

Chat with Kimi-VL-A3B-Thinking using text and images

liked a model 11 days ago

U4R/OmniCaptioner

Updated 12 days ago • 134 • 13

liked 2 models 13 days ago

Skywork/Skywork-R1V-38B-AWQ

Image-Text-to-Text • Updated 12 days ago • 254 • 5

Skywork/Skywork-R1V-38B

Image-Text-to-Text • Updated 12 days ago • 9.54k • 121

liked a model 14 days ago

wanglab/MedSAM2

Image Segmentation • Updated 12 days ago • 18

liked a Space 14 days ago

SmolVLM2 XSPFGenerator (VLC prototype)

🎞

Generate video highlights and playlist

liked a Space 15 days ago

116

PSHuman

🏃

PHOTOREALISTIC HUMAN RECONSTRUCTION w/ CROSS-SCALE DIFF

liked 3 models 16 days ago

liked 2 Spaces 19 days ago

Gemma-3-R1984-27B

🔥

Reasoning + Multimodal + VLM + Deep Research + Agent

AvatarArtist

🎨

Open-Domain 4D Avatarization

liked a Space 26 days ago

103

Oryx

💬

Generate detailed descriptions from images and videos

liked 2 models 26 days ago

THUdyh/Oryx-ViT

Image Feature Extraction • Updated Mar 1 • 7

THUdyh/Ola-Image

Image-Text-to-Text • Updated Feb 25 • 65 • 3

liked a Space 26 days ago

Ola

📊

Generate text and audio responses from images and videos

liked a model 26 days ago

Qwen/Qwen2.5-Omni-7B

Any-to-Any • Updated 7 days ago • 171k • 1.45k

liked a Space 26 days ago

259

Qwen2.5 Omni 7B Demo

🏆

Generate text and speech responses from text, images, or audio input