In a Training Loop 🔄

Reza Sayar PRO

Reza2kn

AI & ML interests

None yet

Recent Activity

updated a model about 2 hours ago

Reza2kn/visualears-xlsr-300m-v2-cont-step1568-fp16

published a model about 2 hours ago

Reza2kn/visualears-xlsr-300m-v2-cont-step1568-fp16

updated a dataset about 3 hours ago

Reza2kn/visualears-full-pipeline-state

upvoted 2 papers about 23 hours ago

Audio Interaction Model

Paper • 2606.05121 • Published 6 days ago • 105

UniSHARP: Universal Sharp Monocular View Synthesis

Paper • 2606.07514 • Published 4 days ago • 13

upvoted a collection 1 day ago

Sapiens2

28 items • Updated 24 days ago • 40

upvoted a paper 1 day ago

Sapiens2

Paper • 2604.21681 • Published Apr 23 • 21

upvoted 2 collections 5 days ago

AudioMosaic

ICML2026 AudioMosaic: Contrastive Masked Audio Representation Learning • 15 items • Updated 29 days ago • 3

MOSS-Audio

An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated May 2 • 66

upvoted a collection 7 days ago

Cosmos3

Omnimodal World Models for Physical AI • 15 items • Updated 1 day ago • 101

upvoted a collection 8 days ago

gliner2 family

GLiNER2 extends the original GLiNER architecture to support multi-task information extraction with a schema-driven interface. • 7 items • Updated 24 days ago • 49

upvoted 6 papers 11 days ago

CubePart: An Open-Vocabulary Part-Controllable 3D Generator

Paper • 2605.28763 • Published 13 days ago • 14

GEM: Generative Supervision Helps Embodied Intelligence

Paper • 2605.28548 • Published 13 days ago • 41

InstructSAM: Segment Any Instance with Any Instructions

Paper • 2605.26102 • Published 15 days ago • 17

ControlLight: Towards Controllable, Consistent, and Generalizable Low-Light Enhancement

Paper • 2605.25569 • Published 15 days ago • 21

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

Paper • 2605.21573 • Published 20 days ago • 109

TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction

Paper • 2605.26115 • Published 15 days ago • 51

upvoted a collection 13 days ago

Bonsai Image

6 items • Updated 4 days ago • 84

upvoted 5 papers 14 days ago

MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

Paper • 2604.05091 • Published Apr 6 • 47

RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments

Paper • 2604.26067 • Published Apr 28 • 74

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published Apr 15 • 124

HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents

Paper • 2604.07430 • Published Apr 8 • 189

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 247