
jingyun

hjy

huajingyun

AI & ML interests

NLP

Recent Activity

upvoted a paper 18 days ago

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

liked a model about 2 months ago

google/gemma-4-31B-it

upvoted a paper 4 months ago

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions


Organizations

None yet

upvoted a paper 18 days ago

Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization

Paper • 2605.10780 • Published 20 days ago • 33

liked a model about 2 months ago

google/gemma-4-31B-it

Image-Text-to-Text • 33B • Updated 4 days ago • 11.3M • 2.83k

upvoted 3 papers 4 months ago

TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

Paper • 2602.08711 • Published Feb 9 • 29

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Paper • 2602.04804 • Published Feb 4 • 50

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

Paper • 2602.03510 • Published Feb 3 • 27

liked 2 datasets 4 months ago

JoeLeelyf/ViF-CoT-4K

Updated Dec 18, 2025 • 141 • 4

facebook/action100m-preview

Viewer • Updated Jan 29 • 120k • 2.83k • 145

upvoted a paper 5 months ago

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models

Paper • 2512.15560 • Published Dec 17, 2025 • 25

authored a paper 5 months ago

KlingAvatar 2.0 Technical Report

Paper • 2512.13313 • Published Dec 15, 2025 • 44

liked a model 5 months ago

facebook/mms-1b-all

Automatic Speech Recognition • 1.0B • Updated Jun 15, 2023 • 194k • 198

upvoted a paper 5 months ago

Kling-Omni Technical Report

Paper • 2512.16776 • Published Dec 18, 2025 • 174

upvoted 2 papers 7 months ago

AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration

Paper • 2510.10395 • Published Oct 12, 2025 • 32

When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs

Paper • 2511.02243 • Published Nov 4, 2025 • 25

authored a paper 8 months ago

AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration

Paper • 2510.10395 • Published Oct 12, 2025 • 32

authored a paper 9 months ago

Kwai Keye-VL 1.5 Technical Report

Paper • 2509.01563 • Published Sep 1, 2025 • 40

liked a Space 9 months ago

FineVision: Open Data is All You Need

📝

224

A new open-source dataset for training VLMs

liked 2 models 9 months ago

Kwai-Keye/Keye-VL-1_5-8B

Video-Text-to-Text • 9B • Updated Feb 5 • 189k • 68

deepseek-ai/DeepSeek-V3.1

Text Generation • 685B • Updated Sep 5, 2025 • 206k • 823

liked a model 10 months ago

facebook/dinov3-vit7b16-pretrain-lvd1689m

Image Feature Extraction • 7B • Updated Aug 19, 2025 • 10.2k • 230

upvoted a paper 11 months ago

Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese

Paper • 2110.06696 • Published Oct 13, 2021 • 2
