3433

Nguyen Van Thanh

NguyenVanThanhHust

AI & ML interests

None yet

Organizations

None yet

NguyenVanThanhHust's activity

upvoted 20 papers 6 days ago

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Paper • 2403.06764 • Published Mar 11, 2024 • 29

VideoMamba: State Space Model for Efficient Video Understanding

Paper • 2403.06977 • Published Mar 11, 2024 • 31

V3D: Video Diffusion Models are Effective 3D Generators

Paper • 2403.06738 • Published Mar 11, 2024 • 31

FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation

Paper • 2403.06775 • Published Mar 11, 2024 • 5

Pix2Gif: Motion-Guided Diffusion for GIF Generation

Paper • 2403.04634 • Published Mar 7, 2024 • 18

StableDrag: Stable Dragging for Point-based Image Editing

Paper • 2403.04437 • Published Mar 7, 2024 • 30

How Far Are We from Intelligent Visual Deductive Reasoning?

Paper • 2403.04732 • Published Mar 7, 2024 • 24

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

Paper • 2403.04116 • Published Mar 7, 2024 • 7

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7, 2024 • 42

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14, 2025 • 112

Multistep Consistency Models

Paper • 2403.06807 • Published Mar 11, 2024 • 16

CameraCtrl: Enabling Camera Control for Text-to-Video Generation

Paper • 2404.02101 • Published Apr 2, 2024 • 25

LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model

Paper • 2404.01331 • Published Mar 29, 2024 • 28

Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer

Paper • 2405.17405 • Published May 27, 2024 • 17

Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control

Paper • 2405.17414 • Published May 27, 2024 • 12

Matryoshka Multimodal Models

Paper • 2405.17430 • Published May 27, 2024 • 34

PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models

Paper • 2412.18608 • Published Dec 24, 2024 • 18

MMFactory: A Universal Solution Search Engine for Vision-Language Tasks

Paper • 2412.18072 • Published Dec 24, 2024 • 19

WavePulse: Real-time Content Analytics of Radio Livestreams

Paper • 2412.17998 • Published Dec 23, 2024 • 11

PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion

Paper • 2412.17780 • Published Dec 23, 2024 • 5