SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory • Paper • 2411.11922 • Published Nov 18 • 18
Video Understanding with Large Language Models: A Survey • Paper • 2312.17432 • Published Dec 29, 2023 • 2
DNAGPT: A Generalized Pretrained Tool for Multiple DNA Sequence Analysis Tasks • Paper • 2307.05628 • Published Jul 11, 2023 • 9
Cross Contrasting Feature Perturbation for Domain Generalization • Paper • 2307.12502 • Published Jul 24, 2023
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark • Paper • 2410.03051 • Published Oct 4 • 4
Chasing Consistency in Text-to-3D Generation from a Single Image • Paper • 2309.03599 • Published Sep 7, 2023 • 1
RT-Pose: A 4D Radar Tensor-based 3D Human Pose Estimation and Localization Benchmark • Paper • 2407.13930 • Published Jul 18
Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering • Paper • 2402.00827 • Published Feb 1 • 2
StableVideo: Text-driven Consistency-aware Diffusion Video Editing • Paper • 2308.09592 • Published Aug 18, 2023 • 2
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding • Paper • 2307.16449 • Published Jul 31, 2023 • 15