Submitted by akhaliq 27 In-context Autoencoder for Context Compression in a Large Language Model · 5 authors
Submitted by akhaliq 26 Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution · 15 authors
Submitted by akhaliq 22 InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation · 13 authors
Submitted by akhaliq 22 Stack More Layers Differently: High-Rank Training Through Low-Rank Updates · 4 authors
Submitted by akhaliq 13 SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Task Planning · 6 authors
Submitted by akhaliq 9 Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events · 11 authors
Submitted by akhaliq 9 DNAGPT: A Generalized Pretrained Tool for Multiple DNA Sequence Analysis Tasks · 6 authors
Submitted by akhaliq 9 Instruction Mining: High-Quality Instruction Data Selection for Large Language Models · 3 authors
Submitted by akhaliq 7 Generating Benchmarks for Factuality Evaluation of Language Models · 10 authors
Submitted by akhaliq 6 T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation · 5 authors
Submitted by akhaliq 3 VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models · 6 authors
Submitted by akhaliq 2 Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations · 3 authors