The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 194
SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds Paper • 2312.09246 • Published Dec 14, 2023 • 9
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation Paper • 2312.09251 • Published Dec 14, 2023 • 10
Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking Paper • 2312.09244 • Published Dec 14, 2023 • 11
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation Paper • 2312.08754 • Published Dec 14, 2023 • 11
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks Paper • 2312.08583 • Published Dec 14, 2023 • 12
General Object Foundation Model for Images and Videos at Scale Paper • 2312.09158 • Published Dec 14, 2023 • 12
LIME: Localized Image Editing via Attention Regularization in Diffusion Models Paper • 2312.09256 • Published Dec 14, 2023 • 12
FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection Paper • 2312.09252 • Published Dec 14, 2023 • 13
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention Paper • 2312.08618 • Published Dec 14, 2023 • 15
SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance Paper • 2312.08889 • Published Dec 13, 2023 • 15
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions Paper • 2312.08578 • Published Dec 14, 2023 • 20
ProNeRF: Learning Efficient Projection-Aware Ray Sampling for Fine-Grained Implicit Neural Radiance Fields Paper • 2312.08136 • Published Dec 13, 2023 • 7
Clockwork Diffusion: Efficient Generation With Model-Step Distillation Paper • 2312.08128 • Published Dec 13, 2023 • 15
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects Paper • 2312.08344 • Published Dec 13, 2023 • 13