Memory
Prompts are text-based memory. System II prompting updates that memory. Parametric memory is long-term, while prompt-based memory is short-term.
- Paper • 2409.08775 • Published
OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering
Paper • 2409.08250 • Published • 1
Synthetic continued pretraining
Paper • 2409.07431 • Published • 2
WonderWorld: Interactive 3D Scene Generation from a Single Image
Paper • 2406.09394 • Published • 3
xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token
Paper • 2405.13792 • Published • 1
Note xRAG projects the document chunk embedding vector directly into the model's hidden space. Explicit memory is expensive and dumb for RAG; mid-term memory relies on a 'projector'; long-term memory would update the language decoding part of the model. I guess that could be the next step here.
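To make the 'projector' idea in the note concrete, here is a minimal sketch in plain Python. It assumes a single linear layer and toy dimensions; the function names (`project`, `prepend_memory_token`), the weights, and the sizes are all hypothetical and are not taken from the xRAG paper, whose actual projector may differ.

```python
from typing import List

Vector = List[float]


def project(doc_embedding: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Linear map from the retriever space (d_r) to the LM hidden space (d_h)."""
    return [
        sum(w * x for w, x in zip(row, doc_embedding)) + b
        for row, b in zip(weights, bias)
    ]


def prepend_memory_token(memory: Vector, token_embeddings: List[Vector]) -> List[Vector]:
    """Put the projected document vector in front of the prompt as one extra 'token'."""
    return [memory] + token_embeddings


# Toy sizes (assumptions): retriever dimension 3, LM hidden dimension 2.
doc_embedding = [1.0, 2.0, 3.0]
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # hypothetical projector weights
bias = [0.0, 0.5]
prompt_embeddings = [[0.1, 0.2], [0.3, 0.4]]  # stand-ins for real token embeddings

memory_token = project(doc_embedding, weights, bias)
inputs = prepend_memory_token(memory_token, prompt_embeddings)
```

The whole document costs one position in `inputs` instead of one position per document token, which is the compression the title refers to. Only the projector would need training in this setup; the retriever and the language model stay as they are.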
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
Paper • 2401.02051 • Published • 1
Yo'LLaVA: Your Personalized Language and Vision Assistant
Paper • 2406.09400 • Published • 1
Note Adds a new concept to a VLM via soft-prompt tuning. An extra id token in the vocabulary plus k visual feature embeddings enables customizing the VLM towards personalized knowledge.
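A rough sketch of the mechanism described in the note, in plain Python: the embedding table gains one identifier token plus k extra rows, and only those new rows receive updates. The token naming scheme, the initialization, and the `sgd_step` helper are assumptions made for illustration, not the paper's implementation.

```python
import random
from typing import Dict, List

Vector = List[float]


def add_concept(embeddings: Dict[str, Vector], name: str, k: int, dim: int,
                seed: int = 0) -> List[str]:
    """Extend the table with one id token and k soft tokens; return the new token names."""
    rng = random.Random(seed)
    new_tokens = [f"<{name}>"] + [f"<{name}_{i}>" for i in range(k)]
    for tok in new_tokens:
        # Small random init (an assumption; other initializations are possible).
        embeddings[tok] = [rng.gauss(0.0, 0.02) for _ in range(dim)]
    return new_tokens


def sgd_step(embeddings: Dict[str, Vector], grads: Dict[str, Vector],
             trainable: List[str], lr: float) -> None:
    """Apply gradients only to trainable rows; every other row stays frozen."""
    for tok, grad in grads.items():
        if tok in trainable:
            embeddings[tok] = [w - lr * g for w, g in zip(embeddings[tok], grad)]


# Frozen toy vocabulary (hypothetical values).
table = {"a": [0.1, 0.2], "photo": [0.3, 0.4]}
concept_tokens = add_concept(table, "sks", k=2, dim=2)

photo_before = list(table["photo"])
sks_before = list(table["<sks>"])

# Pretend gradients arrived for both a frozen and a new token.
grads = {"photo": [1.0, 1.0], "<sks>": [1.0, 1.0]}
sgd_step(table, grads, trainable=concept_tokens, lr=0.1)
```

After the step, `table["photo"]` is unchanged while `table["<sks>"]` has moved, so the personalized knowledge lives entirely in the handful of new rows.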
LLMs + Persona-Plug = Personalized LLMs
Paper • 2409.11901 • Published • 31
Note Same thing: personalization with a soft-prompt user embedding. This one is text-only, which makes it less exciting than Yo'LLaVA in some sense.
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
Paper • 2409.12903 • Published • 21
A-VL: Adaptive Attention for Large Vision-Language Models
Paper • 2409.14846 • Published
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Paper • 2406.05981 • Published • 12
Human-like Episodic Memory for Infinite Context LLMs
Paper • 2407.09450 • Published • 59
Contextual Document Embeddings
Paper • 2410.02525 • Published • 18
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content
Paper • 2410.10783 • Published • 26
Agent-as-a-Judge: Evaluate Agents with Agents
Paper • 2410.10934 • Published • 18
Retrospective Learning from Interactions
Paper • 2410.13852 • Published • 8
SPIN: Self-Supervised Prompt INjection
Paper • 2410.13236 • Published • 1
DAG-aware Transformer for Causal Effect Estimation
Paper • 2410.10044 • Published • 1
SMART: Self-learning Meta-strategy Agent for Reasoning Tasks
Paper • 2410.16128 • Published • 1
SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 109
Note Interesting approach to the (even more horrible) context-length issue in video processing: an explicit memory embedding built from the "previous prediction", instead of re-processing previous frames or caching previous attention KV values. Makes immediate sense.
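The memory idea in the note can be sketched as a fixed-size queue of past predictions. This is a toy illustration only: frames and predictions are single floats, and simple averaging stands in for whatever the real model does to attend over its memory. `condition_on_memory` and `run_stream` are hypothetical names.

```python
from collections import deque
from typing import Deque, List, Tuple


def condition_on_memory(frame_feature: float, memory: Deque[float]) -> float:
    """Blend the current frame with the mean of remembered predictions (toy stand-in)."""
    if not memory:
        return frame_feature
    return 0.5 * frame_feature + 0.5 * (sum(memory) / len(memory))


def run_stream(frames: List[float], capacity: int) -> Tuple[List[float], List[float]]:
    """Process frames one at a time, storing predictions rather than raw frames."""
    memory: Deque[float] = deque(maxlen=capacity)  # oldest entries fall out automatically
    predictions = []
    for frame in frames:
        pred = condition_on_memory(frame, memory)
        predictions.append(pred)
        memory.append(pred)  # remember the prediction, not the frame
    return predictions, list(memory)


predictions, memory = run_stream([1.0, 3.0, 5.0, 7.0], capacity=2)
# predictions == [1.0, 2.0, 3.25, 4.8125]; memory holds only the last two
```

Per-frame cost depends on `capacity`, not on how many frames have been seen, which is why this sidesteps the context-length problem that re-processing frames or caching KV values runs into.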