Understanding LLM - a VoladorLuYu Collection

VoladorLuYu 's Collections

Research on LLM

Generative Multiple Modality

Super Alignment

Foundation Machine Learning

Graph Foundation Multimodal Models

Symbolic LLM Reasoning

Data-efficient LLMs

Understanding LLM

synthetic code generation

Diffusion Models

LLM+Architecture

LLM+Self-Play RL

Understanding LLM

updated Nov 1

A Language Model's Guide Through Latent Space

Paper • 2402.14433 • Published Feb 22 • 1
The Hidden Space of Transformer Language Adapters

Paper • 2402.13137 • Published Feb 20
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models

Paper • 2402.16438 • Published Feb 26
AtP*: An efficient and scalable method for localizing LLM behaviour to components

Paper • 2403.00745 • Published Mar 1 • 12
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral

Paper • 2403.01851 • Published Mar 4
The Hidden Attention of Mamba Models

Paper • 2403.01590 • Published Mar 3
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 61
In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 42
Function Vectors in Large Language Models

Paper • 2310.15213 • Published Oct 23, 2023 • 1
Localizing Paragraph Memorization in Language Models

Paper • 2403.19851 • Published Mar 28 • 13
ROME: Memorization Insights from Text, Probability and Hidden State in Large Language Models

Paper • 2403.00510 • Published Mar 1 • 1
Large Language Models Struggle to Learn Long-Tail Knowledge

Paper • 2211.08411 • Published Nov 15, 2022 • 3
ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4 • 91
How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Paper • 2406.11813 • Published Jun 17 • 30
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models

Paper • 2406.12649 • Published Jun 18 • 15
Can LLMs Learn by Teaching? A Preliminary Study

Paper • 2406.14629 • Published Jun 20 • 19
Why Does the Effective Context Length of LLMs Fall Short?

Paper • 2410.18745 • Published Oct 24 • 17
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse

Paper • 2410.21333 • Published Oct 27 • 10