To Softmax, or not to Softmax: that is the question when applying Active Learning for Transformer Models • Paper 2210.03005 • Published Oct 6, 2022
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics • Paper 2410.21272 • Published Oct 28, 2024
MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training • Paper 2502.20855 • Published Feb 28, 2025
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models • Paper 2501.06751 • Published Jan 12, 2025
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations • Paper 2410.02707 • Published Oct 3, 2024