Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings Paper • 2305.13571 • Published May 23, 2023
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations Paper • 2006.11477 • Published Jun 20, 2020