- Ultra-Sparse Memory Network
  Paper • 2411.12364 • Published • 22
- Hyper-Connections
  Paper • 2409.19606 • Published • 23
- Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
  Paper • 2411.03884 • Published • 26
- Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
  Paper • 2501.16975 • Published • 26
Open-Foundation-Models
non-profit
AI & ML interests
None defined yet.
Collections
1
models
None public yet
datasets
None public yet