RWKV-7 "Goose" with Expressive Dynamic State Evolution • Paper • 2503.14456 • Published 11 days ago • 131
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining • Paper • 2311.08849 • Published Nov 15, 2023 • 5
Datasets: NeurIPS LLM Challenge 2023 • Collection • Datasets that were considered for use in my submission to the 2023 NeurIPS Large Language Model Efficiency Challenge. • 31 items • Updated Apr 10, 2024 • 2