Retrofitting (Large) Language Models with Dynamic Tokenization Paper • 2411.18553 • Published Nov 27, 2024 • 2
Cross-Tokenizer Distillation via Approximate Likelihood Matching Paper • 2503.20083 • Published 20 days ago • 1
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation Paper • 2406.16678 • Published Jun 24, 2024 • 16