Literate Goggles

literate-goggles

AI & ML interests

None yet

Organizations

None yet

literate-goggles's activity

upvoted an article about 22 hours ago

SigLIP 2: A better multilingual vision language encoder

Article • 2 days ago • 67

upvoted a paper about 22 hours ago

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published 2 days ago • 140

upvoted a paper 1 day ago

Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound

Paper • 2502.05139 • Published 15 days ago • 1

upvoted a paper 2 days ago

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published 4 days ago • 34

upvoted a paper 3 days ago

Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published 4 days ago • 73

upvoted 2 papers 4 days ago

Learning Getting-Up Policies for Real-World Humanoid Robots

Paper • 2502.12152 • Published 5 days ago • 35

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 6 days ago • 130

upvoted a paper 5 days ago

Region-Adaptive Sampling for Diffusion Transformers

Paper • 2502.10389 • Published 8 days ago • 52

upvoted a paper 7 days ago

Language Models Use Trigonometry to Do Addition

Paper • 2502.00873 • Published 20 days ago • 1

upvoted 5 papers 8 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 10 days ago • 139

Logical Reasoning in Large Language Models: A Survey

Paper • 2502.09100 • Published 9 days ago • 21

XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model

Paper • 2406.04904 • Published Jun 7, 2024 • 8

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Paper • 2502.05512 • Published 14 days ago • 1

Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

Paper • 2502.04128 • Published 16 days ago • 23

upvoted 2 articles 9 days ago

Open-source DeepResearch – Freeing our search agents

Article • 19 days ago • 1.07k

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

Article • 11 days ago • 48

upvoted 4 papers 9 days ago

FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks

Paper • 2502.04465 • Published 16 days ago • 3

Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published 19 days ago • 62

Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning

Paper • 2502.06533 • Published 12 days ago • 18

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 12 days ago • 133