4M: Massively Multimodal Masked Modeling
Transcribe or translate audio quickly
Engage in multimedia chat with LLMs and other ML models