AI & ML interests
Audio and Multimodal Learning
Organization Card
ALM: Audio Language and Multimodal
ALM is a collaborative research group focused on deep learning for audio, language, and multimodal data.
About Us
- Alkis Koudounas - PhD Student at Politecnico di Torino (Profile | polito.it)
- Lorenzo Vaiani - PhD Student at Politecnico di Torino (Profile | polito.it)
- Moreno La Quatra - Research Fellow at Kore University of Enna (Profile | unikore.it)
Projects
- ARCH - Audio Representation Benchmark (Repo): A benchmark platform for evaluating audio representation models. Research Paper
- CALM - Contrastive Alignment of Language and Music: A project from the 1st Sound of AI Hackathon. CALM aligns songs with natural language descriptions, enabling music search via text, voice, or facial expressions (see the sketch after this list).
- PACE - Podcast AI for Chapters and Episodes: A semantic search engine for podcasts that lets users find specific parts of an episode using natural language queries. Built for the AssemblyAI 50K Hackathon (Winter 2022).
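
CALM itself is not released on this page, so purely as an illustration of the contrastive text-audio retrieval idea it describes, here is a minimal sketch using the public laion/clap-htsat-unfused CLAP checkpoint from transformers. The checkpoint, queries, and dummy audio are assumptions for the example, not CALM artifacts.

```python
import numpy as np
import torch
from transformers import ClapModel, ClapProcessor

# Illustration only: a public CLAP checkpoint stands in for CALM.
model = ClapModel.from_pretrained("laion/clap-htsat-unfused")
processor = ClapProcessor.from_pretrained("laion/clap-htsat-unfused")

# Stand-in for a real song: one second of silence at CLAP's 48 kHz rate.
audio = np.zeros(48_000, dtype=np.float32)
queries = ["a calm piano ballad", "fast, energetic techno"]

text_inputs = processor(text=queries, return_tensors="pt", padding=True)
audio_inputs = processor(audios=[audio], sampling_rate=48_000, return_tensors="pt")

with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs)
    audio_emb = model.get_audio_features(**audio_inputs)

# Cosine similarity between the song embedding and each query embedding:
# the score a text-to-music search would rank candidate songs by.
print(torch.nn.functional.cosine_similarity(audio_emb, text_emb))
```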
Collections: 2
Spaces: 2
Models: 14
- ALM/whisper-el-small-augmented • Automatic Speech Recognition
- ALM/whisper-cy-small-augmented • Automatic Speech Recognition
- ALM/whisper-it-small-augmented • Automatic Speech Recognition
- ALM/whisper-sk-small-augmented • Automatic Speech Recognition
- ALM/whisper-da-small-augmented • Automatic Speech Recognition
- ALM/whisper-it-medium-augmented • Automatic Speech Recognition
- ALM/wav2vec2-large-audioset • Audio Classification
- ALM/hubert-base-audioset • Audio Classification
- ALM/hubert-large-audioset • Audio Classification
- ALM/wav2vec2-base-audioset • Audio Classification
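
The checkpoints above are tagged with standard transformers tasks, so they should load with the stock transformers pipelines. A minimal sketch, assuming a local audio file (the file names below are placeholders):

```python
from transformers import pipeline

# Whisper checkpoint fine-tuned for Italian speech recognition.
asr = pipeline("automatic-speech-recognition", model="ALM/whisper-it-small-augmented")
print(asr("speech_it.wav")["text"])   # "speech_it.wav" is a placeholder path

# HuBERT checkpoint fine-tuned on AudioSet for general audio tagging.
tagger = pipeline("audio-classification", model="ALM/hubert-base-audioset")
print(tagger("clip.wav", top_k=5))    # top 5 AudioSet labels for the clip
```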
Datasets: None public yet