IPA CHILDES Models

Phoneme-based GPT-2 models trained on the 11 largest sections of the IPA-CHILDES dataset for our paper, "IPA-CHILDES & G2P+: Feature-Rich Resources for Cross-Lingual Phonology and Phonemic Language Modeling".

Each model has 5M non-embedding parameters and was trained on 1.8M tokens of its language. These models were then probed for phonetic features using the corresponding inventories in Phoible. Check out the paper for more details. Training and analysis scripts can be found here.

To load a model:

from transformers import AutoModel
dutch_model = AutoModel.from_pretrained('phonemetransformers/ipa-childes-models', subfolder='Dutch')
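
Once a model is loaded, the non-embedding parameter count quoted above can be checked roughly. The sketch below is not taken from the paper's scripts: the helper name `count_non_embedding_parameters` is hypothetical, and it assumes the GPT-2 naming used in transformers, where the token and position embeddings live in modules called `wte` and `wpe`.

```python
# Assumption: GPT-2 checkpoints in transformers store the token and position
# embeddings in modules named `wte` and `wpe`; everything else is counted.
EMBEDDING_MODULES = {"wte", "wpe"}


def count_non_embedding_parameters(named_parameters):
    """Sum parameter sizes, skipping any parameter inside an embedding module.

    `named_parameters` is any iterable of (name, parameter) pairs, such as
    the result of `model.named_parameters()` on a PyTorch module.
    """
    total = 0
    for name, param in named_parameters:
        # Names look like "wte.weight" or "h.0.attn.c_attn.weight".
        if EMBEDDING_MODULES.intersection(name.split(".")):
            continue
        total += param.numel()
    return total


# Usage with the model loaded above (hypothetical helper, real model):
# n = count_non_embedding_parameters(dutch_model.named_parameters())
# print(f"{n / 1e6:.2f}M non-embedding parameters")
```

Matching on dotted name components rather than substrings avoids skipping unrelated parameters whose names merely contain `wte` or `wpe`.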