Sylber
This is the official implementation of Sylber: Syllabic Embedding Representation of Speech from Raw Audio.
Sylber is the first model of its kind to yield extremely short token sequences from raw audio (4.27 tokens/sec on average) through dynamic tokenization at syllable granularity.
The model was developed and trained by the Berkeley Speech Group.
Installation
The model can be installed from PyPI for inference.
pip install sylber
Usage
from sylber import Segmenter
# Loading Sylber
segmenter = Segmenter(model_ckpt="sylber")
# Run Sylber
wav_file = "samples/sample.wav"
# Set in_second=False to get segment boundaries in frame numbers instead of seconds.
outputs = segmenter(wav_file, in_second=True)
# outputs = {"segments": numpy array of [start, end] for each segment,
#            "segment_features": numpy array of segment-averaged features,
#            "hidden_states": numpy array of raw features used for segmentation}
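As a rough illustration of working with the returned segments, the sketch below computes a token rate and mean segment duration from [start, end] pairs given in seconds. The helper `summarize_segments`, its `audio_duration` parameter, and the example boundaries are hypothetical and not part of the sylber package; only the `"segments"` key and its [start, end] layout come from the usage above.

```python
def summarize_segments(segments, audio_duration=None):
    """Summarize [start, end] segments expressed in seconds.

    `segments` can be a NumPy array of shape (N, 2) or any sequence of pairs.
    `audio_duration` is a hypothetical optional argument; when omitted, the
    end of the last segment is used as an approximation of the audio length.
    """
    pairs = [(float(start), float(end)) for start, end in segments]
    if not pairs:
        return {"num_segments": 0, "mean_duration": 0.0, "tokens_per_second": 0.0}

    durations = [end - start for start, end in pairs]
    if audio_duration is None:
        # Assumption: trailing silence is ignored, so the rate may be overestimated.
        audio_duration = max(end for _, end in pairs)

    rate = len(pairs) / audio_duration if audio_duration > 0 else 0.0
    return {
        "num_segments": len(pairs),
        "mean_duration": sum(durations) / len(durations),
        "tokens_per_second": rate,
    }


# Made-up boundaries for illustration; in practice pass outputs["segments"]
# obtained with in_second=True.
example_segments = [[0.10, 0.32], [0.32, 0.58], [0.70, 1.00]]
summary = summarize_segments(example_segments)
print(summary)
```

With real output, the call would be `summarize_segments(outputs["segments"])`. Passing the true audio length as `audio_duration` gives a rate that accounts for silence after the last segment.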
Training
See https://github.com/Berkeley-Speech-Group/sylber for instructions on training the model.
license: apache-2.0