Respeecher/ukrainian-data2vec-asr

This model is a fine-tuned version of Respeecher/ukrainian-data2vec, trained on the Ukrainian train split of the Common Voice 11.0 dataset. It achieves the following word error rate (WER) results:

  • eval_wer: 17.634350000973198
  • test_wer: 17.042283338786351
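
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the figures above are percentages; the function name `word_error_rate` is illustrative and this is not necessarily the exact evaluation code used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against four reference words
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a lower value means fewer word-level mistakes per reference word.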

How to Get Started with the Model

from transformers import AutoProcessor, Data2VecAudioForCTC
import torch
from datasets import load_dataset, Audio

dataset = load_dataset("mozilla-foundation/common_voice_11_0", "uk", split="test")
# Resample the audio to 16 kHz
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))

processor = AutoProcessor.from_pretrained("Respeecher/ukrainian-data2vec-asr")
model = Data2VecAudioForCTC.from_pretrained("Respeecher/ukrainian-data2vec-asr")
model.eval()

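# Read back the sampling rate set by the cast above
sampling_rate = dataset.features["audio"].sampling_rate
inputs = processor(dataset[1]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Greedy decoding: take the most likely token id at each frame
predicted_ids = torch.argmax(logits, dim=-1)

# Convert the per-frame token ids into text
transcription = processor.batch_decode(predicted_ids)
print(transcription[0])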
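
The argmax-then-decode step above is greedy CTC decoding. As a rough illustration of what the decoding does with the per-frame ids, here is a simplified pure-Python sketch: consecutive repeated ids are merged, then blank tokens are dropped. The vocabulary, the blank id of 0, and the function name `ctc_greedy_decode` are assumptions made for this example; the real tokenizer also handles details such as word delimiters, which this sketch leaves out.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0):
    """Collapse repeated frame ids, then remove blanks and join the tokens."""
    # Merge runs of identical consecutive ids into a single id
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Drop the blank token, which only separates repeated characters
    return "".join(id_to_token[i] for i in collapsed if i != blank_id)


# Hypothetical toy vocabulary; id 0 is assumed to be the CTC blank
vocab = {0: "<pad>", 1: "т", 2: "а", 3: "к"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3]
print(ctc_greedy_decode(frames, vocab))  # так
```

A blank between two identical ids keeps them as separate characters, which is how CTC represents doubled letters.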

Training Details

Training code and instructions are available on our GitHub.
