---
license: apache-2.0
datasets:
- ivrit-ai/crowd-transcribe-v4
language:
- he
- en
base_model: openai/whisper-large-v2
pipeline_tag: automatic-speech-recognition
---
This is ivrit.ai's faster-whisper model, based on the ivrit-ai/whisper-v2-d4 Whisper model.
Training data includes 250 hours of volunteer-transcribed speech from the ivrit-ai/crowd-transcribe-v4 dataset, as well as 100 hours of professionally transcribed speech from other sources.
Release date: September 8th, 2024.
# Prerequisites
pip3 install faster_whisper
# Usage
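```
import faster_whisper

# Load the model from the Hugging Face Hub (downloaded on first use).
model = faster_whisper.WhisperModel('ivrit-ai/faster-whisper-v2-d4')

# Replace 'media-file' with the path to your audio or video file.
# transcribe() returns a lazy generator of segments plus an info object.
segs, _ = model.transcribe('media-file', language='he')

# Iterating over the generator is what actually runs the transcription.
texts = [s.text for s in segs]
transcribed_text = ' '.join(texts)
print(f'Transcribed text: {transcribed_text}')
```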
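Each segment returned by `transcribe` also carries `start` and `end` times in seconds, which the example above discards. The following is a minimal sketch of keeping them, not part of the official usage. The helper names `format_segments` and `transcribe_with_timestamps` are hypothetical and introduced here only for illustration.

```python
def format_segments(segments):
    """Render segment-like objects as '[start -> end] text' lines.

    Each item is assumed to expose .start, .end (seconds) and .text,
    as faster-whisper segments do.
    """
    lines = []
    for seg in segments:
        lines.append(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text.strip()}")
    return lines


def transcribe_with_timestamps(media_path):
    """Hypothetical wrapper: transcribe a file and return timestamped lines."""
    # Imported lazily so format_segments can be used without the package.
    import faster_whisper

    model = faster_whisper.WhisperModel('ivrit-ai/faster-whisper-v2-d4')
    segs, _ = model.transcribe(media_path, language='he')
    # format_segments consumes the lazy generator, which runs the transcription.
    return format_segments(segs)
```

Calling `print('\n'.join(transcribe_with_timestamps('media-file')))` would print one timestamped line per segment, which can be a starting point for subtitle-style output.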