# Faster Whisper Transcription Service

## Overview
This project uses the `faster_whisper` Python package to provide an API endpoint for audio transcription. It utilizes OpenAI's Whisper model (large-v3) for accurate and efficient speech-to-text conversion. The service is designed to be deployed on Hugging Face endpoints.
## Features
- **Efficient Transcription**: Utilizes the large-v3 Whisper model for high-quality transcription.
- **Multilingual Support**: Supports transcription in various languages, with the default language set to German (`de`).
- **Segmented Output**: Returns transcribed text with a segment ID and timestamps for each transcribed segment.
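
The segmented output can be sketched as follows. The field names mirror `faster_whisper`'s `Segment` attributes (`id`, `start`, `end`, `text`); the exact response shape of this service is an assumption, not confirmed by the source:

```python
# Illustrative sketch: builds the segmented-output structure from
# faster_whisper-style segments. Field names are assumptions.
def format_segments(segments):
    """Convert transcription segments into a list of dicts with
    segment IDs, start/end timestamps, and cleaned text."""
    return [
        {
            "id": seg["id"],
            "start": seg["start"],
            "end": seg["end"],
            "text": seg["text"].strip(),
        }
        for seg in segments
    ]

# Hypothetical example segment as faster_whisper might produce it
example = [{"id": 0, "start": 0.0, "end": 2.4, "text": " Hallo Welt"}]
print(format_segments(example))
```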
## Usage
```python
import os

import requests

# Request payload: base64-encoded audio and the desired language for transcription
DATA = {
    "inputs": "<base64_encoded_audio>",
    "language": "de",
    "task": "transcribe"
}

HF_ACCESS_TOKEN = os.environ.get("HF_TRANSCRIPTION_ACCESS_TOKEN")
API_URL = os.environ.get("HF_TRANSCRIPTION_ENDPOINT")

HEADERS = {
    "Authorization": f"Bearer {HF_ACCESS_TOKEN}",
    "Content-Type": "application/json"
}

response = requests.post(API_URL, headers=HEADERS, json=DATA)
print(response.json())
```
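
The `inputs` field expects base64-encoded audio. A minimal sketch of preparing it from a local file (the filename is illustrative):

```python
import base64


def encode_audio(path: str) -> str:
    """Read an audio file and return its contents as a base64 string
    suitable for the "inputs" field of the request payload."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


# Example (hypothetical file): DATA["inputs"] = encode_audio("sample.wav")
```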
## Logging
Logging is set to debug level, providing detailed information during the transcription process: the length of the decoded audio bytes, the progress of each segment being transcribed, and a confirmation once inference is complete.
## Deployment
This service is intended for deployment on Hugging Face endpoints. Ensure you follow Hugging Face's guidelines for deploying model endpoints.