Cool-Whisper
Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data
Liang-Hsuan Tseng, Zih-Ching Chen, Wei-Shun Chang, Cheng-Kuang Lee, Tsung-Ren Huang, Hung-yi Lee
⚠️ Due to privacy and security concerns, this model will be temporarily taken offline. We are sorry for the inconvenience.
Introduction
- Cool-Whisper is a distilled version of Whisper that focuses on Mandarin-English code-switching ASR for people in Taiwan.
- The model is trained on 60,000 hours of unlabeled audio.
- In practice, the distillation draws on knowledge from both the large model (Whisper-large-v2) and the small model (Whisper-base).
Basic Usage
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
from datasets import load_dataset

# Use the GPU with half precision when available; otherwise fall back to CPU with float32.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "andybi7676/cool-whisper-hf"
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, use_safetensors=True
)
processor = AutoProcessor.from_pretrained(model_id)

# Build an ASR pipeline that also returns segment-level timestamps.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=256,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

dataset = load_dataset("andybi7676/ntuml2021_long", "default", split="test")
sample = dataset[0]["audio"]
# or your own audio path
# sample = "/your/path/to/audio.wav"

result = pipe(sample)
print("Basic Result: ")
print(result["text"])

# result with timestamps
print("\nResult with timestamps: ")
for chunk in result["chunks"]:
    print(chunk)
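The loop above prints each chunk as a raw dictionary. As a minimal sketch of making that output easier to read, the helpers below turn chunks into one timestamped line each. It assumes each chunk has the usual pipeline shape, a `timestamp` pair of `(start, end)` in seconds plus a `text` string, and that `end` may be `None` for a final chunk. The names `format_timestamp` and `format_chunks` and the example chunks are illustrative, not part of this model's code or real model output.

```python
def format_timestamp(seconds):
    """Render seconds as HH:MM:SS.mmm; None (an open-ended chunk) gets a placeholder."""
    if seconds is None:
        return "??:??:??.???"
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_chunks(chunks):
    """Turn pipeline-style chunks into '[start -> end] text' lines."""
    lines = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()
        lines.append(f"[{format_timestamp(start)} -> {format_timestamp(end)}] {text}")
    return lines


# Hypothetical chunks standing in for result["chunks"]; not real model output.
example_chunks = [
    {"timestamp": (0.0, 3.5), "text": " 今天我們來講 machine learning"},
    {"timestamp": (3.5, None), "text": " 的 basic concept"},
]
for line in format_chunks(example_chunks):
    print(line)
```

With a real transcription, `format_chunks(result["chunks"])` would be used in place of the example list.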
Faster-Whisper Support
Faster-Whisper is a widely used tool, built on CTranslate2, for speeding up transcription. We also provide our model in CTranslate2 format so that it can be used with faster-whisper. Please visit cool-whisper for more details.
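As a rough sketch of what running a CTranslate2 model with faster-whisper can look like, the snippet below uses that library's `WhisperModel` class and its `transcribe` method. The model directory and audio path are placeholders, and the function names `segment_to_line` and `transcribe_with_faster_whisper` are made up for illustration. The `device` and `compute_type` values are only one possible choice. The linked page remains the reference for the actual setup.

```python
def segment_to_line(start, end, text):
    """Format one transcribed segment as '[start -> end] text' (times in seconds)."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_with_faster_whisper(model_dir, audio_path, device="cpu", compute_type="int8"):
    """Transcribe audio_path with a CTranslate2 Whisper model stored in model_dir."""
    # Imported lazily so segment_to_line can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_dir, device=device, compute_type=compute_type)
    segments, info = model.transcribe(audio_path, beam_size=5)
    # `segments` is a generator: decoding runs as it is iterated.
    return [segment_to_line(s.start, s.end, s.text) for s in segments]


# Usage (both paths are placeholders, not real locations):
# for line in transcribe_with_faster_whisper(
#     "/your/path/to/ctranslate2-model", "/your/path/to/audio.wav"
# ):
#     print(line)
```

On a GPU, `device="cuda"` with `compute_type="float16"` is a common combination.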