Cool-Whisper

Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data

Liang-Hsuan Tseng, Zih-Ching Chen, Wei-Shun Chang, Cheng-Kuang Lee, Tsung-Ren Huang, Hung-yi Lee

⚠️ Due to privacy and security concerns, this model will be temporarily taken offline. We are sorry for the inconvenience.


Introduction

Cool-Whisper is a distilled version of Whisper that targets Mandarin-English code-switching ASR for speakers in Taiwan.
The model is trained on 60,000 hours of unlabeled audio.
In practice, the distillation draws on knowledge from both the large model (Whisper-large-v2) and the small model (Whisper-base).

Basic Usage

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
from datasets import load_dataset

# Use the GPU with half precision when available; otherwise fall back to CPU and float32.
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "andybi7676/cool-whisper-hf"

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, use_safetensors=True
)
processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=256,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

dataset = load_dataset("andybi7676/ntuml2021_long", "default", split="test")
sample = dataset[0]["audio"]
# or your own audio path
# sample = "/your/path/to/audio.wav"

result = pipe(sample)
print("Basic result:")
print(result["text"])
# Each chunk is a dict with a "timestamp" (start, end) tuple and its "text".
print("\nResult with timestamps:")
for chunk in result["chunks"]:
    print(chunk)
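
The timestamped chunks printed above are raw dictionaries. As a minimal sketch, assuming each chunk has the shape the transformers ASR pipeline returns with return_timestamps=True (a "timestamp" tuple of start and end seconds, plus "text"), they could be rendered as readable lines like this. The helper names format_timestamp and format_chunks are hypothetical and not part of the model or of transformers. The end time of the last chunk can be None, so the sketch handles that case explicitly.

```python
from typing import Dict, List, Optional


def format_timestamp(seconds: Optional[float]) -> str:
    """Render seconds as HH:MM:SS.mmm; a missing boundary becomes a placeholder."""
    if seconds is None:
        return "??:??:??.???"
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_chunks(chunks: List[Dict]) -> List[str]:
    """Turn pipeline chunks into '[start -> end] text' lines (hypothetical helper)."""
    lines = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()
        lines.append(f"[{format_timestamp(start)} -> {format_timestamp(end)}] {text}")
    return lines


if __name__ == "__main__":
    # Hand-written stand-in for result["chunks"]; not real model output.
    demo_chunks = [
        {"timestamp": (0.0, 2.5), "text": " hello"},
        {"timestamp": (2.5, None), "text": " world"},
    ]
    for line in format_chunks(demo_chunks):
        print(line)
```

With a real result from the pipeline, the call would be format_chunks(result["chunks"]).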

Faster-Whisper Support

Faster-Whisper is a widely used reimplementation of Whisper built on CTranslate2 that speeds up transcription. We also provide our model in CTranslate2 format so that it can be used with faster-whisper. Please visit cool-whisper for more details.
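
For orientation, here is a rough sketch of how a CTranslate2-format checkpoint is typically loaded with the faster-whisper package. It is not taken from the cool-whisper repository: the model_path and audio_path arguments are placeholders you would need to fill in yourself, and the helper names collect_segments and transcribe_with_faster_whisper are hypothetical. Only WhisperModel and its transcribe method come from faster-whisper.

```python
from typing import Iterable, List, Tuple


def collect_segments(segments: Iterable) -> List[Tuple[float, float, str]]:
    """Materialize segments as (start, end, text) tuples.

    faster-whisper yields segments lazily, so transcription only runs
    while this loop consumes the generator.
    """
    return [(seg.start, seg.end, seg.text.strip()) for seg in segments]


def transcribe_with_faster_whisper(
    model_path: str, audio_path: str
) -> List[Tuple[float, float, str]]:
    """Transcribe one file with a CTranslate2 Whisper model (hypothetical helper).

    model_path is an assumption: a local directory or Hub ID that holds the
    CTranslate2 conversion of the model.
    """
    # Imported lazily so this module loads even without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_path, device="auto", compute_type="default")
    segments, _info = model.transcribe(audio_path, beam_size=5)
    return collect_segments(segments)
```

A call would look like transcribe_with_faster_whisper("/your/path/to/ct2-model", "/your/path/to/audio.wav"), with both paths replaced by your own.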
