OpenAI Whisper on Axera
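C++ and Python are currently supported.
Download precompiled models
To convert the models yourself, see Model Conversion.
Currently supported model sizes:
Languages tested so far:
For apt install, pip install, and similar commands, we recommend installing Miniconda on the board to manage virtual environments. Install it as follows: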
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-aarch64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
source ~/miniconda3/bin/activate
conda init --all
Install the Whisper dependencies:
cd python
conda create -n whisper python=3.12
conda activate whisper
pip3 install -r requirements.txt
Install the NPU Python API by following https://github.com/AXERA-TECH/pyaxengine
Tested with 0.1.3rc2, which can be installed with
pip install https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc2/axengine-0.1.3-py3-none-any.whl
To use a different release, change the version number in the URL.
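Before running the demo, it can help to confirm that the wheel actually landed in the active conda environment. The sketch below uses only the Python standard library. It assumes the distribution is named `axengine`, as the wheel filename suggests, and the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "axengine" is an assumption taken from the wheel filename
# axengine-0.1.3-py3-none-any.whl; adjust it if the package is named differently.
version = installed_version("axengine")
if version is None:
    print("axengine is not installed in this environment")
else:
    print(f"axengine {version} is installed")
```

If the check reports that the package is missing, make sure the `whisper` environment is activated before installing.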
After logging in to the development board, run:
cd python
conda activate whisper
python3 whisper_cli.py --model_type small --model_path ../models-ax650 --wav ../demo.wav --language zh
Output:
(whisper) root@ax650:/mnt/data/Github/whisper.axera/python# python whisper_cli.py -t tiny -w ../demo.wav
[INFO] Available providers: ['AxEngineExecutionProvider']
{'wav': '../demo.wav', 'model_type': 'tiny', 'model_path': '../models-ax650', 'language': 'zh', 'task': 'transcribe'}
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Chip type: ChipType.MC50
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Engine version: 2.12.0s
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.0 76f70fdc
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.0 76f70fdc
ASR result:
擅职出现交易几乎停止的情况
RTF: 0.11406774537746188
Command-line arguments:
| Argument | Description | Default |
|---|---|---|
| --wav | Input audio file | |
| --model_type/-t | Model type: tiny/base/small | |
| --model_path/-p | Directory containing the models | ../models |
| --language/-l | Language to recognize | zh |
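For readers wrapping the script, the options in the table map naturally onto `argparse`. The following is a minimal sketch of such a parser, not the project's actual code: the `-w` short flag is taken from the sample commands, while marking `--wav` and `--model_type` as required is an assumption based on their empty defaults. The function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options from the arguments table (illustrative only)."""
    parser = argparse.ArgumentParser(description="Whisper CLI sketch")
    # Assumption: arguments without a documented default are required.
    parser.add_argument("--wav", "-w", required=True, help="input audio file")
    parser.add_argument("--model_type", "-t", required=True,
                        choices=["tiny", "base", "small"], help="model type")
    parser.add_argument("--model_path", "-p", default="../models",
                        help="directory containing the models")
    parser.add_argument("--language", "-l", default="zh",
                        help="language to recognize")
    return parser


# Parse the same short flags used in the sample command.
args = build_parser().parse_args(["-w", "../demo.wav", "-t", "tiny"])
print(vars(args))
```

Restricting `--model_type` with `choices` makes a typo fail at parse time rather than when the model files are loaded.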
(whisper) root@ax650:/mnt/data/Github/whisper.axera/python# python whisper_svr.py
[INFO] Available providers: ['AxEngineExecutionProvider']
Server started at http://0.0.0.0:8000
Test the server:
python test_svr.py
Run on an AX650N device:
cd cpp
./whisper_cli -w ../demo.wav -t tiny
or
cd cpp
./whisper_cli --model_type small -w ../demo.wav
Output:
(whisper) root@ax650:/mnt/data/HF/Whisper/cpp/ax650# ./whisper_cli -w ../../demo.wav -t tiny
wav_file: ../../demo.wav
model_path: ../../models-ax650
model_type: tiny
language: zh
Init whisper success, take 0.3540seconds
Result: 甚至出现交易几乎停止的情况
RTF: 0.0968
cd cpp/ax650
./whisper_srv --model_type tiny --language zh --port 8080
Test from the command line with curl (replace the IP and port with your own):
ffmpeg -i demo.wav -f f32le -c:a pcm_f32le - 2>/dev/null | \
curl -X POST 10.126.33.192:8080/asr \
-H "Content-Type: application/octet-stream" \
--data-binary @-
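The same request can be sent from Python without ffmpeg or curl. The sketch below mirrors the pipeline above under a few assumptions: the input is a 16-bit PCM WAV file, samples are scaled by 1/32768 to get float32 as ffmpeg does, and the response body is UTF-8 text. The helper names `wav_to_f32le` and `post_asr` are hypothetical; only the `/asr` path and the `Content-Type` header come from the curl command.

```python
import array
import sys
import urllib.request
import wave


def wav_to_f32le(path: str) -> bytes:
    """Read a 16-bit PCM WAV file and return float32 little-endian sample bytes."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    floats = array.array("f", (s / 32768.0 for s in samples))
    if sys.byteorder == "big":
        floats.byteswap()  # emit little-endian bytes, matching f32le
    return floats.tobytes()


def post_asr(url: str, pcm: bytes, timeout: float = 60.0) -> str:
    """POST raw PCM bytes to the server and return the response body as text."""
    request = urllib.request.Request(
        url,
        data=pcm,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the server answers with UTF-8 text.
        return response.read().decode("utf-8", errors="replace")
```

With the server running, `post_asr("http://10.126.33.192:8080/asr", wav_to_f32le("demo.wav"))` would send the audio; replace the address with your own. Like the ffmpeg command, this sketch does not resample or downmix, so the sample rate and channel count of the file are passed through unchanged.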
RTF: Real-Time Factor
C++:
| Models | AX650N | AX630C |
|---|---|---|
| Whisper-Tiny | 0.08 | |
| Whisper-Base | 0.11 | 0.35 |
| Whisper-Small | 0.24 | |
| Whisper-Turbo | 0.48 | |
Python:
| Models | AX650N | AX630C |
|---|---|---|
| Whisper-Tiny | 0.12 | |
| Whisper-Base | 0.16 | 0.35 |
| Whisper-Small | 0.50 | |
| Whisper-Turbo | 0.60 | |
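The real-time factor is conventionally the processing time divided by the duration of the audio, so a value below 1.0 means faster than real time. The sketch below shows one way to measure it for any transcription callable. It is not the project's own timing code, and whether the reported figures include model loading is not stated here. The names `real_time_factor`, `wav_duration_seconds`, and `measure_rtf` are hypothetical.

```python
import time
import wave
from typing import Callable


def wav_duration_seconds(path: str) -> float:
    """Duration of a WAV file, computed from frame count and sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe: Callable[[str], object], wav_path: str) -> float:
    """Time one call to `transcribe` and relate it to the audio length."""
    start = time.perf_counter()
    transcribe(wav_path)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, wav_duration_seconds(wav_path))
```

For example, an RTF of 0.12 corresponds to roughly 1.2 seconds of processing for a 10-second clip.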
Word Error Rate (WER):
| Models | AX650N | AX630C |
|---|---|---|
| Whisper-Tiny | 0.24 | |
| Whisper-Base | 0.18 | |
| Whisper-Small | 0.11 | |
| Whisper-Turbo | 0.06 | |
To reproduce the test results, follow these steps:
Unzip the dataset:
unzip datasets.zip
Run the test script:
cd python
conda activate whisper
python test_wer.py -d aishell --gt_path ../datasets/ground_truth.txt --model_type tiny
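An error rate of this kind is usually the edit distance between the recognized text and the reference, divided by the reference length. The sketch below illustrates that calculation at character level, which is common for Chinese. It is an assumption about the general method, not a description of what `test_wer.py` does, and the function names are hypothetical. The example strings are the two transcripts shown earlier, with the C++ result treated as the reference purely for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    previous = list(range(len(hyp) + 1))
    for i, ref_token in enumerate(ref, start=1):
        current = [i]
        for j, hyp_token in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_token != hyp_token)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def error_rate(reference: str, hypothesis: str, by_character: bool = True) -> float:
    """Edit distance divided by reference length, over characters or words."""
    if by_character:
        ref = list(reference.replace(" ", ""))
        hyp = list(hypothesis.replace(" ", ""))
    else:
        ref = reference.split()
        hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Two substituted characters out of 13 in the reference.
rate = error_rate("甚至出现交易几乎停止的情况", "擅职出现交易几乎停止的情况")
print(f"{rate:.4f}")  # prints 0.1538
```

Passing `by_character=False` splits on whitespace instead, which suits languages written with spaces between words.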
Python:
| Models | CMM(MB) | OS(MB) |
|---|---|---|
| Whisper-Tiny | 332 | 512 |
| Whisper-Base | 533 | 644 |
| Whisper-Small | 1106 | 906 |
| Whisper-Turbo | 2065 | 2084 |
C++:
| Models | CMM(MB) | OS(MB) |
|---|---|---|
| Whisper-Tiny | 332 | 31 |
| Whisper-Base | 533 | 54 |
| Whisper-Small | 1106 | 146 |
| Whisper-Turbo | 2065 | 86 |