Add ONNX and ORT models with quantization
- .gitattributes +4 -0
- README.md +85 -0
- README_ja.md +85 -0
- onnx_models/model.onnx +3 -0
- onnx_models/model_fp16.onnx +3 -0
- onnx_models/model_int8.onnx +3 -0
- onnx_models/model_opt.onnx +3 -0
- onnx_models/model_uint8.onnx +3 -0
- ort_models/model.ort +3 -0
- ort_models/model_fp16.ort +3 -0
- ort_models/model_int8.ort +3 -0
- ort_models/model_uint8.ort +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ort_models/model.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_fp16.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_int8.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_uint8.ort filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: apache-2.0
+tags:
+- onnx
+- ort
+---
+
+# ONNX and ORT models with quantization of [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased)
+
+[日本語READMEはこちら](README_ja.md)
+
+This repository contains the ONNX and ORT formats of the model [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased), along with quantized versions.
+
+## License
+The license for this model is "apache-2.0". For details, please refer to the original model page: [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased).
+
+## Usage
+To use this model, install ONNX Runtime and perform inference as shown below.
+```python
+# Example code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-multilingual-uncased')
+
+# Prepare inputs
+text = 'Replace this text with your input.'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the model paths
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+
+# Run inference with each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Display the results (add further processing if needed)
+    print(outputs)
+```
+
+## Contents of the Model
+This repository includes the following models:
+
+### ONNX Models
+- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT Models
+- `ort_models/model.ort`: ORT model using the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
+
+## Notes
+Please adhere to the license and usage conditions of the original model [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased).
+
+## Contribution
+If you find any issues or have improvements, please create an issue or submit a pull request.
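The README lists INT8 and UINT8 variants but the commit does not say how they were produced. As a rough sketch only, one common route is ONNX Runtime's dynamic quantization (`onnxruntime.quantization.quantize_dynamic`). The helper names `variant_path` and `quantize_variants` are hypothetical, and quantizing from `model.onnx` rather than `model_opt.onnx` is an assumption.

```python
from pathlib import Path


def variant_path(src, suffix):
    """Map onnx_models/model.onnx + 'int8' to onnx_models/model_int8.onnx."""
    p = Path(src)
    return p.with_name(f"{p.stem}_{suffix}{p.suffix}")


def quantize_variants(src):
    """Write INT8 and UINT8 dynamically quantized copies next to `src`.

    Assumption: dynamic (weight-only) quantization; the commit may have
    used a different method or different options.
    """
    # Imported here so the path helper works without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    written = []
    for suffix, weight_type in (("int8", QuantType.QInt8), ("uint8", QuantType.QUInt8)):
        dst = variant_path(src, suffix)
        quantize_dynamic(str(src), str(dst), weight_type=weight_type)
        written.append(dst)
    return written


# Only the naming is exercised here; call quantize_variants(...) on a real file.
print(variant_path("onnx_models/model.onnx", "int8").as_posix())
```

The `.ort` files would typically come from ONNX Runtime's own converter, run as `python -m onnxruntime.tools.convert_onnx_models_to_ort` on an `.onnx` file or directory. Whether this commit used it is not stated.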
README_ja.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: apache-2.0
+tags:
+- onnx
+- ort
+---
+
+# ONNX and ORT models, with quantized versions, of [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased)
+
+[Click here for the English README](README.md)
+
+This repository contains the original model [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased) converted to the ONNX and ORT formats and further quantized.
+
+## License
+The license for this model is "apache-2.0". For details, see the original model page ([google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased)).
+
+## Usage
+To use this model, install ONNX Runtime and run inference as follows.
+```python
+# Example code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-multilingual-uncased')
+
+# Prepare the input
+text = 'ここに入力テキストを置き換えてください。'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the paths of the models to use
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+
+# Run inference for each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model's file extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Display the results (add further processing as needed)
+    print(outputs)
+```
+
+## Contents of the Model
+This repository includes the following models.
+
+### ONNX Models
+- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT Models
+- `ort_models/model.ort`: ORT model using the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
+
+## Notes
+Please comply with the license and terms of use of the original model [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased).
+
+## Contribution
+If you find any problems or possible improvements, please open an issue or send a pull request.
onnx_models/model.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fac8d5cd770838f46bc892c526b48a17516b63d12beb96ec2491a52559f1447e
+size 669642950
onnx_models/model_fp16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:69cb78fefd7729ae594d109c49f3895fd94645ce7555d8dcbd915bddb445978f
+size 334966016
onnx_models/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a70a78dbba52f64fe58f03877a7ff24183dc7c26e08ab11bae2264827436af86
+size 168068040
onnx_models/model_opt.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7e7191eb0b6d28c0483eeebb442c3f1b775cd67b2959efcff8f738cfc5424814
+size 669617067
onnx_models/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:64851d477ce2d2678b67c4109685e417a3ab32d7a63a36805f6b9df374998a69
+size 168068079
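The `size` fields in the ONNX pointers above show how much each variant shrinks the file. A small sketch of that arithmetic, with the byte counts copied from this commit and `model.onnx` taken as the baseline (the choice of baseline is an assumption made for illustration):

```python
# Sizes in bytes, copied from the Git LFS pointers in this commit.
sizes = {
    "model.onnx": 669_642_950,
    "model_opt.onnx": 669_617_067,
    "model_fp16.onnx": 334_966_016,
    "model_int8.onnx": 168_068_040,
    "model_uint8.onnx": 168_068_079,
}

baseline = sizes["model.onnx"]
ratios = {name: size / baseline for name, size in sizes.items()}

for name, ratio in ratios.items():
    print(f"{name:18s} {sizes[name] / 1e6:8.1f} MB  {ratio:.2f}x of model.onnx")
```

The FP16 file comes out at roughly half the baseline and the 8-bit files at roughly a quarter, which is what storing weights in 2 bytes or 1 byte instead of 4 would suggest.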
ort_models/model.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2dc1403bc786af7b7677bba7be8b36dfc1b1c3d30d748a46cff3a68a61a83163
+size 669811672
ort_models/model_fp16.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:47ee4165006198c2bdf517dbcde1731fc5a4cce74c496f82ca5266b38df95fac
+size 335514560
ort_models/model_int8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c7c61ea03e4214ee15a5e14a49f48664e3d0b1050daf7ebb47d1fca4ccd3115b
+size 168232928
ort_models/model_uint8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2611320f17db4d78d6499c09fa749d302a66fa89b6c48afd5f9d3d1549681d20
+size 168232928
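Each model file in this commit is stored as a Git LFS pointer: a spec version line, a SHA-256 object ID, and a byte size. The following is a minimal sketch, using only the standard library, of parsing such a pointer and checking a downloaded file against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of Git LFS or any library, and the self-check runs on a small stand-in file rather than a real model.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large model files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for onnx_models/model.onnx, copied from this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:fac8d5cd770838f46bc892c526b48a17516b63d12beb96ec2491a52559f1447e\n"
    "size 669642950\n"
)

# Self-check on a small temporary file (a stand-in, not a real model).
payload = b"stand-in bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_pointer = {
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
demo_ok = matches_pointer(tmp.name, demo_pointer)
os.unlink(tmp.name)
print(pointer["size"], demo_ok)
```

To check a real download, pass the local path of the model file together with the parsed pointer, for example `matches_pointer('onnx_models/model.onnx', pointer)`.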