Add ONNX and ORT models with quantization
- .gitattributes +4 -0
- README.md +85 -0
- README_ja.md +85 -0
- onnx_models/model.onnx +3 -0
- onnx_models/model_fp16.onnx +3 -0
- onnx_models/model_int8.onnx +3 -0
- onnx_models/model_opt.onnx +3 -0
- onnx_models/model_uint8.onnx +3 -0
- ort_models/model.ort +3 -0
- ort_models/model_fp16.ort +3 -0
- ort_models/model_int8.ort +3 -0
- ort_models/model_uint8.ort +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ort_models/model.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_fp16.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_int8.ort filter=lfs diff=lfs merge=lfs -text
+ort_models/model_uint8.ort filter=lfs diff=lfs merge=lfs -text
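The four added entries route each `.ort` file through Git LFS by exact path. As a rough illustration of how such entries select files, the sketch below parses `.gitattributes` text and checks whether a path falls under an LFS rule. The names `GITATTRIBUTES`, `lfs_patterns`, and `is_lfs_tracked` are hypothetical, and `fnmatch` only approximates Git's own pattern rules (it does not reproduce `**` semantics exactly), so this is an aid for reading the diff rather than a substitute for `git check-attr`.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A few entries copied from the .gitattributes hunk in this commit.
GITATTRIBUTES = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
ort_models/model.ort filter=lfs diff=lfs merge=lfs -text
ort_models/model_fp16.ort filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate check: does any LFS pattern match this POSIX-style path?"""
    name = PurePosixPath(path).name
    for pattern in patterns:
        # Patterns without a slash are matched against the file name only.
        target = path if "/" in pattern else name
        if fnmatch(target, pattern):
            return True
    return False


if __name__ == "__main__":
    patterns = lfs_patterns(GITATTRIBUTES)
    for candidate in ("ort_models/model.ort", "archive/data.zip", "README.md"):
        print(candidate, is_lfs_tracked(candidate, patterns))
```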
README.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: apache-2.0
+tags:
+- onnx
+- ort
+---
+
+# ONNX and ORT models with quantization of [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased)
+
+[Japanese README](README_ja.md)
+
+This repository contains the ONNX and ORT formats of the model [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased), along with quantized versions.
+
+## License
+The license for this model is "apache-2.0". For details, please refer to the original model page: [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased).
+
+## Usage
+To use this model, install ONNX Runtime and perform inference as shown below.
+```python
+# Example code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-large-uncased')
+
+# Prepare inputs
+text = 'Replace this text with your input.'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the model paths
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+
+# Run inference with each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Display the results (add further processing if needed)
+    print(outputs)
+```
+
+## Contents of the Model
+This repository includes the following models:
+
+### ONNX Models
+- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT Models
+- `ort_models/model.ort`: ORT model using the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model using the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model using the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model using the UINT8 quantized model
+
+## Notes
+Please adhere to the license and usage conditions of the original model [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased).
+
+## Contribution
+If you find any issues or have improvements, please create an issue or submit a pull request.
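The file names listed under "Contents of the Model" follow a pattern with one irregularity: the optimized model is `model_opt.onnx` in ONNX form but plain `model.ort` in ORT form, and there is no ORT counterpart of the unoptimized export. A small lookup table can hide that detail when switching between variants. The sketch below is one possible way to do it; `MODEL_FILES`, `model_path`, and `load_session` are hypothetical names, and `load_session` assumes `onnxruntime` is installed and the repository files have been downloaded under `root`.

```python
from pathlib import Path

# (format, variant) -> file path, taken from the file list in this commit.
MODEL_FILES = {
    ("onnx", "base"): "onnx_models/model.onnx",
    ("onnx", "opt"): "onnx_models/model_opt.onnx",
    ("onnx", "fp16"): "onnx_models/model_fp16.onnx",
    ("onnx", "int8"): "onnx_models/model_int8.onnx",
    ("onnx", "uint8"): "onnx_models/model_uint8.onnx",
    ("ort", "opt"): "ort_models/model.ort",
    ("ort", "fp16"): "ort_models/model_fp16.ort",
    ("ort", "int8"): "ort_models/model_int8.ort",
    ("ort", "uint8"): "ort_models/model_uint8.ort",
}


def model_path(fmt, variant, root="."):
    """Return the path of the requested model file below `root`."""
    try:
        relative = MODEL_FILES[(fmt, variant)]
    except KeyError:
        raise ValueError(f"no {fmt!r} model for variant {variant!r}") from None
    return Path(root) / relative


def load_session(fmt, variant, root="."):
    """Create a CPU inference session for the requested model file."""
    # Imported lazily so that model_path() works without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        str(model_path(fmt, variant, root)),
        providers=["CPUExecutionProvider"],
    )
```

With this in place, `load_session("ort", "int8")` would take the place of the extension check in the README example, always naming the CPU provider explicitly.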
README_ja.md
ADDED
@@ -0,0 +1,85 @@
+---
+license: apache-2.0
+tags:
+- onnx
+- ort
+---
+
+# ONNX and ORT models and quantized models of [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased)
+
+[Click here for the English README](README.md)
+
+This repository contains the original model [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased) converted to the ONNX and ORT formats and further quantized.
+
+## License
+The license for this model is "apache-2.0". For details, see the original model page ([google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased)).
+
+## Usage
+To use this model, install ONNX Runtime and run inference as follows.
+```python
+# Sample code
+import onnxruntime as ort
+import numpy as np
+from transformers import AutoTokenizer
+import os
+
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-large-uncased')
+
+# Prepare the input
+text = 'Replace this with your input text.'
+inputs = tokenizer(text, return_tensors='np')
+
+# Specify the paths of the models to use
+# Test both the ONNX model and the ORT model
+model_paths = [
+    'onnx_models/model_opt.onnx',  # ONNX model
+    'ort_models/model.ort'  # ORT format model
+]
+
+# Run inference with each model
+for model_path in model_paths:
+    print(f'\n===== Using model: {model_path} =====')
+    # Get the model's file extension
+    model_extension = os.path.splitext(model_path)[1]
+
+    # Load the model
+    if model_extension == '.ort':
+        # Load the ORT format model
+        session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
+    else:
+        # Load the ONNX model
+        session = ort.InferenceSession(model_path)
+
+    # Run inference
+    outputs = session.run(None, dict(inputs))
+
+    # Display the output shapes
+    for idx, output in enumerate(outputs):
+        print(f'Output {idx} shape: {output.shape}')
+
+    # Display the results (add processing as needed)
+    print(outputs)
+```
+
+## Model contents
+This repository contains the following models.
+
+### ONNX models
+- `onnx_models/model.onnx`: Original ONNX model converted from [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased)
+- `onnx_models/model_opt.onnx`: Optimized ONNX model
+- `onnx_models/model_fp16.onnx`: FP16 quantized model
+- `onnx_models/model_int8.onnx`: INT8 quantized model
+- `onnx_models/model_uint8.onnx`: UINT8 quantized model
+
+### ORT models
+- `ort_models/model.ort`: ORT model built from the optimized ONNX model
+- `ort_models/model_fp16.ort`: ORT model built from the FP16 quantized model
+- `ort_models/model_int8.ort`: ORT model built from the INT8 quantized model
+- `ort_models/model_uint8.ort`: ORT model built from the UINT8 quantized model
+
+## Notes
+Please comply with the license and terms of use of the original model [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased).
+
+## Contributing
+If you find problems or have improvements, please open an issue or send a pull request.
onnx_models/model.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:138dcb7d1d4fd4e42c44ed758593cc49cdf2d1fcebce36dbca401eadcbf15324
+size 1340995544
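The three added lines are a Git LFS pointer rather than the model itself: `oid` carries the SHA-256 digest of the real file and `size` its length in bytes. One way to confirm that a downloaded copy matches the pointer is sketched below. The helper names `parse_lfs_pointer`, `digest_and_size`, and `file_matches_pointer` are hypothetical, and the sketch assumes a well-formed pointer with one `key value` pair per line.

```python
import hashlib

# Pointer text copied from the onnx_models/model.onnx entry in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:138dcb7d1d4fd4e42c44ed758593cc49cdf2d1fcebce36dbca401eadcbf15324
size 1340995544
"""


def parse_lfs_pointer(text):
    """Split a pointer file into hash algorithm, hex digest, and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {"algorithm": algorithm, "digest": digest, "size": int(fields["size"])}


def digest_and_size(stream, algorithm="sha256", chunk_size=1 << 20):
    """Hash a binary stream in chunks so large models never sit fully in memory."""
    hasher = hashlib.new(algorithm)
    size = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
        size += len(chunk)
    return hasher.hexdigest(), size


def file_matches_pointer(path, pointer):
    """Return True when the file at `path` has the pointer's digest and size."""
    with open(path, "rb") as stream:
        digest, size = digest_and_size(stream, pointer["algorithm"])
    return digest == pointer["digest"] and size == pointer["size"]
```

A call such as `file_matches_pointer("onnx_models/model.onnx", parse_lfs_pointer(POINTER))` would then flag a truncated or corrupted download before the file is handed to ONNX Runtime.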
onnx_models/model_fp16.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c36d9735e7cacc60eef8dd409999fe8a67c3d87a7828bae6156145774f70f01
+size 670783496
onnx_models/model_int8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95449f414e35459b1de4926a27ba2a4b93fa6cbcfc079e6f68f63615475b9bf2
+size 336791929
onnx_models/model_opt.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0ae5405b93deb751870d94bf7563d2bd8a88768c0cf1a0f58b0d6c66b83ee31e
+size 1340944121
onnx_models/model_uint8.onnx
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:466fddb06dc82487552fe47279680e8f73edbcc0e13f71fe03617086f9e670f1
+size 336791999
ort_models/model.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8382162b1314c87fbf6552b7ddccc57e6a1d215a1735ae7d4abb866edfc5a4fb
+size 1341319064
ort_models/model_fp16.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:01adc4a16fe4e401158a712c7178ef9fd37d9694c2261d07b1417be6e9ac7a3b
+size 671862968
ort_models/model_int8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:89cf1ce0f93261133eb09da2fb957eab7f8bfd9732033e8a123de5038aad177f
+size 337105440
ort_models/model_uint8.ort
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8729fd7b2d70061ed3a6c6b3bbbea2d3e9af87ac7468ca1b70835f14aaed0b6b
+size 337105440
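The `size` fields in the nine pointers give a quick check on what each variant saves on disk. The sketch below uses only the byte counts listed in this commit and reports each file's size relative to the unoptimized ONNX export; `SIZES` and `relative_sizes` are hypothetical names, and the numbers say nothing about accuracy or inference speed.

```python
# Byte counts copied from the Git LFS pointers in this commit.
SIZES = {
    "onnx_models/model.onnx": 1340995544,
    "onnx_models/model_opt.onnx": 1340944121,
    "onnx_models/model_fp16.onnx": 670783496,
    "onnx_models/model_int8.onnx": 336791929,
    "onnx_models/model_uint8.onnx": 336791999,
    "ort_models/model.ort": 1341319064,
    "ort_models/model_fp16.ort": 671862968,
    "ort_models/model_int8.ort": 337105440,
    "ort_models/model_uint8.ort": 337105440,
}


def relative_sizes(sizes, baseline="onnx_models/model.onnx"):
    """Return each file's size as a fraction of the baseline file's size."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


if __name__ == "__main__":
    for name, ratio in relative_sizes(SIZES).items():
        print(f"{name:32s} {SIZES[name] / 1e6:9.1f} MB  {ratio:6.1%}")
```

On these figures the FP16 files come out at roughly half the baseline and the INT8 and UINT8 files at roughly a quarter, which is in line with storing weights in 2 bytes and 1 byte instead of 4.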