whitphx (HF Staff) committed
Commit b84c2ea · verified · 1 Parent(s): d8b9bb8

Add/update the quantized ONNX model files and README.md for Transformers.js v3


## Applied Quantizations

### ❌ Based on `model.onnx` *with* slimming

```
None
```
↳ ❌ `int8`: `model_int8.onnx` (added, but the JS-based E2E test failed; see the sketch after this list)
```
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/encoder/layer.0/attention/self/key_conv_attn_layer/depthwise/Conv_quant'
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
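
For reference, a minimal sketch of what the failing E2E load amounts to (assuming a local copy of `model_int8.onnx`; the error is thrown during session creation, before any inference runs):

```javascript
import * as ort from 'onnxruntime-node';

// Session creation alone reproduces the error above: onnxruntime-node's
// CPU execution provider finds no ConvInteger implementation for the
// quantized depthwise Conv in ConvBERT's attention block.
const session = await ort.InferenceSession.create('onnx/model_int8.onnx');
```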

### ❌ Based on `model.onnx` *without* slimming

```
None
```
↳ ❌ `int8`: `model_int8.onnx` (added but JS-based E2E test failed)
```
/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25
__classPrivateFieldGet(this, _OnnxruntimeSessionHandler_inferenceSession, "f").loadModel(pathOrBuffer, options);
^

Error: Could not find an implementation for ConvInteger(10) node with name '/encoder/layer.0/attention/self/key_conv_attn_layer/depthwise/Conv_quant'
    at new OnnxruntimeSessionHandler (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:25:92)
    at Immediate.<anonymous> (/home/ubuntu/src/tjsmigration/node_modules/.pnpm/[email protected]/node_modules/onnxruntime-node/dist/backend.js:67:29)
    at process.processImmediate (node:internal/timers:485:21)

Node.js v22.16.0
```
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
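
In Transformers.js v3, the variants that passed can be selected at load time via the `dtype` option; a minimal sketch (the mapping from `dtype` value to `model_*.onnx` file follows the suffix convention used above):

```javascript
import { pipeline } from '@huggingface/transformers';

// dtype selects the quantized variant by file suffix, e.g.
// 'uint8' -> onnx/model_uint8.onnx, 'q4'    -> onnx/model_q4.onnx,
// 'q4f16' -> onnx/model_q4f16.onnx, 'bnb4' -> onnx/model_bnb4.onnx.
const extractor = await pipeline('feature-extraction', 'Xenova/conv-bert-base', {
  dtype: 'uint8',
});
```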

README.md CHANGED
@@ -7,15 +7,15 @@ https://huggingface.co/YituTech/conv-bert-base with ONNX weights to be compatibl
 
 ## Usage (Transformers.js)
 
-If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@xenova/transformers) using:
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
 ```bash
-npm i @xenova/transformers
+npm i @huggingface/transformers
 ```
 
 **Example:** Feature extraction w/ `Xenova/conv-bert-base`.
 
 ```javascript
-import { pipeline } from '@xenova/transformers';
+import { pipeline } from '@huggingface/transformers';
 
 // Create feature extraction pipeline
 const extractor = await pipeline('feature-extraction', 'Xenova/conv-bert-base');
@@ -33,5 +33,4 @@ console.log(output)
 
 ---
 
-
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
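
For completeness, the updated snippet supports the usual feature-extraction options; a short usage sketch (option names as documented for Transformers.js pipelines):

```javascript
import { pipeline } from '@huggingface/transformers';

// Mean-pool and L2-normalize token embeddings to get one vector per input.
const extractor = await pipeline('feature-extraction', 'Xenova/conv-bert-base');
const output = await extractor('Hello world.', { pooling: 'mean', normalize: true });
console.log(output.dims); // e.g. [1, 768] for ConvBERT-base
```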
onnx/model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1081ce96b4653bd902703dff77eee001bccf5fcd599e1d5aedcb6c80b79bc78c
+size 154677945

onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:28bfb5b8e60722357260e650cc29454d15c92c13319b8b9e2ba51a5b22ae1934
+size 159558789

onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f52319f607f7e2f12aa0d835e18e4037eb8c3888e74089cf8502d83aa7fb972a
+size 99527486

onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4be0dd8fa8125aecd598ebb133716f26dfa083ef090d1366c1626a8748de1a20
+size 106625276
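
The four diffs above are Git LFS pointer files rather than the weights themselves; the `oid sha256:` and `size` fields describe the real payload. A minimal Node.js sketch for verifying a downloaded file against its pointer (path and expected values taken from `model_uint8.onnx` above):

```javascript
import { createHash } from 'node:crypto';
import { readFile } from 'node:fs/promises';

// Compare the downloaded file's byte length and SHA-256 digest with the
// `size` and `oid sha256:` fields of its LFS pointer.
const buf = await readFile('onnx/model_uint8.onnx');
const digest = createHash('sha256').update(buf).digest('hex');
console.log(buf.length === 106625276);
console.log(digest === '4be0dd8fa8125aecd598ebb133716f26dfa083ef090d1366c1626a8748de1a20');
```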