qaihm-bot committed
Commit d47bc7b · verified · 1 Parent(s): 8821240

Upload README.md with huggingface_hub

Files changed (1): README.md (+12 -84)
README.md CHANGED
@@ -25,16 +25,19 @@ More details on model performance across various devices can be found
 - **Model Type:** Super resolution
 - **Model Stats:**
- - Model checkpoint: quicksrnet_small_4x_checkpoint_int8
- - Input resolution: 128x128
- - Number of parameters: 33.3K
- - Model size: 42.5 KB
+ - Model checkpoint: quicksrnet_small_3x_checkpoint
+ - Input resolution: 640x360
+ - Number of parameters: 27.2K
+ - Model size: 34.9 KB

 | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Precision | Primary Compute Unit | Target Model
 | ---|---|---|---|---|---|---|---|
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 0.95 ms | 0 - 2 MB | INT8 | NPU | [QuickSRNetSmall-Quantized.tflite](https://huggingface.co/qualcomm/QuickSRNetSmall-Quantized/blob/main/QuickSRNetSmall-Quantized.tflite)
- | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 0.668 ms | 0 - 2 MB | INT8 | NPU | [QuickSRNetSmall-Quantized.so](https://huggingface.co/qualcomm/QuickSRNetSmall-Quantized/blob/main/QuickSRNetSmall-Quantized.so)
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | TFLite | 0.974 ms | 0 - 3 MB | INT8 | NPU | [QuickSRNetSmall-Quantized.tflite](https://huggingface.co/qualcomm/QuickSRNetSmall-Quantized/blob/main/QuickSRNetSmall-Quantized.tflite)
+ | Samsung Galaxy S23 Ultra (Android 13) | Snapdragon® 8 Gen 2 | QNN Model Library | 0.671 ms | 0 - 3 MB | INT8 | NPU | [QuickSRNetSmall-Quantized.so](https://huggingface.co/qualcomm/QuickSRNetSmall-Quantized/blob/main/QuickSRNetSmall-Quantized.so)

  ## Installation
@@ -96,89 +99,14 @@ python -m qai_hub_models.models.quicksrnetsmall_quantized.export
 Profile Job summary of QuickSRNetSmall-Quantized
 --------------------------------------------------
 Device: Snapdragon X Elite CRD (11)
- Estimated Inference Time: 0.74 ms
- Estimated Peak Memory Range: 0.05-0.05 MB
+ Estimated Inference Time: 0.72 ms
+ Estimated Peak Memory Range: 1.03-1.03 MB
 Compute Units: NPU (8) | Total (8)


 ```
- ## How does this work?
-
- This [export script](https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/QuickSRNetSmall-Quantized/export.py)
- leverages [Qualcomm® AI Hub](https://aihub.qualcomm.com/) to optimize, validate, and deploy this model
- on-device. Let's go through each step below in detail:
-
- Step 1: **Compile model for on-device deployment**
-
- To compile a PyTorch model for on-device deployment, we first trace the model
- in memory using `jit.trace` and then call the `submit_compile_job` API.
-
- ```python
- import torch
-
- import qai_hub as hub
- from qai_hub_models.models.quicksrnetsmall_quantized import Model
-
- # Load the model
- torch_model = Model.from_pretrained()
- torch_model.eval()
-
- # Device
- device = hub.Device("Samsung Galaxy S23")
-
- # Trace model
- input_shape = torch_model.get_input_spec()
- sample_inputs = torch_model.sample_inputs()
-
- pt_model = torch.jit.trace(torch_model, [torch.tensor(data[0]) for _, data in sample_inputs.items()])
-
- # Compile model on a specific device
- compile_job = hub.submit_compile_job(
-     model=pt_model,
-     device=device,
-     input_specs=input_shape,
- )
-
- # Get target model to run on-device
- target_model = compile_job.get_target_model()
-
- ```
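If you also want the compiled asset saved locally, a minimal sketch is shown below. It assumes the `qai_hub` `Model` object returned by `get_target_model()` exposes a `download()` helper; check the AI Hub client documentation for the exact call in your version.

```python
# Hypothetical sketch: save the compiled, device-ready model to disk.
# `download()` is an assumed helper on the qai_hub Model object; verify the
# exact method name against your installed qai_hub client.
target_model.download("QuickSRNetSmall-Quantized.tflite")
```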
-
-
- Step 2: **Performance profiling on cloud-hosted device**
-
- After compiling models from step 1, models can be profiled on-device using the
- `target_model`. Note that this script runs the model on a device automatically
- provisioned in the cloud. Once the job is submitted, you can navigate to the
- provided job URL to view a variety of on-device performance metrics.
- ```python
- profile_job = hub.submit_profile_job(
-     model=target_model,
-     device=device,
- )
-
- ```
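Once the job completes, the same metrics shown at the job URL can also be pulled down programmatically. The sketch below assumes the `qai_hub` client exposes `ProfileJob.download_profile()`; check the client documentation for the exact method name and return type.

```python
# Assumed API: download the raw profiling results for the submitted job.
# The returned data is expected to contain per-layer timings and compute-unit
# placement; verify against your installed qai_hub version.
profile_results = profile_job.download_profile()
print(profile_results)
```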
-
- Step 3: **Verify on-device accuracy**
-
- To verify the accuracy of the model on-device, you can run on-device inference
- on sample input data on the same cloud-hosted device.
- ```python
- input_data = torch_model.sample_inputs()
- inference_job = hub.submit_inference_job(
-     model=target_model,
-     device=device,
-     inputs=input_data,
- )
-
- on_device_output = inference_job.download_output_data()
-
- ```
- With the output of the model, you can compute metrics like PSNR and relative
- error, or spot-check the output against the expected output.
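For example, continuing from the snippets above, a minimal PSNR check against the source PyTorch model could look like the sketch below; the output name `output_0` and the [0, 1] output range are assumptions to adjust for your model.

```python
import numpy as np

# Reference output from the source PyTorch model on the same sample inputs.
reference = torch_model(
    *[torch.tensor(data[0]) for _, data in input_data.items()]
).detach().numpy()

# On-device output: download_output_data() returns a dict keyed by output name.
# "output_0" is an assumed key; inspect on_device_output.keys() for your model.
on_device = np.asarray(on_device_output["output_0"][0], dtype=np.float32)

# Peak signal-to-noise ratio between the on-device and reference outputs,
# assuming both are normalized to [0, 1].
mse = np.mean((on_device - reference.astype(np.float32)) ** 2)
psnr = 10.0 * np.log10(1.0 / mse) if mse > 0 else float("inf")
print(f"PSNR: {psnr:.2f} dB")
```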

- **Note**: This on-device profiling and inference requires access to Qualcomm®
- AI Hub. [Sign up for access](https://myaccount.qualcomm.com/signup).


  ## Run demo on a cloud-hosted device
@@ -217,7 +145,7 @@ Explore all available models on [Qualcomm® AI Hub](https://aihub.qualcomm.com/)
 ## License
 - The license for the original implementation of QuickSRNetSmall-Quantized can be found
 [here](https://github.com/quic/aimet-model-zoo/blob/develop/LICENSE.pdf).
- - The license for the compiled assets for on-device deployment can be found [here]({deploy_license_url})
+ - The license for the compiled assets for on-device deployment can be found [here](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/Qualcomm+AI+Hub+Proprietary+License.pdf)

 ## References
 * [QuickSRNet: Plain Single-Image Super-Resolution Architecture for Faster Inference on Mobile Platforms](https://arxiv.org/abs/2303.04336)