| &&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_l_fp32.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_l_fp32.onnx.best.engine | |
| [12/28/2023-12:58:38] [I] === Model Options === | |
| [12/28/2023-12:58:38] [I] Format: ONNX | |
| [12/28/2023-12:58:38] [I] Model: yolo_nas_pose_l_fp32.onnx | |
| [12/28/2023-12:58:38] [I] Output: | |
| [12/28/2023-12:58:38] [I] === Build Options === | |
| [12/28/2023-12:58:38] [I] Max batch: explicit batch | |
| [12/28/2023-12:58:38] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default | |
| [12/28/2023-12:58:38] [I] minTiming: 1 | |
| [12/28/2023-12:58:38] [I] avgTiming: 8 | |
| [12/28/2023-12:58:38] [I] Precision: FP32+FP16+INT8 | |
| [12/28/2023-12:58:38] [I] LayerPrecisions: | |
| [12/28/2023-12:58:38] [I] Calibration: Dynamic | |
| [12/28/2023-12:58:38] [I] Refit: Disabled | |
| [12/28/2023-12:58:38] [I] Sparsity: Disabled | |
| [12/28/2023-12:58:38] [I] Safe mode: Disabled | |
| [12/28/2023-12:58:38] [I] DirectIO mode: Disabled | |
| [12/28/2023-12:58:38] [I] Restricted mode: Disabled | |
| [12/28/2023-12:58:38] [I] Build only: Disabled | |
| [12/28/2023-12:58:38] [I] Save engine: yolo_nas_pose_l_fp32.onnx.best.engine | |
| [12/28/2023-12:58:38] [I] Load engine: | |
| [12/28/2023-12:58:38] [I] Profiling verbosity: 0 | |
| [12/28/2023-12:58:38] [I] Tactic sources: Using default tactic sources | |
| [12/28/2023-12:58:38] [I] timingCacheMode: local | |
| [12/28/2023-12:58:38] [I] timingCacheFile: | |
| [12/28/2023-12:58:38] [I] Heuristic: Disabled | |
| [12/28/2023-12:58:38] [I] Preview Features: Use default preview flags. | |
| [12/28/2023-12:58:38] [I] Input(s)s format: fp32:CHW | |
| [12/28/2023-12:58:38] [I] Output(s)s format: fp32:CHW | |
| [12/28/2023-12:58:38] [I] Input build shapes: model | |
| [12/28/2023-12:58:38] [I] Input calibration shapes: model | |
| [12/28/2023-12:58:38] [I] === System Options === | |
| [12/28/2023-12:58:38] [I] Device: 0 | |
| [12/28/2023-12:58:38] [I] DLACore: | |
| [12/28/2023-12:58:38] [I] Plugins: | |
| [12/28/2023-12:58:38] [I] === Inference Options === | |
| [12/28/2023-12:58:38] [I] Batch: Explicit | |
| [12/28/2023-12:58:38] [I] Input inference shapes: model | |
| [12/28/2023-12:58:38] [I] Iterations: 10 | |
| [12/28/2023-12:58:38] [I] Duration: 15s (+ 200ms warm up) | |
| [12/28/2023-12:58:38] [I] Sleep time: 0ms | |
| [12/28/2023-12:58:38] [I] Idle time: 0ms | |
| [12/28/2023-12:58:38] [I] Streams: 1 | |
| [12/28/2023-12:58:38] [I] ExposeDMA: Disabled | |
| [12/28/2023-12:58:38] [I] Data transfers: Enabled | |
| [12/28/2023-12:58:38] [I] Spin-wait: Disabled | |
| [12/28/2023-12:58:38] [I] Multithreading: Disabled | |
| [12/28/2023-12:58:38] [I] CUDA Graph: Disabled | |
| [12/28/2023-12:58:38] [I] Separate profiling: Disabled | |
| [12/28/2023-12:58:38] [I] Time Deserialize: Disabled | |
| [12/28/2023-12:58:38] [I] Time Refit: Disabled | |
| [12/28/2023-12:58:38] [I] NVTX verbosity: 0 | |
| [12/28/2023-12:58:38] [I] Persistent Cache Ratio: 0 | |
| [12/28/2023-12:58:38] [I] Inputs: | |
| [12/28/2023-12:58:38] [I] === Reporting Options === | |
| [12/28/2023-12:58:38] [I] Verbose: Disabled | |
| [12/28/2023-12:58:38] [I] Averages: 100 inferences | |
| [12/28/2023-12:58:38] [I] Percentiles: 90,95,99 | |
| [12/28/2023-12:58:38] [I] Dump refittable layers:Disabled | |
| [12/28/2023-12:58:38] [I] Dump output: Disabled | |
| [12/28/2023-12:58:38] [I] Profile: Disabled | |
| [12/28/2023-12:58:38] [I] Export timing to JSON file: | |
| [12/28/2023-12:58:38] [I] Export output to JSON file: | |
| [12/28/2023-12:58:38] [I] Export profile to JSON file: | |
| [12/28/2023-12:58:38] [I] | |
| [12/28/2023-12:58:38] [I] === Device Information === | |
| [12/28/2023-12:58:38] [I] Selected Device: Orin | |
| [12/28/2023-12:58:38] [I] Compute Capability: 8.7 | |
| [12/28/2023-12:58:38] [I] SMs: 8 | |
| [12/28/2023-12:58:38] [I] Compute Clock Rate: 0.624 GHz | |
| [12/28/2023-12:58:38] [I] Device Global Memory: 7471 MiB | |
| [12/28/2023-12:58:38] [I] Shared Memory per SM: 164 KiB | |
| [12/28/2023-12:58:38] [I] Memory Bus Width: 128 bits (ECC disabled) | |
| [12/28/2023-12:58:38] [I] Memory Clock Rate: 0.624 GHz | |
| [12/28/2023-12:58:38] [I] | |
| [12/28/2023-12:58:38] [I] TensorRT version: 8.5.2 | |
| [12/28/2023-12:58:43] [I] [TRT] [MemUsageChange] Init CUDA: CPU +220, GPU +0, now: CPU 249, GPU 3010 (MiB) | |
| [12/28/2023-12:58:48] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +302, GPU +283, now: CPU 574, GPU 3313 (MiB) | |
| [12/28/2023-12:58:48] [I] Start parsing network model | |
| [12/28/2023-12:58:51] [I] [TRT] ---------------------------------------------------------------- | |
| [12/28/2023-12:58:51] [I] [TRT] Input filename: yolo_nas_pose_l_fp32.onnx | |
| [12/28/2023-12:58:51] [I] [TRT] ONNX IR version: 0.0.8 | |
| [12/28/2023-12:58:51] [I] [TRT] Opset version: 17 | |
| [12/28/2023-12:58:51] [I] [TRT] Producer name: pytorch | |
| [12/28/2023-12:58:51] [I] [TRT] Producer version: 2.1.2 | |
| [12/28/2023-12:58:51] [I] [TRT] Domain: | |
| [12/28/2023-12:58:51] [I] [TRT] Model version: 0 | |
| [12/28/2023-12:58:51] [I] [TRT] Doc string: | |
| [12/28/2023-12:58:51] [I] [TRT] ---------------------------------------------------------------- | |
| [12/28/2023-12:58:51] [I] Finish parsing network model | |
| [12/28/2023-12:58:52] [I] [TRT] ---------- Layers Running on DLA ---------- | |
| [12/28/2023-12:58:52] [I] [TRT] ---------- Layers Running on GPU ---------- | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation1] | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/pre_process/pre_process.0/Cast.../pre_process/pre_process.2/Mul]} | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 455) [Constant] | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 456) [Constant] | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 457) [Constant] | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stem/conv/rbr_reparam/Conv + /model/backbone/stem/conv/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage1/downsample/rbr_reparam/Conv + /model/backbone/stage1/downsample/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage1/blocks/conv2/conv/Conv + /model/backbone/stage1/blocks/conv2/act/Relu || /model/backbone/stage1/blocks/conv1/conv/Conv + /model/backbone/stage1/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 15) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 23) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/conv1/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage1/blocks/conv3/conv/Conv + /model/backbone/stage1/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/reduce_skip2/conv/Conv + /model/neck/neck2/reduce_skip2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/downsample/rbr_reparam/Conv + /model/backbone/stage2/downsample/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/downsample/conv/Conv + /model/neck/neck2/downsample/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/conv2/conv/Conv + /model/backbone/stage2/blocks/conv2/act/Relu || /model/backbone/stage2/blocks/conv1/conv/Conv + /model/backbone/stage2/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 44) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 52) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.2.alpha + (Unnamed Layer* 60) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/conv1/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage2/blocks/conv3/conv/Conv + /model/backbone/stage2/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/reduce_skip2/conv/Conv + /model/neck/neck1/reduce_skip2/act/Relu || /model/neck/neck2/reduce_skip1/conv/Conv + /model/neck/neck2/reduce_skip1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/downsample/rbr_reparam/Conv + /model/backbone/stage3/downsample/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/downsample/conv/Conv + /model/neck/neck1/downsample/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/conv2/conv/Conv + /model/backbone/stage3/blocks/conv2/act/Relu || /model/backbone/stage3/blocks/conv1/conv/Conv + /model/backbone/stage3/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 83) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 91) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.2.alpha + (Unnamed Layer* 99) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.3.alpha + (Unnamed Layer* 107) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv2/rbr_reparam/Conv + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.4.alpha + (Unnamed Layer* 115) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/conv1/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage3/blocks/conv3/conv/Conv + /model/backbone/stage3/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/reduce_skip1/conv/Conv + /model/neck/neck1/reduce_skip1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage4/downsample/rbr_reparam/Conv + /model/backbone/stage4/downsample/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage4/blocks/conv2/conv/Conv + /model/backbone/stage4/blocks/conv2/act/Relu || /model/backbone/stage4/blocks/conv1/conv/Conv + /model/backbone/stage4/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 134) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 142) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/conv1/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Add_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/stage4/blocks/conv3/conv/Conv + /model/backbone/stage4/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/context_module/cv1/conv/Conv + /model/backbone/context_module/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.2/MaxPool | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.1/MaxPool | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.0/MaxPool | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/backbone/context_module/cv1/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/backbone/context_module/cv2/conv/Conv + /model/backbone/context_module/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/conv/conv/Conv + /model/neck/neck1/conv/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] DECONVOLUTION: /model/neck/neck1/upsample/ConvTranspose | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/reduce_after_concat/conv/Conv + /model/neck/neck1/reduce_after_concat/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/conv2/conv/Conv + /model/neck/neck1/blocks/conv2/act/Relu || /model/neck/neck1/blocks/conv1/conv/Conv + /model/neck/neck1/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 171) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 179) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.2/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.2/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.2.alpha + (Unnamed Layer* 187) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.2/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.2/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.3/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/Conv + /model/neck/neck1/blocks/bottlenecks/bottlenecks.3/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.3.alpha + (Unnamed Layer* 195) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.3/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.3/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck1/blocks/conv3/conv/Conv + /model/neck/neck1/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/conv/conv/Conv + /model/neck/neck2/conv/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] DECONVOLUTION: /model/neck/neck2/upsample/ConvTranspose | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/reduce_skip1/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/reduce_after_concat/conv/Conv + /model/neck/neck2/reduce_after_concat/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/conv2/conv/Conv + /model/neck/neck2/blocks/conv2/act/Relu || /model/neck/neck2/blocks/conv1/conv/Conv + /model/neck/neck2/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 216) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 224) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.2/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.2/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.2.alpha + (Unnamed Layer* 232) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.2/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.2/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.3/cv1/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/Conv + /model/neck/neck2/blocks/bottlenecks/bottlenecks.3/cv2/nonlinearity/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.3.alpha + (Unnamed Layer* 240) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.3/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.3/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck2/blocks/conv3/conv/Conv + /model/neck/neck2/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head1/bbox_stem/seq/conv/Conv + /model/heads/head1/bbox_stem/seq/act/Relu || /model/heads/head1/pose_stem/seq/conv/Conv + /model/heads/head1/pose_stem/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/conv/conv/Conv + /model/neck/neck3/conv/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head1/reg_convs/reg_convs.0/seq/conv/Conv + /model/heads/head1/reg_convs/reg_convs.0/seq/act/Relu || /model/heads/head1/cls_convs/cls_convs.0/seq/conv/Conv + /model/heads/head1/cls_convs/cls_convs.0/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head1/pose_convs/pose_convs.0/seq/conv/Conv + /model/heads/head1/pose_convs/pose_convs.0/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/conv2/conv/Conv + /model/neck/neck3/blocks/conv2/act/Relu || /model/neck/neck3/blocks/conv1/conv/Conv + /model/neck/neck3/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head1/cls_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head1/reg_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head1/pose_convs/pose_convs.1/seq/conv/Conv + /model/heads/head1/pose_convs/pose_convs.1/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape + /model/heads/Transpose | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head1/pose_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 271) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 294) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.2/cv1/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.2/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.2/cv2/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.2/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.2.alpha + (Unnamed Layer* 302) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.2/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.2/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.3/cv1/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.3/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/bottlenecks/bottlenecks.3/cv2/conv/Conv + /model/neck/neck3/blocks/bottlenecks/bottlenecks.3/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.3.alpha + (Unnamed Layer* 310) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.3/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.3/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck3/blocks/conv3/conv/Conv + /model/neck/neck3/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head2/pose_stem/seq/conv/Conv + /model/heads/head2/pose_stem/seq/act/Relu || /model/heads/head2/bbox_stem/seq/conv/Conv + /model/heads/head2/bbox_stem/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/conv/conv/Conv + /model/neck/neck4/conv/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head2/reg_convs/reg_convs.0/seq/conv/Conv + /model/heads/head2/reg_convs/reg_convs.0/seq/act/Relu || /model/heads/head2/cls_convs/cls_convs.0/seq/conv/Conv + /model/heads/head2/cls_convs/cls_convs.0/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head2/pose_convs/pose_convs.0/seq/conv/Conv + /model/heads/head2/pose_convs/pose_convs.0/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/conv2/conv/Conv + /model/neck/neck4/blocks/conv2/act/Relu || /model/neck/neck4/blocks/conv1/conv/Conv + /model/neck/neck4/blocks/conv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head2/cls_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head2/reg_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head2/pose_convs/pose_convs.1/seq/conv/Conv + /model/heads/head2/pose_convs/pose_convs.1/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_4 + /model/heads/Transpose_3 | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head2/pose_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_1 | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 341) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_1 | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 364) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.2/cv1/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.2/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.2/cv2/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.2/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.2.alpha + (Unnamed Layer* 372) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.2/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.2/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.3/cv1/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.3/cv1/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/bottlenecks/bottlenecks.3/cv2/conv/Conv + /model/neck/neck4/blocks/bottlenecks/bottlenecks.3/cv2/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.3.alpha + (Unnamed Layer* 380) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.3/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.3/Add) | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/conv2/act/Relu_output_0 copy | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/neck/neck4/blocks/conv3/conv/Conv + /model/neck/neck4/blocks/conv3/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/bbox_stem/seq/conv/Conv + /model/heads/head3/bbox_stem/seq/act/Relu || /model/heads/head3/pose_stem/seq/conv/Conv + /model/heads/head3/pose_stem/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/reg_convs/reg_convs.0/seq/conv/Conv + /model/heads/head3/reg_convs/reg_convs.0/seq/act/Relu || /model/heads/head3/cls_convs/cls_convs.0/seq/conv/Conv + /model/heads/head3/cls_convs/cls_convs.0/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/pose_convs/pose_convs.0/seq/conv/Conv + /model/heads/head3/pose_convs/pose_convs.0/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/cls_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/reg_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/pose_convs/pose_convs.1/seq/conv/Conv + /model/heads/head3/pose_convs/pose_convs.1/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_8 + /model/heads/Transpose_6 | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/pose_convs/pose_convs.2/seq/conv/Conv + /model/heads/head3/pose_convs/pose_convs.2/seq/act/Relu | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_2 | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/head3/pose_pred/Conv | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_2 | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice_1.../post_process/Reshape_2]} | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] NMS: batched_nms_26 | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] DEVICE_TO_SHAPE_HOST: (Unnamed Layer* 459) [NMS]_1_output[DevicetoShapeHostCopy] | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation2] | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice...graph2_/Concat_5]} | |
| [12/28/2023-12:58:52] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation3] | |
| [12/28/2023-12:59:03] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +534, GPU +411, now: CPU 1351, GPU 3918 (MiB) | |
| [12/28/2023-12:59:05] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +80, now: CPU 1433, GPU 3998 (MiB) | |
| [12/28/2023-12:59:05] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored. | |
| [12/28/2023-15:09:20] [I] [TRT] Total Activation Memory: 7964877312 | |
| [12/28/2023-15:09:20] [I] [TRT] Detected 1 inputs and 1 output network tensors. | |
| [12/28/2023-15:09:38] [I] [TRT] Total Host Persistent Memory: 331680 | |
| [12/28/2023-15:09:38] [I] [TRT] Total Device Persistent Memory: 38912 | |
| [12/28/2023-15:09:38] [I] [TRT] Total Scratch Memory: 134217728 | |
| [12/28/2023-15:09:38] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 88 MiB, GPU 2110 MiB | |
| [12/28/2023-15:09:38] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 176 steps to complete. | |
| [12/28/2023-15:09:38] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 114.49ms to assign 14 blocks to 176 nodes requiring 147384320 bytes. | |
| [12/28/2023-15:09:38] [I] [TRT] Total Activation Memory: 147384320 | |
| [12/28/2023-15:09:47] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU -15, now: CPU 1838, GPU 5747 (MiB) | |
| [12/28/2023-15:09:47] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +53, GPU +64, now: CPU 53, GPU 64 (MiB) | |
| [12/28/2023-15:09:48] [I] Engine built in 7870.12 sec. | |
| [12/28/2023-15:09:48] [I] [TRT] Loaded engine size: 54 MiB | |
| [12/28/2023-15:09:48] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1299, GPU 5509 (MiB) | |
| [12/28/2023-15:09:48] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +52, now: CPU 0, GPU 52 (MiB) | |
| [12/28/2023-15:09:48] [I] Engine deserialized in 0.136755 sec. | |
| [12/28/2023-15:09:48] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1300, GPU 5509 (MiB) | |
| [12/28/2023-15:09:48] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +140, now: CPU 0, GPU 192 (MiB) | |
| [12/28/2023-15:09:48] [I] Setting persistentCacheLimit to 0 bytes. | |
| [12/28/2023-15:09:48] [I] Using random values for input onnx::Cast_0 | |
| [12/28/2023-15:09:48] [I] Created input binding for onnx::Cast_0 with dimensions 1x3x640x640 | |
| [12/28/2023-15:09:48] [I] Using random values for output graph2_flat_predictions | |
| [12/28/2023-15:09:48] [I] Created output binding for graph2_flat_predictions with dimensions -1x57 | |
| [12/28/2023-15:09:48] [I] Starting inference | |
| [12/28/2023-15:10:04] [I] Warmup completed 2 queries over 200 ms | |
| [12/28/2023-15:10:04] [I] Timing trace has 712 queries over 15.0201 s | |
| [12/28/2023-15:10:04] [I] | |
| [12/28/2023-15:10:04] [I] === Trace details === | |
| [12/28/2023-15:10:04] [I] Trace averages of 100 runs: | |
| [12/28/2023-15:10:04] [I] Average on 100 runs - GPU latency: 21.1141 ms - Host latency: 21.2281 ms (enqueue 21.1848 ms) | |
| [12/28/2023-15:10:04] [I] Average on 100 runs - GPU latency: 21.2938 ms - Host latency: 21.4086 ms (enqueue 21.3535 ms) | |
| [12/28/2023-15:10:04] [I] Average on 100 runs - GPU latency: 20.5876 ms - Host latency: 20.6987 ms (enqueue 20.679 ms) | |
| [12/28/2023-15:10:04] [I] Average on 100 runs - GPU latency: 20.9284 ms - Host latency: 21.0399 ms (enqueue 20.9968 ms) | |
| [12/28/2023-15:10:04] [I] Average on 100 runs - GPU latency: 21.3846 ms - Host latency: 21.5023 ms (enqueue 21.4432 ms) | |
| [12/28/2023-15:10:04] [I] Average on 100 runs - GPU latency: 20.5315 ms - Host latency: 20.6422 ms (enqueue 20.6192 ms) | |
| [12/28/2023-15:10:04] [I] Average on 100 runs - GPU latency: 20.7566 ms - Host latency: 20.8657 ms (enqueue 20.8177 ms) | |
| [12/28/2023-15:10:04] [I] | |
| [12/28/2023-15:10:04] [I] === Performance summary === | |
| [12/28/2023-15:10:04] [I] Throughput: 47.4032 qps | |
| [12/28/2023-15:10:04] [I] Latency: min = 19.6377 ms, max = 32.405 ms, mean = 21.0632 ms, median = 20.583 ms, percentile(90%) = 21.897 ms, percentile(95%) = 23.0127 ms, percentile(99%) = 29.6182 ms | |
| [12/28/2023-15:10:04] [I] Enqueue Time: min = 19.6035 ms, max = 33.8328 ms, mean = 21.0211 ms, median = 20.5366 ms, percentile(90%) = 21.8384 ms, percentile(95%) = 22.998 ms, percentile(99%) = 29.0708 ms | |
| [12/28/2023-15:10:04] [I] H2D Latency: min = 0.0800781 ms, max = 0.128906 ms, mean = 0.0964459 ms, median = 0.097168 ms, percentile(90%) = 0.0991211 ms, percentile(95%) = 0.0996094 ms, percentile(99%) = 0.110474 ms | |
| [12/28/2023-15:10:04] [I] GPU Compute Time: min = 19.5264 ms, max = 32.2937 ms, mean = 20.9506 ms, median = 20.4727 ms, percentile(90%) = 21.7739 ms, percentile(95%) = 22.8984 ms, percentile(99%) = 29.5049 ms | |
| [12/28/2023-15:10:04] [I] D2H Latency: min = 0.00341797 ms, max = 0.0615234 ms, mean = 0.0161761 ms, median = 0.0136719 ms, percentile(90%) = 0.0258789 ms, percentile(95%) = 0.0273438 ms, percentile(99%) = 0.03125 ms | |
| [12/28/2023-15:10:04] [I] Total Host Walltime: 15.0201 s | |
| [12/28/2023-15:10:04] [I] Total GPU Compute Time: 14.9168 s | |
| [12/28/2023-15:10:04] [I] Explanations of the performance metrics are printed in the verbose logs. | |
| [12/28/2023-15:10:04] [I] | |
| &&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_l_fp32.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_l_fp32.onnx.best.engine | |
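
The summary figures in the log above can be cross-checked against each other. The following is a minimal sketch, not part of trtexec: it assumes the message wording seen in this particular log (TensorRT v8502), and the `summarize` helper and its key names are hypothetical. The sample lines are copied from the log with the table delimiters stripped.

```python
import re

# Lines copied verbatim from the log above (table delimiters removed).
LOG = """\
[12/28/2023-15:09:48] [I] Engine built in 7870.12 sec.
[12/28/2023-15:10:04] [I] Timing trace has 712 queries over 15.0201 s
[12/28/2023-15:10:04] [I] Throughput: 47.4032 qps
[12/28/2023-15:10:04] [I] Total Host Walltime: 15.0201 s
[12/28/2023-15:10:04] [I] Total GPU Compute Time: 14.9168 s
"""

# Hypothetical field names mapped to patterns; the wording is assumed to
# match this log and may differ in other trtexec versions.
PATTERNS = {
    "build_s": r"Engine built in ([\d.]+) sec",
    "queries": r"Timing trace has (\d+) queries",
    "throughput_qps": r"Throughput: ([\d.]+) qps",
    "walltime_s": r"Total Host Walltime: ([\d.]+) s",
    "gpu_compute_s": r"Total GPU Compute Time: ([\d.]+) s",
}


def summarize(log: str) -> dict:
    """Extract the numeric fields in PATTERNS; missing fields are skipped."""
    out = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, log)
        if match:
            out[key] = float(match.group(1))
    return out


if __name__ == "__main__":
    s = summarize(LOG)
    # Throughput should equal the number of timed queries over host walltime.
    derived_qps = s["queries"] / s["walltime_s"]
    # Fraction of the timed window during which the GPU was computing.
    gpu_busy = s["gpu_compute_s"] / s["walltime_s"]
    print(f"reported qps: {s['throughput_qps']:.4f}, derived qps: {derived_qps:.4f}")
    print(f"GPU busy fraction: {gpu_busy:.4f}")
    print(f"build time: {s['build_s'] / 3600:.2f} h")
```

Run against the full log file instead of the `LOG` sample, the same helper gives a quick sanity check that the reported throughput agrees with the query count and walltime, and it makes the long `--best` build time easy to compare across runs.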