Ticket Name: Linux/TDA2PXEVM: TDA2P custom board - NVMe SSD - DMA Usage
Query Text:
Part Number: TDA2PXEVM
Tool/software: Linux

Hi TI,

We are having issues writing files to an SSD (Intel Optane 900P) over NVMe on our custom TDA2P board. The write speed is good, but the CPU load is around 80 percent. We would like to find out whether DMA/eDMA is being used for this transfer. The software running on the TDA2P is Linux from the SDK. Do you have any suggestions for lowering this CPU load?

Regards, Stefan.
Responses:
Hi Stefan,

Can you please provide the commands used to write to the SSD? I ask because commands such as dd can involve memory copies, which increase the load. Also, can you provide a snapshot of top taken while the writes are in progress?

DMA can be used to write to the SSD. However, with the SSD serving as an endpoint (EP), the DMA writes must be triggered by the SSD: most endpoints trigger the DMA reads/writes themselves, and the host only programs registers to trigger events such as a DMA copy.

Regards, Shravan
Hey Shravan,

We are writing to a preallocated 8 GB file (fallocate) and then simply writing to it with the write() system call.

root@dra7xx-evm:/# top
top - 12:01:23 up 6 min,  2 users,  load average: 1.50, 0.66, 0.27
Tasks: 106 total,   3 running, 103 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us, 85.4 sy,  0.0 ni, 12.1 id,  2.1 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem :  1819728 total,   133908 free,    58388 used,  1627432 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  1715276 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1214 root      20   0    1208    456    288 R  75.0  0.0   0:35.31 a.out
   75 root      20   0       0      0      0 R  70.1  0.0   0:08.41 kworker/u4:3

Regards, Stefan.
Hi Stefan,

Can you confirm that your A15 is running at 1.8 GHz? Please set the scaling governor to "performance" by running the command below.

echo "performance" > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor

Regards, Shravan
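To confirm that the governor change actually took effect, the setting can be read back from the same sysfs directory. The following is a minimal sketch in C, not part of the original thread: the helper names read_sysfs_line and governor_is_performance are hypothetical, and the policy directory is passed in as a parameter so the caller decides which policy to inspect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of a sysfs attribute into buf, without the newline.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int read_sysfs_line(const char *path, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)size, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}

/* Returns 1 if <policy_dir>/scaling_governor reads "performance",
 * 0 if it reads anything else, -1 if it cannot be read. */
int governor_is_performance(const char *policy_dir)
{
    char path[256];
    char governor[64];

    snprintf(path, sizeof path, "%s/scaling_governor", policy_dir);
    if (read_sysfs_line(path, governor, sizeof governor) != 0)
        return -1;
    return strcmp(governor, "performance") == 0;
}
```

On the board this would be called as governor_is_performance("/sys/devices/system/cpu/cpufreq/policy0"), which is the directory used in the echo command above. The same read_sysfs_line helper could also read scaling_cur_freq in that directory, which the cpufreq sysfs interface reports in kHz.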
Hi Shravan,

We have modified our .dts file so it runs at 1800 MHz. Please note the bolded part of the following command.

root@dra7xx-evm:~# omapconf show opp
OMAPCONF (rev v1.73-17-g578778b built Thu Aug 31 13:16:54 IST 2017)

HW Platform:
  Generic DRA74X (Flattened Device Tree)
  DRA76X ES1.0 GP Device (STANDARD performance (1.0GHz))
Error: I2C Read failed
Error: I2C Read failed
Error: I2C Read failed
  UNKNOWN POWER IC

SW Build Details:
  Build:
    Version: _____ _____ _ _
  Kernel:
    Version: 4.4.84
    Author: root@rtrkn096-lin
    Toolchain: gcc version 5.3.1 20160113 (Linaro GCC 5.3-2016.02)
    Type: #5 SMP PREEMPT
    Date: Thu Aug 16 14:47:18 CEST 2018

|---------------------------------------------------------------------------------------|
|                          | Temperature | Voltage | Frequency        | OPerating Point |
|---------------------------------------------------------------------------------------|
| VDD_CORE / VDD_CORE0     | 42C / 107F  | NA      |                  | NOM             |
|   L3                     |             |         | 266 MHz          |                 |
|   DMM                    |             |         | 266 MHz          |                 |
|   EMIF1                  |             |         | 266 MHz          |                 |
|   EMIF2                  |             |         | 266 MHz          |                 |
|   LP-DDR2                |             |         | 666 MHz          |                 |
|   L4                     |             |         | 266 MHz          |                 |
|   IPU1                   |             |         | (2128 MHz) (1)   |                 |
|     Cortex-M4 Cores      |             |         | (1064 MHz) (1)   |                 |
|   IPU2                   |             |         | 2128 MHz         |                 |
|     Cortex-M4 Cores      |             |         | 1064 MHz         |                 |
|   DSS                    |             |         | 192 MHz          |                 |
|   BB2D                   |             |         | (2128 MHz) (1)   |                 |
|                          |             |         |                  |                 |
| VDD_MPU / VDD_CORE1      | 43C / 109F  | NA      |                  | PLUS            |
|   MPU (CPU1 ON)          |             |         | 1800 MHz         |                 |
|                          |             |         |                  |                 |
| VDD_GPU / VDD_CORE2      | 42C / 107F  | NA      |                  | HIGH            |
|   GPU                    |             |         | 532 MHz          |                 |
|                          |             |         |                  |                 |
| VDD_DSPEVE / VDD_CORE3   | 41C / 105F  | NA      |                  | NOM             |
|   DSP1                   |             |         | 750 MHz          |                 |
|   DSP2                   |             |         | 750 MHz          |                 |
|   EVE1                   |             |         | 535 MHz          |                 |
|   EVE2                   |             |         | 535 MHz          |                 |
|                          |             |         |                  |                 |
| VDD_IVA / VDD_CORE4      | 43C / 109F  | NA      |                  | HIGH            |
|   IVA                    |             |         | 532 MHz          |                 |
|                          |             |         |                  |                 |
|---------------------------------------------------------------------------------------|

Notes:
  (1) Module is disabled, rate may not be relevant.

Regards, Stefan.
Hi Stefan,

I've had a look at the driver, and below are some observations / comments:

1. I don't think using DMA will decrease the system load. Since the SSD card acts as an endpoint, DMA is initiated from the SSD (and not the TDA2P board).
2. The load in the system could be due to the copies involved between user space and kernel space.

To avoid user-space copies, you can use the splice() system call. In your final use case, you want to write camera streams to the SSD, and the data from the camera streams is exported to Linux as a DMA-buf file descriptor (refer to Documentation/virt-mem-export.txt in the Linux kernel and /hlos/src/links/ipcIn/ipcInLink_drv.c in VSDK). Since the input is also a file, splice is a classic way to copy data between two files without copying between user space and kernel space. You can find more information here:

blog.plenz.com/.../so-you-want-to-write-to-a-file-real-fast.html

Please note that the output file (the file written to the SSD) still needs to be pre-allocated using fallocate (in fact, a comparison with and without fallocate is included in the above blog post).

Regards, Shravan
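As an illustration of the splice approach described above, here is a minimal sketch in C. It is not code from the SDK or from the linked blog post. The function names open_preallocated and splice_copy and the 64 KiB chunk size are assumptions made for the example. Because splice() requires one side of each call to be a pipe, the sketch moves data from the input descriptor into a pipe and then from the pipe into the output file. It uses posix_fallocate() for the pre-allocation step. Whether a particular DMA-buf file descriptor can be used as a splice source depends on the exporting driver and is not verified here, so the sketch is written against ordinary file descriptors.

```c
#define _GNU_SOURCE            /* needed for splice() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_BYTES (64 * 1024)   /* assumed chunk size per splice call */

/* Create (or truncate) the output file and reserve len bytes up front.
 * Returns an open descriptor, or -1 on failure. O_APPEND is deliberately
 * not used because splice() rejects append-mode targets. */
int open_preallocated(const char *path, off_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (posix_fallocate(fd, 0, len) != 0) {   /* returns an errno value */
        close(fd);
        return -1;
    }
    return fd;
}

/* Move up to len bytes from in_fd to out_fd through a pipe, so the data
 * never passes through a user-space buffer.
 * Returns the number of bytes moved, or -1 on error. */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    size_t total = 0;
    int failed = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) < 0)
        return -1;

    while (!failed && total < len) {
        size_t want = len - total;
        if (want > CHUNK_BYTES)
            want = CHUNK_BYTES;

        /* Stage 1: input descriptor -> pipe */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, want, SPLICE_F_MOVE);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            failed = 1;
            break;
        }
        if (n == 0)            /* end of input */
            break;

        /* Stage 2: drain the pipe into the output file */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m < 0 && errno == EINTR)
                continue;
            if (m <= 0) {
                failed = 1;
                break;
            }
            n -= m;
            total += (size_t)m;
        }
    }

    close(p[0]);
    close(p[1]);
    return failed ? -1 : (ssize_t)total;
}
```

A caller would open the destination once with open_preallocated(path, size) and then call splice_copy(in_fd, out_fd, size) for each block of data to be written. Comparing the "sy" figure in top against the plain write() version is the simplest way to see whether the copies were the main cost on this board.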