Ticket Name: TDA2: ARP32 refers DDR memory address where it should not.
Query Text:
Part Number: TDA2 Hi, I am working on a project that uses VCOP and ARP32 together. The code flow is as follows: Kernel 1 (VCOP / KernelC) -> Kernel 2 (ARP32 / Natural C) -> Kernel 3 (VCOP / KernelC). I need to use the output buffer of K1 as the input buffer of K2. Issue: The output buffer of K1 is in the on-chip memory of EVE (0x400402E0), but the input buffer of K2 refers to a DDR memory address (0x80000120). We expect both buffers to refer to the same on-chip memory address, and that is not happening. Questions: 1. Can you please let us know why the input buffer of K2 changed to DDR instead of referring to on-chip memory? 2. Can you suggest methods to make ARP32 read data from on-chip memory only? Regards, Surbhi
Responses:
Hi Surbhi, How are you exchanging the buffer addresses? Regards, Rishabh
Hi, //How are you exchanging the buffer addresses? I am not exchanging the buffer addresses across kernels. Actually, I want to know how the address changes from an on-chip memory address to a DDR memory address between kernels. Regards, Surbhi
Surbhi, Regarding the following: The codeflow is as follows : Kernel 1 ( VCOP / KernelC) -> Kernel 2 (ARP32 / Natural C) -> Kernel 3 ( VCOP / KernelC). I need to use the output buffer of K1 as input buffer in K2. It is important to note that VCOP can only write to the internal buffers IBUF and WBUF of EVE. It cannot access DDR/DMEM. Assuming that WBUF is always owned by VCOP, kernel 1 would be writing to IBUFL or IBUFH. Now, if kernel 2 wants to access these buffers, you first need to switch the ownership of the buffer before ARP32 can access it. I hope you have accounted for that in your design. Regarding kernel 2 using an external buffer, can you tell us how you are running kernel 2? Is it a BAM node in the graph with core type set as BAM_EVE_ARP32? Regards, Anshu
Hi, Thanks for the reply. //It is important to note that VCOP can only write to only the internal buffers IBUF, WBUF of EVE. It cannot access DDR/DMEM. Assuming that WBUF is always owned by VCOP, kernel1 would be writing to IBUFL or IBUFH. Just for clarification, K1 writes to WBUF only, and not to IBUFL/IBUFH. //Now if kernel 2 wants to access these buffer you first need to switch the ownership of the buffer before ARP32 can access it. So, your idea is that the same output buffer present in WBUF can be accessed by K2 (ARP32). Can you please let us know how to switch the ownership of the buffer before ARP32 can access it? //Now regarding kernel2 using an external buffer, can you tell how are you running kernel 2, is it a BAM node in the graph with core type set as BAM_EVE_ARP32? Yes, I use BAM_EVE_ARP32 as the core type. Regards, Surbhi
Hi, //It is important to note that VCOP can only write to only the internal buffers IBUF, WBUF of EVE. It cannot access DDR/DMEM. For clarification, Kernel 1, which runs on VCOP, writes its output data to WBUF only. //Assuming that WBUF is always owned by VCOP, kernel1 would be writing to IBUFL or IBUFH. Can you please let me know how we can ensure that WBUF is always owned by VCOP? //Now if kernel 2 wants to access these buffer you first need to switch the ownership of the buffer before ARP32 can access it. Can you please let me know how to switch the ownership of the buffer before ARP32 can access it? //Now regarding kernel2 using an external buffer, can you tell how are you running kernel 2, is it a BAM node in the graph with core type set as BAM_EVE_ARP32? Yes, we use a BAM node in the graph with core type set as BAM_EVE_ARP32. Regards, Surbhi
Surbhi, Does kernel 2 use the output data generated by kernel 1? You can use the VCOP_BUF_SWITCH_SET macro to set the ownership of EVE's internal buffers. An example is as follows: VCOP_BUF_SWITCH_SET(WBUF_VCOP, IBUFHB_VCOP, IBUFLB_VCOP, IBUFHA_SYST, IBUFLA_SYST); This sets the ownership of WBUF, IBUFHB and IBUFLB to VCOP, and of IBUFHA and IBUFLA to system. I will get back to you on the usage of BAM_EVE_ARP32. Regards, Anshu
Hi, 1. //Does kernel 2 uses the data output generated by kernel 1? Yes, I want Kernel 2 to use the output data generated by Kernel 1. Let us say that Kernel 1 writes its output data to WBUF. Then I want Kernel 2 to take that WBUF buffer for its processing. For that purpose, shall I use the following command after Kernel 1 processing has completed? VCOP_BUF_SWITCH_SET(WBUF_SYST, IBUFHB_VCOP, IBUFLB_VCOP, IBUFHA_SYST, IBUFLA_SYST); And, at the end of Kernel 2 processing, shall I return the ownership of the buffers as follows? VCOP_BUF_SWITCH_SET(WBUF_VCOP, IBUFHB_VCOP, IBUFLB_VCOP, IBUFHA_SYST, IBUFLA_SYST); 2. In my code, Kernel 2 takes its input data from DDR. If I use the switch command as I mentioned, will Kernel 2 stop using DDR and use only WBUF? 3. One more doubt. Kernel 2 is set to execute on ARP32. Can you please tell me why Kernel 2 uses DDR instead of on-chip memory? Regards, Surbhi
Hi Surbhi, You don't have to return the ownership of the buffers, as the framework does so automatically. See lines 1052 and 1062 in function BAM_ARP32_computeWrapper() of bam_execute.c. Regarding 2) and 3), the memory should never be allocated in DDR, so it is strange that you are getting this behaviour. What is the value of outBlock[].space that you set for the K1 output? I guess you must have set it to BAM_MEMSPACE_WBUF. Also, which version of the EVE SW release are you using? regards, Victor
Hi Victor, //I guess you must have set it to BAM_MEMSPACE_WBUF. --> Yes, for all the kernels outBlock[].space is set to BAM_MEMSPACE_WBUF. //Also which version of EVE sw release you are using ? --> eve_sw_01_18_01_00 and Processor SDK version PROCESSOR_SDK_VISION_03_03_00_00. //Assuming that WBUF is always owned by VCOP, kernel1 would be writing to IBUFL or IBUFH. Can you please let me know how we can ensure that WBUF is always owned by VCOP? Regards, Surbhi
Hi Surbhi, WBUF is owned by VCOP by default. It is never owned by SYS (ARP32/EDMA) unless you explicitly call VCOP_BUF_SWITCH_SET(). That is why, if you have a kernel that operates on ARP32, you need to call VCOP_BUF_SWITCH_SET() to switch the ownership of WBUF to SYS in order for ARP32 to operate on it. Regarding the issue of the memory being allocated in DDR instead of WBUF, I think it needs deeper investigation. Can you share some code with TI so we can reproduce the issue? regards, Victor
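To make the ownership rule above concrete, here is a small host-side model of it. This is only a sketch: the types and functions (Owner, Buf, BufState, buf_switch_set, arp32_may_access, vcop_may_access) are hypothetical names invented for illustration and are not part of the EVE SW release. The argument order of buf_switch_set simply mirrors the VCOP_BUF_SWITCH_SET call quoted earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the TI definitions. */
typedef enum { OWNER_VCOP, OWNER_SYST } Owner;
typedef enum { BUF_WBUF, BUF_IBUFHB, BUF_IBUFLB, BUF_IBUFHA, BUF_IBUFLA, BUF_COUNT } Buf;

typedef struct {
    Owner owner[BUF_COUNT]; /* current owner of each internal buffer */
} BufState;

/* Model of an ownership switch; argument order follows the
 * VCOP_BUF_SWITCH_SET example quoted in the thread. */
void buf_switch_set(BufState *s, Owner wbuf, Owner ibufhb, Owner ibuflb,
                    Owner ibufha, Owner ibufla)
{
    assert(s != NULL);
    s->owner[BUF_WBUF]   = wbuf;
    s->owner[BUF_IBUFHB] = ibufhb;
    s->owner[BUF_IBUFLB] = ibuflb;
    s->owner[BUF_IBUFHA] = ibufha;
    s->owner[BUF_IBUFLA] = ibufla;
}

/* In this model ARP32 (system side) may touch a buffer only while SYST owns it. */
bool arp32_may_access(const BufState *s, Buf b)
{
    return s->owner[b] == OWNER_SYST;
}

/* In this model VCOP may touch a buffer only while VCOP owns it. */
bool vcop_may_access(const BufState *s, Buf b)
{
    return s->owner[b] == OWNER_VCOP;
}
```

In this model, starting from the configuration in Anshu's example (WBUF owned by VCOP), arp32_may_access(&s, BUF_WBUF) is false until the first argument is switched to OWNER_SYST, which corresponds to the switch discussed for an ARP32 kernel that reads K1's output from WBUF.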
Hi Victor, The following are the demo code tasks and my observations.

1. In the first case, I created an app as Gaussian (Kernel1)(VCOP) -> Image Inversion (Kernel2)(ARP32). The output buffer of Kernel1 is passed as an input to Kernel2.
Observation: The input buffer address at Kernel2 is in on-chip memory.

2. In the second case, I created an app as:
   input kernel (Kernel1)(VCOP) -> Negative (Kernel2)(ARP32)
   input kernel (Kernel1)(VCOP) -> Erosion (Kernel3)(VCOP)
   Image Inversion (Kernel2)(ARP32) + Erosion (Kernel3)(VCOP) -> Merge (Kernel4)(VCOP) -> sink node.
Observation:
a. The input buffer for the kernel which runs on ARP32 changes to DDR, as follows.
   0x40054000 input copy
   0x40054000 erode
   0x40055000 merge
   0x80000120 negative
b. The order of execution is as per the following execution log.
   algProcess start
   calling input copy in execute funs width : 64 height: 32
   calling erode in BAM_Erode_initFrame execute funs width : 64 height: 32
   calling Merge in execute funs width : 64 height: 32
   calling negative in execute funs width : 64 height: 32
But the expected code flow is that the data from input copy is shared between the negative kernel and the erosion kernel. Then the outputs of the negative kernel and the erosion kernel are merged in the merge kernel and moved to the sink node.
c. If we branch the output data from one kernel to more than one kernel, and one of them is executed on ARP32, then the input buffer address for the ARP32 kernel turns into a DDR memory address.

Regards, Surbhi
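For anyone reproducing this, a quick debug check on the buffer address can flag the problem before the ARP32 kernel processes anything. The sketch below is an assumption-based helper, not part of the EVE SW release: is_ddr_address, require_on_chip and ASSUMED_DDR_BASE are hypothetical names, and the 0x80000000 threshold is only inferred from the addresses quoted in this thread (0x400xxxxx reported as on-chip, 0x80000120 reported as DDR). Please confirm the actual memory map for your device before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: external DDR is mapped at 0x80000000 and above, which matches
 * the addresses reported in this thread. Verify against the device memory map. */
#define ASSUMED_DDR_BASE 0x80000000u

/* Returns true if the address falls in the assumed DDR range. */
bool is_ddr_address(uint32_t addr)
{
    return addr >= ASSUMED_DDR_BASE;
}

/* Debug-only guard: aborts (in builds with assertions enabled) if a buffer
 * that is expected to be on-chip was placed in the assumed DDR range. */
void require_on_chip(uint32_t addr)
{
    assert(!is_ddr_address(addr));
}
```

With the addresses from the log above, is_ddr_address(0x40054000u) is false for input copy and erode, while is_ddr_address(0x80000120u) is true for the negative kernel, which is the case the guard is meant to catch.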
Hi Victor, any updates? Regards, Surbhi
Hi Surbhi, There is a constraint listed in Chapter 5 'Current limitations' of the BAM user's guide: Specifications of edge list. If one single port of an upstream node has more than one connection, which happens in case of a fork in the graph, then all these connections must be clustered together. For instance, in the below image pyramid graph, every DS_NODE's output port BAM_BLOCKAVERAGE2x2_OUTPUT_PORT has two connections: one to the next DS_NODE and one to SINK_NODE. These connections must appear one after the other in the edge list. If there is a connection involving another node or another port between them, then graph creation would be incorrect. The chapter has some examples of valid and invalid edge lists. Can you double-check whether your code follows this directive? Thanks. regards, Victor
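The clustering rule quoted above can be checked mechanically before the graph is created. The following is a minimal sketch under stated assumptions: Edge is a hypothetical, simplified record (upstream node and port, downstream node and port) and is not the BAM edge type, and find_unclustered_edge is an illustrative helper, not a BAM API. It reports the first edge whose upstream (node, port) pair already appeared earlier in the list but not on the immediately preceding edge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified edge record; NOT the BAM edge type. */
typedef struct {
    int up_node;   /* upstream node id */
    int up_port;   /* upstream output port */
    int down_node; /* downstream node id */
    int down_port; /* downstream input port */
} Edge;

/* Two edges belong to the same cluster if they leave the same node and port. */
static bool same_source(const Edge *a, const Edge *b)
{
    return a->up_node == b->up_node && a->up_port == b->up_port;
}

/* Returns the index of the first edge that breaks clustering, or -1 if every
 * fork's connections appear one after the other. */
int find_unclustered_edge(const Edge *edges, size_t count)
{
    assert(edges != NULL || count == 0);
    for (size_t i = 1; i < count; ++i) {
        if (same_source(&edges[i], &edges[i - 1]))
            continue; /* still inside the current cluster */
        /* A new cluster starts here: its source must not have been seen before. */
        for (size_t j = 0; j + 1 < i; ++j) {
            if (same_source(&edges[i], &edges[j]))
                return (int)i;
        }
    }
    return -1;
}
```

As a usage example with made-up node ids (1 = input copy, 2 = negative, 3 = erode, 4 = merge, 5 = sink): the list {1->2, 2->4, 1->3, 3->4, 4->5} is flagged at index 2, because the second connection from node 1 is separated from the first, whereas {1->2, 1->3, 2->4, 3->4, 4->5} passes.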