Ticket Name: RTOS: I could not understand how each LAYER works in TIDL OD Usecase.
Query Text:
Tool/software: TI-RTOS

Hi,

In the JDetNet example, the layersGroupId is set as 0 1 1 1 1 (all 1s) 1 1 1 1 2 0. In this case, EVE creates 1 input buffer and 2 output buffers, and DSP creates 2 input buffers and 1 output buffer, following the model architecture. However, if the layersGroupId is set as 0 1 1 1 (all 1s) 2 1 1 1 2 0 (i.e. there are two layers that run on DSP), EVE and DSP still create input/output buffers corresponding to the model architecture.

What I couldn't understand is: in that case, how does EVE wait to get the result of the middle layer that runs on DSP? In the TIDL OD data flow, Alg_tidl_EVEx runs first and Alg_tidl_Dsp runs after that. However, if there is a layer assigned to the DSP in the middle of the model structure, shouldn't the next layer assigned to EVE wait to receive the result of the DSP layer as its input? If the TIDL OD use case is executed as that data flow, I think this process is impossible. But when I execute that model, the use case runs fine. How is this possible?

Regards,
Yoo.
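For reference, a minimal Python sketch of how a layersGroupId sequence can be scanned to see where processing switches between groups. The group meanings are taken from this thread (0 = data/output layers, 1 = EVE, 2 = DSP); this is only an illustration, not TIDL code, and the real buffer counts depend on the network's actual layer connectivity rather than just the sequence.

# Illustrative only: report where the processing group changes along the
# layersGroupId sequence, i.e. where one core's output would have to be
# handed over to the other core.
def group_transitions(layers_group_id):
    transitions = []
    prev = None
    for idx, grp in enumerate(layers_group_id):
        if grp == 0:               # data/output layers, skip
            continue
        if prev is not None and grp != prev:
            transitions.append((idx, prev, grp))
        prev = grp
    return transitions

# Shortened version of the original assignment: a single hand-off, EVE -> DSP at the end.
print(group_transitions([0, 1, 1, 1, 1, 1, 2, 0]))   # [(6, 1, 2)]

# Shortened version with a DSP layer in the middle: three hand-offs,
# EVE -> DSP -> EVE -> DSP, which is what raises the scheduling question above.
print(group_transitions([0, 1, 1, 2, 1, 1, 2, 0]))   # [(3, 1, 2), (4, 2, 1), (6, 1, 2)]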
Responses:
Hi,

I read TIDeepLearningLibrary_UserGuide.pdf again. FAQ 21 says: 'Condition : Only DectectionOutputLayer should run on DSP and rest of the all the layers on EVE in the SSD network.' When I tested a model that has a DSP layer in the middle of the network, the model could not detect objects normally. I have concluded that if a middle layer belongs to DSP, there is a problem in the buffer transfer process. When using the SSD model, should all layers always be allocated to EVE except the detection output layer, the data layer and the output layer? Thanks in advance.

Regards,
Yoo
Hi Yoo,

It is recommended to run only the DetectionOutput layer on DSP and all the other layers on EVE. This is because the DetectionOutput layer is better optimized on DSP and all the other layers are better optimized on EVE. The same split is what is tested in the VSDK use case as well.

Thanks,
Praveen
Hi,

Does that mean that the sequence of EVE and DSP layers does not matter? However, when I changed one middle layer of the successive EVE layers to DSP, the model almost completely lost performance.

Below is the setting before I changed it:
layersGroupId = 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 0

And this is after I changed it:
layersGroupId = 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2(here) 1 1 1 1 1 1 1 1 1 2 0

If this change is not supposed to be a problem, could you describe the process of their buffer exchange? According to the link sequence in tidl_od.jpg, Alg_tidl_dsp is executed after Alg_tidl_eve, so how does the EVE layer immediately after the intermediate DSP layer receive its input buffer?
Hi,

The sequence of EVE and DSP layers does matter for performance. As explained in the previous post, all the initial layers except the last detection output layer are better optimized on EVE, and only the last layer (the detection output layer) is better optimized on DSP. So, changing any middle layer from EVE to DSP will result in performance degradation. Also, the current tidl_od use case is designed to run all the initial layers on EVE and the last detection output layer on DSP.

Thanks,
Praveen
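To make the recommended split concrete, here is a small illustrative Python helper (not part of the TIDL tools) that builds a layersGroupId list the way described above: data/output layers in group 0, only the detection output layer in group 2 (DSP), and everything else in group 1 (EVE). The layer-type strings are placeholders, not the exact names used by the import tool.

def recommended_group_ids(layer_types):
    # Convention assumed from this thread:
    #   0 = data/output layers, 1 = EVE, 2 = DSP
    ids = []
    for layer_type in layer_types:
        if layer_type in ("Data", "Output"):
            ids.append(0)
        elif layer_type == "DetectionOutput":
            ids.append(2)      # only the detection output layer runs on DSP
        else:
            ids.append(1)      # all other layers run on EVE
    return ids

# Example with placeholder layer types:
print(recommended_group_ids(
    ["Data", "Convolution", "ReLU", "Concat", "DetectionOutput", "Output"]))
# -> [0, 1, 1, 1, 2, 0]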
Hi,

Thank you for answering. BTW, I have two more questions.

1) The performance degradation you referred to, does it mean both processing time and accuracy? Actually, when I change a middle layer to DSP, the model detects the wrong places as objects.

2) Could you explain what OCMC is? In the APP parameter settings, the EVEs have ocmcAddr. How is it used by the EVEs? I looked at the source code and saw that the EVEs initialize their own L1 and L2 cache sizes. Is OCMC used for storing the network parameters? Or could you point me to a document about this?

Best Regards,
Yoo.
Hi,

1. The degradation is only in processing time, as the DSP consumes more cycles than EVE when processing all layers except the detection output layer. The wrong detections could be because of some problem in your use case.

2. OCMC is Level 3 (L3) memory. Yes, it is used for storing the parameters in TIDL. We don't have any detailed document on this.

Thanks,
Praveen
Hi,

Thanks for answering. It really helps me. I have one last question on this thread. If all layers except the detection output layer are running on EVE, is there a way to see what the output of the last EVE layer looks like? I mean, I want to know the data format of the input and output of the detection output layer. Could you please let me know if there is any document or a simple method? Thanks again.

Best regards,
Yoo.
Hi,

For the output format of the detection output layer, you can look at the Draw Boxes function (not the exact function name, check for something similar) where the detection output layer's output is consumed to draw the output boxes. You can also refer to the thread below to understand the output format of the detection output layer (but that output is from a standalone TIDL run, not from the use case): e2e.ti.com/.../679186

The input format, from deploy.prototxt, is:

layer {
  name: "detection_out"
  type: "DetectionOutput"
  bottom: "mbox_loc"
  bottom: "mbox_conf_flatten"
  bottom: "mbox_priorbox"
  top: "detection_out"
  include {
    phase: TEST
  }
}

There are 2 inputs to the detection output layer: the first input is the locations buffer (flattened and concatenated from all the heads), and the second one is the confidence scores buffer (flattened and concatenated from all the heads, but without reshape and softmax, as these are done in the detection output layer itself). The prior boxes are calculated at import time and stored in the parameters by the import tool.

Hope this clarifies.

Thanks,
Praveen
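As a rough illustration of the detection output format: the Caffe SSD DetectionOutput convention, which the description above follows, packs each detection as 7 values, [image_id, label, score, xmin, ymin, xmax, ymax], with the box coordinates normalized to 0..1. The Python sketch below decodes such a buffer; the exact TIDL buffer layout in the use case (element type, padding, detection count) should be confirmed against the draw-boxes code or the linked thread, so treat the unpacking details as assumptions.

import struct

def decode_detections(raw_bytes, score_threshold=0.5):
    # Assumed Caffe SSD-style layout: consecutive groups of 7 float32 values,
    # [image_id, label, score, xmin, ymin, xmax, ymax], coordinates in 0..1.
    count = len(raw_bytes) // 4
    vals = struct.unpack("<%df" % count, raw_bytes[:count * 4])
    detections = []
    for i in range(0, len(vals) - 6, 7):
        image_id, label, score, xmin, ymin, xmax, ymax = vals[i:i + 7]
        if label >= 0 and score >= score_threshold:   # in Caffe SSD, label -1 marks an empty slot
            detections.append({"label": int(label), "score": score,
                               "box": (xmin, ymin, xmax, ymax)})
    return detections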