Ticket Name: CCS/TDA2: DDR/L2/L1 speed comparison
Query Text:
Part Number: TDA2
Tool/software: Code Composer Studio

From spna165.pdf: L1 cache runs at 600 MHz, L2 cache at 300 MHz, and external memory at ~100 MHz. To test this I run the same function, change only the location of the input buffer, and compare the cycle counts.

This is the linker cmd file:

-stack 0x4000
-heap  0x2000000

MEMORY
{
    L1P_SRAM      : origin = 0x00E00000, len = 0x8000
    L1D_SRAM      : origin = 0x00F00000, len = 0x8000      /* 16 KB SRAM */
//  L1D_CACHE     : origin = 0x00F04000, len = 0x4000      /* 16 KB cache */
//  L1D_CACHE     : origin = 0x00F00000, len = 0x8000      /* 16 KB cache */
    L2_SRAM       : origin = 0x00800000, len = 0x48000     /* SRAM in L2, 256 + 32 - 128 = 160 KB */
//  L2_CACHE      : origin = 0x00828000, len = 0x20000     /* Cache for L2, configured as 128 KB */
    DSP2_L2_SRAM  : origin = 0x40800000, len = 0x48000
    SL2_SRAM      : origin = 0x5B000000, len = 0x40000
    EXT_MEM_CACHE : origin = 0x80000000, len = 0x06000000  /* DSP-used cacheable area */
    EXT_MEM_heap  : origin = 0x86000000, len = 0x02000000  /* DSP-used cacheable area */
}

SECTIONS
{
    vectors     :> EXT_MEM_CACHE
    .cio        :> EXT_MEM_CACHE
    .bss        :> EXT_MEM_CACHE   /* uninitialized variables */
    .text       :> EXT_MEM_CACHE   /* executable code */
    .cinit      :> EXT_MEM_CACHE
    .const      :> EXT_MEM_CACHE
    .far        :> EXT_MEM_CACHE
    .fardata    :> EXT_MEM_CACHE   /* initialized data */
    .neardata   :> EXT_MEM_CACHE   /* initialized data */
    .rodata     :> EXT_MEM_CACHE
    .sysmem     :> EXT_MEM_CACHE
    .switch     :> EXT_MEM_CACHE
    .L2SramSect :> L2_SRAM
    .stack      :> L2_SRAM
    .heap       :> EXT_MEM_heap
}

Measured cycle counts:

input = (int *)0x00F00000;   /* L1D */  ->  1,300,000 cycles
input = (int *)0x00800000;   /* L2  */  ->  1,300,000 cycles
input = (int *)0x80000000;   /* DDR */  -> 12,000,000 cycles

The result is that L1 and L2 take the same number of cycles and DDR takes far more. So the problem is: why is L1 not faster than L2?
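For reference, here is a minimal sketch of the kind of cycle measurement described above, assuming a C6000 compiler where the free-running time-stamp counter TSCL is exposed through c6x.h; the buffer size and the summing workload are assumptions, since the original test function is not shown.

#include <c6x.h>      /* TSCL time-stamp counter register (C6000 compiler) */
#include <stdio.h>

#define BUF_WORDS 1024                      /* assumed buffer size */

/* Workload: read every word of the buffer and accumulate it, so the
 * compiler cannot optimize the memory accesses away. */
static int sum_buffer(const volatile int *buf, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc += buf[i];
    return acc;
}

int main(void)
{
    /* Point at L1D SRAM; use 0x00800000 for L2 SRAM, 0x80000000 for DDR. */
    volatile int *input = (volatile int *)0x00F00000;
    unsigned int t0, t1;
    int acc;

    TSCL = 0;                               /* any write starts the counter */

    t0 = TSCL;
    acc = sum_buffer(input, BUF_WORDS);
    t1 = TSCL;

    printf("acc = %d, cycles = %u\n", acc, t1 - t0);
    return 0;
}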
Responses:
The cycle count depends very much on what the function is doing. If you have to measure actual cycles then you need to write a specific test case.

For measuring L1D performance:
1. Partition L1D as SRAM and cache.
2. Declare an array of some length (say 1 KB) and place it in the L1D SRAM using #pragma DATA_SECTION.
3. Verify in the .map file that the array went to the L1D SRAM section.
4. Write a function which reads the array in a loop and accumulates the contents. Return the accumulated value from the function.
5. Profile the function.

For measuring L2 performance:
1. Partition L1D and L2 as SRAM and cache.
2. Declare an array of some length (say 1 KB) and place it in the L2 SRAM using #pragma DATA_SECTION.
3. Verify in the .map file that the array went to the L2 SRAM section.
4. Write a function which reads the array in a loop and accumulates the contents. Return the accumulated value from the function.
5. Profile the function.

For measuring DDR performance:
1. Partition L1D and L2 as SRAM and cache.
2. Make the DDR space cacheable by setting the appropriate MAR bits.
3. Declare an array of some length (say 1 KB) and place it in DDR using #pragma DATA_SECTION.
4. Verify in the .map file that the array went to the DDR section.
5. Write a function which reads the array in a loop and accumulates the contents. Return the accumulated value from the function.
6. Profile the function.

Hope this shows the difference between L1D, L2 and DDR performance. The numbers can vary depending on the L1D and L2 cache sizes, and also on whether DDR is cached or not.
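A minimal sketch of the placement and accumulation steps, assuming user-defined output sections named .l1d_data, .l2_data and .ddr_data that the .cmd file maps to L1D_SRAM, L2_SRAM and EXT_MEM_CACHE respectively (the section and variable names here are illustrative, not from this thread):

#pragma DATA_SECTION(l1d_buf, ".l1d_data")  /* .cmd must contain: .l1d_data :> L1D_SRAM */
int l1d_buf[256];                           /* 1 KB */

#pragma DATA_SECTION(l2_buf, ".l2_data")    /* .cmd must contain: .l2_data :> L2_SRAM */
int l2_buf[256];

#pragma DATA_SECTION(ddr_buf, ".ddr_data")  /* .cmd must contain: .ddr_data :> EXT_MEM_CACHE */
int ddr_buf[256];

/* Read the array in a loop and return the accumulated value so the
 * compiler cannot remove the loads. */
int accumulate(const int *buf, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc += buf[i];
    return acc;
}

After building, check the .map file to confirm each buffer landed in the intended address range, then profile accumulate() once per buffer.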
Have you confirmed that you can properly read and write data in L1 and L2? The reason I ask is that access to these memories depends on the configuration. For example, at reset, L1D is configured as all cache, while L2 is configured as all SRAM. So unless you change the configuration, L1D is not accessible as memory-mapped RAM. If you are seeing the same count for L1D and L2, it might be because the L1D access is not actually reading/writing the L1D RAM. Without seeing your code, I can only ask the question. I would also expect the L1D access to be faster than the L2 access.
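As a sketch of that configuration point, assuming the C66x chip support library's cache helpers (ti/csl/csl_cacheAux.h in the TDA2 PDK; the exact header path, function names and enum values are assumptions and may differ between PDK versions), the L1D/L2 partitioning and the DDR MAR bits could be set up once at startup, before profiling:

#include <ti/csl/csl_cacheAux.h>   /* assumed CSL cache API */

void configure_memories(void)
{
    /* Shrink L1D cache to 16 KB so the remaining 16 KB is addressable SRAM. */
    CACHE_setL1DSize(CACHE_L1_16KCACHE);

    /* Configure 128 KB of L2 as cache, leaving the rest as addressable SRAM. */
    CACHE_setL2Size(CACHE_128KCACHE);

    /* Make DDR at 0x80000000-0x85FFFFFF cacheable: one MAR bit covers a
     * 16 MB region, and MAR128 corresponds to 0x80000000. */
    for (int mar = 128; mar < 134; mar++)
        CACHE_enableCaching((Uint8)mar);
}

With L1D partitioned this way, the address 0x00F00000 used in the question actually refers to L1D SRAM rather than being swallowed by the cache configuration.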