I am debugging a cryptographic implementation on an Infineon TriCore TC275 (assembly listing included for reference). The relevant memory regions from the linker script are:
PMI_PSPR (wx!p): org = 0xC0000000, len = 24K /*Scratch-Pad RAM (PSPR)*/
DMI_DSPR (w!xp): org = 0xD0000000, len = 112K /*Local Data RAM (DSPR)*/
After entering the mac function, the stack pointer a[10] always ends up pointing into a reserved memory area.
###### typedefs ######
typedef uint16_t limb_t;
typedef limb_t gf_t[DIGITS]; //DIGITS=312
typedef int32_t dslimb_t;
################################
/**Multiply and accumulate c += a*b*/
void mac(gf_t c, const gf_t a, const gf_t b)
1: 0xC0000812: D9 AA 40 9F LEA a10,[a10]-0x9C0 //Load eff. addr.
/*Reference non-Karatsuba MAC */
dslimb_t accum[2*DIGITS] = {0};
2: 0xC0000816: 40 A2 MOV.AA a2,a10
3: 0xC0000818: D2 02 MOV e2,0x0 //move 0x0 to d2 and d3
4: 0xC000081A: C5 03 37 40 LEA a3,0x137 //loop counter 0x137 = 311; each ST.D clears 8 bytes, so about 0.5*length of accum
5: 0xC000081E: 89 22 48 01 ST.D [a2+]0x8,e2 //<= fails here
6: 0xC0000822: FC 3E LOOP a3,0xC000081E
7: 0xC0000824: 40 AF MOV.AA a15,a10
###contents of relevant registers###
line  register         before               after
1:    a[10]            D000 0600            CFFF FC40   (not defined in memory map?)
2:    a[2]             D000 0A06            CFFF FC40
3:    d[2]             0000 0002            0000 0000
3:    d[3]             0000 0000            0000 0000   (would have been set to zero too)
4:    a[3]             0000 0186            0000 0137   (# of iterations in loop)
5:    a[2]             CFFF FC40                        (store failed here)
      value@CFFF FC40  ???? ???? ???? ????              (write is not allowed I guess)
0x9C0 = 2496 (base 10), and the array accum has 624 elements, each an int32_t. So 624*4 = 2496 bytes get allocated on the stack, right?
But as far as I understand the memory map given to the linker, no writes are allowed at this address: 0xCFFFFC40 lies below the start of DSPR at 0xD0000000. Yet that is exactly where the generated assembly tries to write in line 5.
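To make the address arithmetic explicit, here is a small host-side sketch in C. It only reproduces the numbers from the register dump and the linker map above; the helper names `sp_after_alloc` and `in_dspr` are made up for illustration and are not part of any TriCore toolchain.

```c
#include <assert.h>
#include <stdint.h>

#define DSPR_BASE 0xD0000000u        /* org of DMI_DSPR in the linker map */
#define DSPR_LEN  (112u * 1024u)     /* len of DMI_DSPR in the linker map */

/* Compile-time check of the subtraction performed by LEA a10,[a10]-0x9C0. */
static_assert(0xD0000600u - 0x9C0u == 0xCFFFFC40u,
              "a10 after the LEA matches the register dump");

/* The stack grows downward: the new stack pointer is the old one minus the frame size. */
static uint32_t sp_after_alloc(uint32_t sp, uint32_t frame_bytes)
{
    return sp - frame_bytes;
}

/* Returns 1 if the byte range [addr, addr + size) lies entirely inside DSPR. */
static int in_dspr(uint32_t addr, uint32_t size)
{
    return addr >= DSPR_BASE
        && size <= DSPR_LEN
        && (addr - DSPR_BASE) <= (DSPR_LEN - size);
}
```

With a10 = 0xD0000600 there are only 0x600 = 1536 bytes between the stack pointer and the start of DSPR, while the frame needs 2496 bytes, so `in_dspr(sp_after_alloc(0xD0000600u, 0x9C0u), 0x9C0u)` returns 0.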
Does anybody know what I might be doing wrong here? I also tried using calloc to allocate the memory on the heap (instead of on the stack, as the code above does, right?), but the program still crashed.
I also copied the line dslimb_t accum[2*DIGITS] = {0}; to the start of the program, where it executed without an error.
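One way to check whether the stack is the limiting factor would be to take the accumulator off the stack entirely. The following is only a sketch under that assumption: `accum_reset` is a hypothetical helper, and a static buffer makes the function non-reentrant, which may or may not be acceptable for the real code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DIGITS 312
typedef int32_t dslimb_t;

/* Static storage duration: the linker places this buffer in a data
   section, so it no longer consumes 2496 bytes of stack per call. */
static dslimb_t accum[2 * DIGITS];

static_assert(sizeof accum == 2496, "same size as the stack frame in the listing");

/* Hypothetical helper: clears the buffer and returns it, replacing the
   "= {0}" initialisation of the local array. */
static dslimb_t *accum_reset(void)
{
    memset(accum, 0, sizeof accum);
    return accum;
}
```

Inside mac, `dslimb_t *acc = accum_reset();` would then stand in for the local array declaration.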
Thank you very much for any help!
EDIT
mac is called like this (uniform samples uniform random numbers):
gf_t sk_expanded[DIM],b,c;
for (unsigned i=0; i<DIM; i++) {
    noise(sk_expanded[i],ctx,i);
}
for (unsigned i=0; i<DIM; i++) {
    noise(c,ctx,i+DIM); //noisy elements in c after call
    for (unsigned j=0; j<DIM; j++) {
        uniform(b,pk,i+DIM*j); //uniform random numbers in b after call
        mac(c,b,sk_expanded[j]); //fails here on first call
    }
    contract(&pk[MATRIX_SEED_BYTES+i*GF_BYTES], c);
}
This code runs on my host machine, but on my TriCore microcontroller it fails in the first mac() call.
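For a rough idea of how much stack the calling code and mac need for their arrays alone, the sizes can be computed from the typedefs above. This is a sketch, not a measurement: DIM is not given in the post, so it is passed in as a parameter, and the function names are made up for illustration. Saved registers and other locals are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIGITS 312
typedef uint16_t limb_t;
typedef limb_t gf_t[DIGITS];
typedef int32_t dslimb_t;

static_assert(sizeof(gf_t) == 624, "312 limbs of 2 bytes each");

/* Arrays in the caller: sk_expanded[DIM], b and c, i.e. DIM + 2 elements of type gf_t. */
static size_t caller_locals_bytes(size_t dim)
{
    return (dim + 2u) * sizeof(gf_t);
}

/* Array in mac: accum[2*DIGITS] of dslimb_t, the 0x9C0 bytes seen in the listing. */
static size_t mac_frame_bytes(void)
{
    return 2u * DIGITS * sizeof(dslimb_t);
}

/* Lower bound on the stack in use while mac is running. */
static size_t total_stack_bytes(size_t dim)
{
    return caller_locals_bytes(dim) + mac_frame_bytes();
}
```

As a purely hypothetical example, DIM = 3 would give 3120 bytes for the caller plus 2496 bytes for mac, 5616 bytes in total, which could then be compared against the stack size configured in the linker script.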