Originally Posted by dmantione
The save into r8 should be a proper save, unless the compiler expects you to save r8, which you destroy. Hmmm.... I'm going to need to take a look at the code myself; you can send me the code if you wish. If no time tomorrow though.
In the meantime, please compare a version with the "mod" defined as a function in the program against one where it is in the RTL. Can you see differences?
Looking at the compiler-generated .s files, I can find some small differences, mainly due to the different implementations, I think.
This comes from the function version:
Code:
# [41] c := modulus((i * j), 31);
mov r1,#31
# Register r0 allocated
mov r0,r4
# Register r2 allocated
mov r2,r5
# Register r0,r2 released
# Register r0 allocated
mul r0,r2,r0
# Register r2,r3,r12,r13,r14,r15 allocated
bl P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT
# Register r1,r2,r3,r12,r13,r14,r15,r0 released
mov r6,r0
# Register r0 allocated
# [43] VideoBuffer[j + 240 * i] := c;
And this one comes from the RTL version:
Code:
# [39] c := (i * j) mod 31;
mov r0,r4
# Register r2 allocated
mov r2,r5
# Register r0,r2 released
# Register r1 allocated
mul r1,r2,r0
# Register r0 allocated
mov r0,#31
# Register r2,r3,r12,r13,r14,r15 allocated
bl fpc_mod_longint
# Register r1,r2,r3,r12,r13,r14,r15,r0 released
mov r6,r0
# Register r0 allocated
# [43] VideoBuffer[j + 240 * i] := c;
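One difference that does stand out between the two call sites is the register order: the function version loads r0 = i * j and r1 = 31, while the RTL version loads r0 = 31 and r1 = i * j. Below is a minimal sketch in C of why that could matter. It assumes that swi #0x060000 is the BIOS Div call (r0 = number and r1 = denom on entry, r0 = quotient and r1 = remainder on return) and that the RTL routine has the same two-instruction body as my 'modulus' function; neither assumption is confirmed here. The type Regs and all function names are made up for illustration.

```c
/* Hypothetical model of the two registers that matter here. */
typedef struct { int r0, r1; } Regs;

/* ASSUMED semantics of swi #0x060000 (BIOS Div):
   r0 / r1 goes to r0, r0 % r1 goes to r1. */
void bios_div(Regs *r) {
    int quotient  = r->r0 / r->r1;
    int remainder = r->r0 % r->r1;
    r->r0 = quotient;
    r->r1 = remainder;
}

/* Body of the 'modulus' listing: the swi, then mov r0,r1. */
int swi_then_mov(int r0, int r1) {
    Regs r = { r0, r1 };
    bios_div(&r);
    r.r0 = r.r1;          /* mov r0,r1: return the remainder */
    return r.r0;
}

/* Function-version call site: r0 = i * j, r1 = 31. */
int call_function_version(int i, int j) {
    return swi_then_mov(i * j, 31);
}

/* RTL-version call site: r0 = 31, r1 = i * j (operands swapped). */
int call_rtl_version(int i, int j) {
    return swi_then_mov(31, i * j);
}
```

With i = 5 and j = 8, the first call yields 40 mod 31 = 9, while the second yields 31 mod 40 = 31. Under these assumptions, an RTL routine that issues the swi without first swapping r0 and r1 would compute the wrong remainder, and would divide by zero whenever i * j is 0.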
This is the 'modulus' function:
Code:
.globl P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT
P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT:
# Temps allocated between r11-44 and r11-44
# Register r13,r11,r12 allocated
mov r12,r13
stmfd r13!,{r11,r12,r14,r15}
sub r11,r12,#4
# Register r12 released
sub r13,r13,#44
# Var number located in register
# Var denom located in register
# Temp -44,4 allocated
# Var $result located at r11-44
# Register r0,r1,r2,r3,r12,r13,r14,r15 allocated
# [27] swi #0x060000
swi #393216
# [28] mov r0, r1
mov r0,r1
# Register r0,r1,r2,r3,r12,r13,r14,r15 released
# Temp -44,4 released
ldmea r11,{r11,r13,r15}
# Register r0 released
.Le0:
.size P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT, .Le0 - P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT
It does some "strange" things with r11, r12 and r13 that I can't understand (maybe something related to the stack, given that r13 is the stack pointer?).
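If the stack guess is right, the sequence would match the classic ARM frame setup, with r11 as frame pointer and r12 as a scratch copy of the old stack pointer. To check that reading, here is a rough sketch in C that replays the prologue and the ldmea on a toy machine. The Cpu struct, the memory size and the function names are all hypothetical; only the instruction sequence is taken from the listing.

```c
#include <stdint.h>

/* Hypothetical toy machine: just the registers the listing touches,
   plus a small stack area. mem[a / 4] holds the word at byte address a. */
typedef struct {
    uint32_t r11, r12, r13, r14, r15;
    uint32_t mem[64];
} Cpu;

void prologue(Cpu *c) {
    c->r12 = c->r13;                     /* mov r12,r13 */
    c->r13 -= 16;                        /* stmfd r13!,{r11,r12,r14,r15} */
    c->mem[c->r13 / 4 + 0] = c->r11;     /* lowest register, lowest address */
    c->mem[c->r13 / 4 + 1] = c->r12;     /* old stack pointer */
    c->mem[c->r13 / 4 + 2] = c->r14;     /* return address */
    c->mem[c->r13 / 4 + 3] = c->r15;
    c->r11 = c->r12 - 4;                 /* sub r11,r12,#4 */
    c->r13 -= 44;                        /* sub r13,r13,#44: room for temps */
}

void epilogue(Cpu *c) {
    uint32_t base = c->r11;              /* ldmea r11,{r11,r13,r15} */
    c->r11 = c->mem[(base - 12) / 4];    /* saved r11 */
    c->r13 = c->mem[(base - 8) / 4];     /* saved r12, i.e. the old r13 */
    c->r15 = c->mem[(base - 4) / 4];     /* saved r14, i.e. return */
}
```

Running prologue and then epilogue on this model leaves r11 and r13 with their values from before the call and puts the saved r14 into r15, so the single ldmea would restore the frame pointer, drop the 44 bytes of temps and return to the caller in one step.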
I can zip all the RTL sources and this small example and send them to your mailbox, if you want.