Quote Originally Posted by dmantione
The save into r8 should be a proper save, unless the compiler expects you to preserve r8, which you destroy. Hmm... I'm going to need to take a look at the code myself; you can send me the code if you wish. If no time tomorrow though.

In the meantime, please compare a version with "mod" defined as a function in the program against one where it is in the RTL. Can you see any differences?
Looking at the compiler-generated .s files I can find some small differences, mainly due to the different implementations, I think.
This comes from the function version:
Code:
# [41] c := modulus((i * j), 31);
	mov	r1,#31
	# Register r0 allocated
	mov	r0,r4
	# Register r2 allocated
	mov	r2,r5
	# Register r0,r2 released
	# Register r0 allocated
	mul	r0,r2,r0
	# Register r2,r3,r12,r13,r14,r15 allocated
	bl	P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT
	# Register r1,r2,r3,r12,r13,r14,r15,r0 released
	mov	r6,r0
	# Register r0 allocated
# [43] VideoBuffer[j + 240 * i] := c;
And this one comes from the RTL version:
Code:
# [39] c := (i * j) mod 31;
	mov	r0,r4
	# Register r2 allocated
	mov	r2,r5
	# Register r0,r2 released
	# Register r1 allocated
	mul	r1,r2,r0
	# Register r0 allocated
	mov	r0,#31
	# Register r2,r3,r12,r13,r14,r15 allocated
	bl	fpc_mod_longint
	# Register r1,r2,r3,r12,r13,r14,r15,r0 released
	mov	r6,r0
	# Register r0 allocated
# [43] VideoBuffer[j + 240 * i] := c;
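One difference that stands out in the two call sites above is the argument order: the function version passes i * j in r0 and 31 in r1, while the RTL version passes 31 in r0 and i * j in r1. The small Python model below (not GBA code) shows why that could matter. It assumes that swi #0x060000 is the BIOS Div call, taking the number in r0 and the denominator in r1 and returning the quotient in r0 and the remainder in r1, which matches the "number" and "denom" names and the "mov r0, r1" in the 'modulus' listing. The sample values of i and j are arbitrary, and whether the RTL's fpc_mod_longint really reuses the same body unchanged is only a guess on my part.

```python
def bios_div(r0, r1):
    """Model of the assumed BIOS Div call: r0 = number, r1 = denom.

    Returns (r0, r1, r3) = (quotient, remainder, abs(quotient)),
    with the quotient truncated toward zero.
    """
    q = abs(r0) // abs(r1)
    if (r0 < 0) != (r1 < 0):
        q = -q
    return q, r0 - q * r1, abs(q)


def modulus_like_listing(r0, r1):
    """Mirror of the 'modulus' body: swi #0x060000, then mov r0, r1."""
    _, remainder, _ = bios_div(r0, r1)
    return remainder


i, j = 7, 9  # arbitrary sample values, not taken from the program

# Function version call site: r0 = i * j, r1 = 31
function_version = modulus_like_listing(i * j, 31)

# RTL call site: r0 = 31, r1 = i * j
# (hypothetical: assumes the same body is used without swapping registers)
rtl_layout_same_body = modulus_like_listing(31, i * j)

print(function_version, rtl_layout_same_body)  # prints "1 31"
```

If the RTL routine does swap the registers before the swi, the two versions would of course agree; the model only shows what happens if it does not.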
This is the 'modulus' function:
Code:
.globl	P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT
P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT:
# Temps allocated between r11-44 and r11-44
	# Register r13,r11,r12 allocated
	mov	r12,r13
	stmfd	r13!,{r11,r12,r14,r15}
	sub	r11,r12,#4
	# Register r12 released
	sub	r13,r13,#44
# Var number located in register
# Var denom located in register
# Temp -44,4 allocated
# Var $result located at r11-44
	# Register r0,r1,r2,r3,r12,r13,r14,r15 allocated
# [27] swi #0x060000
	swi	#393216
# [28] mov r0, r1
	mov	r0,r1
	# Register r0,r1,r2,r3,r12,r13,r14,r15 released
# Temp -44,4 released
	ldmea	r11,{r11,r13,r15}
	# Register r0 released
.Le0:
	.size	P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT, .Le0 - P$FILLSCREEN_FPC_MODULUS$LONGINT$LONGINT$$LONGINT
It does some "strange" things with r11, r12 and r13 that I can't understand (maybe something related to the stack, given that r13 is the stack pointer?).
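Tracing those instructions by hand suggests they are ordinary stack-frame setup and teardown: r12 keeps a copy of the incoming stack pointer, r11 becomes the frame pointer, and the final ldmea restores r11 and r13 and loads the saved return address into r15. Below is a rough Python model of just those five instructions, as a sketch to check that reading. The register numbers are the real ones, but the starting addresses are made up and the helper names stmfd and ldmea are only my own stand-ins for the instructions.

```python
FP, IP, SP, LR, PC = 11, 12, 13, 14, 15


def stmfd(regs, mem, base, reglist):
    """STMFD base!, {reglist}: full-descending push with writeback.

    The lowest-numbered register ends up at the lowest address.
    """
    regs[base] -= 4 * len(reglist)
    addr = regs[base]
    for r in sorted(reglist):
        mem[addr] = regs[r]
        addr += 4


def ldmea(regs, mem, base, reglist):
    """LDMEA base, {reglist}: load the words just below base, no writeback."""
    addr = regs[base] - 4 * len(reglist)
    loaded = {}
    for r in sorted(reglist):
        loaded[r] = mem[addr]
        addr += 4
    regs.update(loaded)


# Made-up caller state, for illustration only.
regs = {FP: 0x03007F00, IP: 0, SP: 0x03007E00, LR: 0x08000120, PC: 0x08000200}
mem = {}
caller_fp, caller_sp, return_addr = regs[FP], regs[SP], regs[LR]

# Prologue from the listing
regs[IP] = regs[SP]                      # mov   r12,r13
stmfd(regs, mem, SP, [FP, IP, LR, PC])   # stmfd r13!,{r11,r12,r14,r15}
regs[FP] = regs[IP] - 4                  # sub   r11,r12,#4
regs[SP] -= 44                           # sub   r13,r13,#44 (room for temps)

# Epilogue from the listing
ldmea(regs, mem, FP, [FP, SP, PC])       # ldmea r11,{r11,r13,r15}

# After the epilogue the caller's frame pointer and stack pointer are back,
# and the program counter holds the saved return address.
print(regs[FP] == caller_fp, regs[SP] == caller_sp, regs[PC] == return_addr)
```

So in this model the 44 bytes reserved below the saved registers are simply dropped when r13 is reloaded from the saved copy of r12, which would explain why there is no matching add for the sub r13,r13,#44.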

I can zip all the RTL sources and this small example and send them to your mailbox, if you want.