Just saw this on svn and... WOW!

Code:
+ Assembler implementation of mod/div.
    Improves amount of divides from about 230000/s to about 2400000/s on
    ARM920T, 200MHz.
I guess you ran these benchmarks on your brand-new GP2X?

Either way, that's great for the NDS too, because only the GBA uses the BIOS functions.