Remember this is an interpreter, not a compiler; handling data types could add as much overhead, if not more... PHP comes to mind, with its total lack of strict typecasting while internally still having types; all those data conversions on every access/assignment can end up taking just as long as calling the FPU... be it ARM's VFP, legacy x87, or SSE.
Time spent in an interpreter, even after tokenizing, on automatically selecting the optimal type for a value or range of values can end up just as big and slow as simply operating on a single fixed type. All those branches, calls and conditionals add up quick... certainly faster than the 2 to 8 clock difference between a 32 bit integer operation on the CPU and a 64 bit one on the FPU. (at least on x86... still learning ARM)
I get that, and in a way it's part of why I'm not bothering even having integer types.
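To make the trade-off concrete, here is a rough C sketch of the two approaches. It is not code from the actual interpreter: TaggedValue, tagged_add and number_add are made-up names for illustration, and the promote-to-float-on-overflow rule is just one assumption about how a typed design might behave.

```c
#include <stdint.h>

/* Hypothetical tagged value: the tag has to be inspected on every operation. */
typedef enum { T_INT, T_FLOAT } Tag;

typedef struct {
    Tag tag;
    union { int32_t i; double f; } as;
} TaggedValue;

/* Add with per-operation type dispatch: tag checks, a range check,
   and possibly int-to-float conversions before any math happens. */
static TaggedValue tagged_add(TaggedValue a, TaggedValue b) {
    TaggedValue r;
    if (a.tag == T_INT && b.tag == T_INT) {
        /* Widen to 64 bits so overflow is detected without undefined behavior. */
        int64_t wide = (int64_t)a.as.i + (int64_t)b.as.i;
        if (wide >= INT32_MIN && wide <= INT32_MAX) {
            r.tag = T_INT;
            r.as.i = (int32_t)wide;
            return r;
        }
        /* Assumed policy: promote to float when the result will not fit. */
        r.tag = T_FLOAT;
        r.as.f = (double)wide;
        return r;
    }
    double x = (a.tag == T_INT) ? (double)a.as.i : a.as.f;
    double y = (b.tag == T_INT) ? (double)b.as.i : b.as.f;
    r.tag = T_FLOAT;
    r.as.f = x + y;
    return r;
}

/* Single fixed type: no tags and no branches, just load, add, store. */
static double number_add(double a, double b) {
    return a + b;
}
```

The typed version does several compares and conversions around one add; the single-type version is just the add. Whether that gap outweighs any integer vs. FPU difference on a given chip is exactly what would need measuring.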
By going straight to the math-co, on x87 that's simply an FLD, the operation (FMUL, FADD, FSUB, FDIV), then FSTP -- not really any more or less code than mov eax,mem; mov ebx,mem; mul ebx; mov mem,eax
At most an FPU double multiply (for example) on anything x87 Pentium or newer is 12 bus clocks memory, 12 bus clocks code fetch and 6 CPU clocks execution (including setting up the FPU memory pointer)... A 32 bit integer multiply on the same hardware might be only 6 bus clocks memory, but it's 20 bus clocks code fetch and 4 CPU clocks... so they may look like they take the same amount of time, but remember the bus isn't as fast as the CPU; as such, on modern computers it is often FASTER to do a 64 bit floating point multiplication than a 32 bit integer one... just because 386 instructions are that extra byte in length, meaning a longer wait for the fetch.
Of course, if you can optimize the assembly to put everything into proper registers you can shift that back around, but that's more the type of thing for a compiler to do, not an interpreter.
... and while that's the wintel way of doing things, you also have to remember that many ARM cores (the Cortex A8 among them) lack an integer divide, while the various VFP/SIMD/NEON extensions (whatever they want to optionally include this week) do tend to provide division. Of course, there is the issue of not being able to rely on which FPU extensions are even available (if any) on ARM, and whether FPC even bothers trying to generate code for them -- that is a concern I'm going to have to play with in QEMU. I know the Cortex A8 provides NEON, which uses 64 bit registers despite being hooked to a 32 bit CPU.
After all, that's why SIMD and its kin exist, and why the x87 was a big deal back in the day... since an 8087 basically meant having a memory-oriented 80 bit FPU sitting next to a 16 bit processor.
It is a good point though that I should 'check it'... I'll probably toss together a synthetic bench tomorrow to gauge the speed differences, if any... though constant looping will likely trigger the various caches, so I'll probably have to make a version that puts a few hundred KB of NOPs in place to cache-flush between operations.
I'm also used to thinking x86, where the 'integer optimization' for coding hasn't really been true since the Pentium dropped... I've really got a lot of studying of ARM to do -- and of the code FPC makes for ARM. I mean, does it even try to use SIMD/VFP if available?
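Something along these lines is what I have in mind for the bench, sketched in C only for illustration (the real thing would be FPC and/or assembly). Every name here is made up, and walking a big junk buffer is a stand-in for the NOP padding idea: it evicts data from the cache, not code, so it only approximates the cold-fetch case.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Multiply every element in place. volatile forces a real load and store
   per element so the compiler cannot fold the loop away. Unsigned is used
   so that wraparound is well defined. */
static void mul_u32(volatile uint32_t *v, size_t n, uint32_t k) {
    for (size_t i = 0; i < n; i++) v[i] = v[i] * k;
}

/* Same loop on 64 bit floats. */
static void mul_f64(volatile double *v, size_t n, double k) {
    for (size_t i = 0; i < n; i++) v[i] = v[i] * k;
}

/* Touch one byte per assumed 64-byte cache line of a junk buffer.
   With a buffer larger than the data cache this pushes the test data out. */
static void evict_cache(volatile unsigned char *junk, size_t n) {
    for (size_t i = 0; i < n; i += 64) junk[i]++;
}

/* CPU seconds elapsed since a previous clock() reading. */
static double seconds_since(clock_t start) {
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Usage would be: fill two arrays, call evict_cache on a junk buffer of a few MB, take clock(), run mul_u32, read seconds_since, then repeat the same steps for mul_f64 and compare. clock() is coarse, so the arrays need to be large enough (or the runs repeated enough) for the timings to mean anything.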
Wait until you try using the train wreck known as FreeType -- it's rubbish, pure and simple... There's a reason so many SDL and OpenGL programs don't even bother and use raster fonts instead... The rendering is ugly, inconsistent and painfully slow, and the code interfaces are the worst type of tripe this side of trying to write a device driver for Linux.
I was thinking I could use monospace and/or monokerned fonts instead of true kerning; that would make it simpler/faster, and since it's going to have an editor, it will have a monospaced font anyway. Vector fonts are a high resolution luxury that I don't think translates well to composite-scale resolutions in the first place; see the old vector fonts from the BGI at CGA resolutions.
I may also keep the editor strictly SDL, leaving OpenGL for when a program is running in the interpreter. Still need to play with the idea. Nowhere near working on that part of it yet, as my first order of business is getting the tokenizer and bytecode interpreter complete to where it can at least make a console program. THEN I'll worry about the IDE, graphics, fonts, etc...