
Thread: Gumberoo - Making a new learning language interpreter using FPC

  1. #1
    Quote Originally Posted by SilverWarior View Post
    I meant, how often do you use numbers greater than a 32-bit integer in your games?
    Since I'm planning on it being OpenGL-aware, 32-bit floating point is likely the best minimum once rotations, gravity, momentum and drag are in the equation, much less the notion of resolution-independent rendering. I considered single precision, since that's what GLfloat is guaranteed to be, but the maximum-value limitation when dealing with non-integer values worried me and, well... it plays into your second question:

    Quote Originally Posted by SilverWarior View Post
    Also, won't using 64-bit integers considerably slow down all mathematical calculations when programs are running on 32-bit operating systems?
    The speed concern did worry me, but given that interpreted languages were fast enough by the 386 era to make simple sprite-based games, I'm really not all that worried about the speed of floats when the minimum target is a 700MHz ARM8 or a multi-GHz machine with SSE operations available.

    ... and remember, even an 8087 math co-processor could handle 80-bit "extended" at a decent speed at clocks below 8MHz... I should know; I have one in my Tandy 1000SX, next to the NEC V20 running at 7.16MHz.

    Though that could just be that I got used to targeting 4.77MHz on a 16-bit processor with an 8-bit data path that doesn't even have hardware sprites; I may be overestimating the capabilities of a 32-bit processor at almost 150 times that speed with blitting offloaded to the GPU.

    I don't want typecasting to get in the way of actually using the language, and if that means a bit of slowdown, so be it. It's something that pissed me off back on things like Apple Integer BASIC, where you didn't even have fractions... With 4th graders as the target, it's complex enough without confusing them about the difference between ordinal and real... much less the dozen or so integer widths, or the half-dozen real types. By the fourth grade they should be learning long division, so explaining why they get 5/2=2 is something to avoid until they put on the big-boy pants and move to something like C, Pascal or Python.

    BASIC on the CoCo only had 'numbers' -- 32-bit numbers with an 8-bit exponent handling the floating point, on an 8-bit processor (admittedly the semi-16-bit monster 6809) -- and it handled things just fine. If the 'slowdown' of 64-bit numbers when you have a math coprocessor DESIGNED to handle numbers that size is an issue, you're probably doing something wrong.

    Or am I really off-base with that type of thinking?

    I'm still thinking I might switch to 80-bit extended, though I'm going to test 32-bit single precision too; which is why I've defined my own "tReal" type, so I can change it program-wide as needed.
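    A minimal sketch of that kind of program-wide switch -- the define names here are illustrative, not from the actual Gumberoo source:

    [code]
    unit GumberooTypes;
    {$mode objfpc}

    interface

    // Pick exactly one of these at build time; names are illustrative.
    {$define REAL_DOUBLE}

    type
    {$ifdef REAL_SINGLE}
      tReal = Single;    // 32-bit, matches GLfloat
    {$endif}
    {$ifdef REAL_DOUBLE}
      tReal = Double;    // 64-bit default
    {$endif}
    {$ifdef REAL_EXTENDED}
      tReal = Extended;  // 80-bit x87 extended (x86 targets only)
    {$endif}

    implementation

    end.
    [/code]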

    Quote Originally Posted by SilverWarior View Post
    Also, which type of string would you use (ANSI, UTF-8, Unicode)? Having only ANSI support will make your programming language less interesting for any programmer coming from Eastern Europe, Asia, or any other country which uses additional characters that are not present in the ANSI charset.
    I'm arguing with myself over that, because to be honest I hate complex character sets; they're a needlessly complex mess that until recently (relatively speaking) wasn't even involved on computers. I often feel that if we just restricted ourselves to 7-bit ASCII we wouldn't have a lot of the headaches that crop up on websites... and since my bread and butter for the past decade has been websites, that's only further made me HATE languages, or even normal text, that require anything more complex than that.

    Honestly, I'm tempted to restrict it to the character set used by an Apple IIe.

    BUT -- you are right, such ethnocentric views could limit the potential audience, something I'd like to avoid. At the same time, I'd be far, far more worried about the overhead of processing UTF-8 strings into a raster-based font system like my GLKernedFont method; which is what it's going to run, since being cross-platform I can't rely on any specific font engine being present, and freetype looks like ass and kerns text atrociously.

    Also, converting even one complete UTF-8 font set to raster and making a kerning table for it doesn't rank all that high on my to-do list either; maybe if I got my auto-generator for the kerning table completed... THOUGH...

    I could at least start adding the code hooks for extended codepage support; that way, other people who want it elsewhere could add that support themselves once I go public with the first release and codebase. That might be the best approach, since I'm not planning on this being a one-man operation forever... just until I hit beta.

    Partly it's to prove to myself I can still do this sort of thing. For the past six years I've been retired due to failing health, slowly weaning my clients off support... I'm starting to feel like the mind is slipping, and this project isn't just about helping others, but also about proving something to myself.

    Well, that and I have some really weird ideas on how an interpreter should work - and want to test said ideas without getting led off-track.
    The accessibility of a website from time to time must be refreshed with the blood of designers and owners. It is its natural manure

  2. #2
    Quote Originally Posted by deathshadow View Post
    Since I'm planning on it being OpenGL-aware, 32-bit floating point is likely the best minimum once rotations, gravity, momentum and drag are in the equation, much less the notion of resolution-independent rendering. I considered single precision, since that's what GLfloat is guaranteed to be
    OK, I understand that when you're dealing with graphics and physics, having 32-bit or 64-bit numbers can come in handy. But what about other cases? For instance, if you're making some shooter game, you will define your units' health with an integer, I presume. So what will be the max health for your units? Several million hitpoints? I don't think so. You will probably define your units' max health in the range of a few hundred hitpoints. So why do you need a 64-bit integer for this again?
    I know that having several different integer types can be quite confusing for beginners. In fact, the concept of the integer itself can be confusing.
    What I had in mind was support for 8-bit, 16-bit, 32-bit, etc. numbers, but with the programming language itself taking care of which of these is used (self-optimization).

    Quote Originally Posted by deathshadow View Post
    If the 'slowdown' of 64-bit numbers when you have a math coprocessor DESIGNED to handle numbers that size is an issue, you're probably doing something wrong.
    I was referring to using 64-bit integers on 32-bit processors (many platforms today still use 32-bit processing). Of course you can still do 64-bit math on a 32-bit processor, but for that you need two calculations instead of one: first you calculate the lower 32 bits, then the upper 32 bits, and finally you join the two results into the final 64-bit value. So 64-bit calculations on a 32-bit processor take at least twice as long.
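    In Pascal terms, the pairing looks roughly like this -- a sketch of what the compiler must emit for a single 64-bit addition on a 32-bit target:

    [code]
    program Carry64;
    {$mode objfpc}{$Q-}   // wrap-around arithmetic, as the hardware does it

    // One 64-bit addition done as two 32-bit halves plus a carry: roughly
    // what a compiler must emit for Int64 math on a 32-bit CPU.
    procedure Add64(ALo, AHi, BLo, BHi: LongWord; out RLo, RHi: LongWord);
    begin
      RLo := ALo + BLo;      // add the low 32 bits (wraps on overflow)
      RHi := AHi + BHi;      // add the high 32 bits
      if RLo < ALo then      // unsigned wrap means the low half carried...
        Inc(RHi);            // ...so propagate it into the high half
    end;

    var
      Lo, Hi: LongWord;
    begin
      Add64($FFFFFFFF, 0, 1, 0, Lo, Hi);   // $00000000FFFFFFFF + 1
      WriteLn(Hi, ':', Lo);                // -> 1:0, i.e. $0000000100000000
    end.
    [/code]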

    Quote Originally Posted by deathshadow View Post
    I'm arguing with myself over that, because to be honest I hate complex character sets; they're a needlessly complex mess that until recently (relatively speaking) wasn't even involved on computers. I often feel that if we just restricted ourselves to 7-bit ASCII we wouldn't have a lot of the headaches that crop up on websites... and since my bread and butter for the past decade has been websites, that's only further made me HATE languages, or even normal text, that require anything more complex than that.
    I do understand your point of view. More complex charsets do cause much more complexity. But how would you feel if you had a development tool or programming language which doesn't allow you to display some specific characters used in your language?

    Quote Originally Posted by deathshadow View Post
    I'd be far, far more worried about the overhead of processing UTF-8 strings into a raster-based font system like my GLKernedFont method; which is what it's going to run, since being cross-platform I can't rely on any specific font engine being present, and freetype looks like ass and kerns text atrociously.
    I don't see how it would be difficult to render TrueType fonts. I will learn shortly, because I intend to make functions which will allow me to do that in the Asphyre graphics engine. If it works well, I do intend to share the source code.

    Quote Originally Posted by deathshadow View Post
    Well, that and I have some really weird ideas on how an interpreter should work - and want to test said ideas without getting led off-track.
    Testing weird ideas isn't bad at all. In fact, every invention was at first thought of as a weird idea.

  3. #3
    Quote Originally Posted by SilverWarior View Post
    OK, I understand that when you're dealing with graphics and physics, having 32-bit or 64-bit numbers can come in handy. But what about other cases? For instance, if you're making some shooter game, you will define your units' health with an integer, I presume. So what will be the max health for your units? Several million hitpoints? I don't think so. You will probably define your units' max health in the range of a few hundred hitpoints. So why do you need a 64-bit integer for this again?
    Remember this is an interpreter, not a compiler; handling multiple data types could add as much overhead, if not more... PHP comes to mind, with its total lack of strict typecasting while internally still having types; all those data conversions on every access/assignment can end up taking just as long as calling the FPU -- be it ARM's VFP, legacy x87, or SSE.

    Spending time in an interpreter, even after tokenizing, on automatically selecting the optimal type for a value or range of values can end up just as big and slow as simply operating on a single fixed type. All those branches, calls and conditionals add up quickly... certainly to more than the 2-to-8-clock difference between a 32-bit integer operation on the CPU and a 64-bit one on the FPU. (At least on x86... I'm still learning ARM.)
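    To put that in concrete terms, here's the sort of per-operation dispatch a self-optimizing, multi-width value type forces on an interpreter -- a sketch with hypothetical names, not Gumberoo code:

    [code]
    program TagDispatch;
    {$mode objfpc}

    type
      TValKind = (vkInt32, vkInt64, vkFloat);
      TValue = record
        case Kind: TValKind of
          vkInt32: (I32: LongInt);
          vkInt64: (I64: Int64);
          vkFloat: (F: Double);
      end;

    function AsFloat(const V: TValue): Double;
    begin
      case V.Kind of
        vkInt32: Result := V.I32;
        vkInt64: Result := V.I64;
      else
        Result := V.F;
      end;
    end;

    function AsInt64(const V: TValue): Int64;
    begin
      if V.Kind = vkInt32 then Result := V.I32
      else Result := V.I64;
    end;

    // Every '+' in the interpreted program runs these branches first;
    // with one fixed tReal, the whole thing collapses to a single FADD.
    function AddValues(const A, B: TValue): TValue;
    begin
      if (A.Kind = vkFloat) or (B.Kind = vkFloat) then
      begin
        Result.Kind := vkFloat;
        Result.F := AsFloat(A) + AsFloat(B);
      end
      else if (A.Kind = vkInt64) or (B.Kind = vkInt64) then
      begin
        Result.Kind := vkInt64;
        Result.I64 := AsInt64(A) + AsInt64(B);
      end
      else
      begin
        Result.Kind := vkInt32;
        Result.I32 := A.I32 + B.I32;
      end;
    end;

    var
      A, B: TValue;
    begin
      A.Kind := vkInt32; A.I32 := 2;
      B.Kind := vkFloat; B.F := 0.5;
      WriteLn(AddValues(A, B).F:0:2);   // 2.50, after one widening branch
    end.
    [/code]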

    Quote Originally Posted by SilverWarior View Post
    I was referring to using 64-bit integers on 32-bit processors (many platforms today still use 32-bit processing). Of course you can still do 64-bit math on a 32-bit processor, but for that you need two calculations instead of one: first you calculate the lower 32 bits, then the upper 32 bits, and finally you join the two results into the final 64-bit value. So 64-bit calculations on a 32-bit processor take at least twice as long.
    I get that, and in a way it's part of why I'm not bothering to have integer types at all.

    By going straight to the math coprocessor, on x87 that's simply an FLD, the operation (FMUL, FADD, FSUB, FDIV), then an FSTP -- not really any more or less code than mov eax,mem; mov ebx,mem; mul ebx; mov mem,eax.

    At most, an FPU double multiply (for example) on anything x87 Pentium or newer is 12 bus clocks of memory access, 12 bus clocks of code fetch and 6 CPU clocks of execution (including setting up the FPU memory pointer)... A 32-bit integer multiply on the same hardware might be only 6 bus clocks of memory access, but it's 20 bus clocks of code fetch and 4 CPU clocks... So while they may look like they take the same amount of time, remember the bus isn't as fast as the CPU; as such, on modern computers it is often FASTER to do a 64-bit floating point multiplication than a 32-bit integer one -- just because the 386-style instructions are that extra byte in length, meaning a longer wait for the code fetch.

    Of course, if you can optimize the assembly to put everything into proper registers you can shift that back around, but that's more the type of thing for a compiler to do, not an interpreter.

    ... and while that's the Wintel way of doing things, you also have to remember that ARM lacks an integer divide, while the various VFP/VFE/SIMD/NEON extensions (whatever they want to optionally include this week) do tend to provide one. Of course, there's the issue of not being able to rely on which FPU extensions are even available (if any) on ARM, and whether FPC even bothers to emit code for them -- a concern I'm going to have to play with in QEMU. I know the Cortex-A8 provides NEON, which uses 64-bit registers despite being hooked to a 32-bit CPU.

    After all, that's why SIMD and its kin exist, and why the x87 was a big deal back in the day... since an 8087 was basically a memory-oriented 80-bit FPU sitting next to a 16-bit processor.

    It is a good point, though, that I should 'check it'... I'll probably toss together a synthetic benchmark tomorrow to gauge the speed differences, if any... though constant looping will likely trigger the various caches, so I'll probably have to make a version that puts a few hundred K of NOPs in place to flush the cache between operations.
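    A rough sketch of such a benchmark (loop counts arbitrary; the accumulators are printed purely to keep the optimizer from discarding the loops):

    [code]
    program MathBench;
    {$mode objfpc}
    uses SysUtils;

    const
      Rounds = 100000000;
    var
      i: LongInt;
      iAcc: Int64;
      fAcc: Double;
      t0: QWord;
    begin
      iAcc := 3;
      t0 := GetTickCount64;
      for i := 1 to Rounds do
        iAcc := (iAcc * 7) and $FFFFFFF;   // keep the value bounded
      WriteLn('Int64 : ', GetTickCount64 - t0, ' ms  (', iAcc, ')');

      fAcc := 3.0;
      t0 := GetTickCount64;
      for i := 1 to Rounds do
        fAcc := fAcc * 1.0000001;          // stays finite over 1e8 rounds
      WriteLn('Double: ', GetTickCount64 - t0, ' ms  (', fAcc:0:4, ')');
    end.
    [/code]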

    I'm also used to thinking x86, where the 'optimize with integers' advice hasn't really held true since the Pentium dropped... I've really got a lot of studying of ARM to do -- and of the code FPC generates for ARM. I mean, does it even try to use SIMD/VFP if available?

    Quote Originally Posted by SilverWarior View Post
    I don't see how it would be difficult to render TrueType fonts. I will learn shortly, because I intend to make functions which will allow me to do that in the Asphyre graphics engine. If it works well, I do intend to share the source code.
    Wait until you try using the train wreck known as freetype -- it's rubbish, pure and simple... There's a reason so many SDL and OpenGL programs don't even bother with it and use raster fonts instead... The rendering is ugly, inconsistent and painfully slow, and the code interfaces are the worst type of tripe this side of trying to write a device driver for Linux.

    I was thinking I could use monospaced and/or mono-kerned fonts instead of true kerning; that would make it simpler and faster, and since it's going to have an editor, it will need a monospaced font anyway. Vector fonts are a high-resolution luxury that I don't think translates well to composite-scale resolutions in the first place; see the old BGI vector fonts at CGA resolutions.

    I may also keep the editor strictly SDL, leaving OpenGL for when a program is running in the interpreter. Still need to play with the idea. I'm nowhere near working on that part yet, as my first order of business is getting the tokenizer and bytecode interpreter complete to the point where it can at least run a console program. THEN I'll worry about the IDE, graphics, fonts, etc...
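    For the curious, the general shape of such a stack-based bytecode loop -- the opcodes and program here are hypothetical, not Gumberoo's actual design:

    [code]
    program MiniVM;
    {$mode objfpc}

    type
      TOpCode = (opPush, opAdd, opMul, opPrint, opHalt);
      TInstr = record
        Op: TOpCode;
        Arg: Double;           // tReal in a real build
      end;

    const
      // Bytecode for: print (2 + 3) * 4
      Prog: array[0..5] of TInstr = (
        (Op: opPush; Arg: 2), (Op: opPush; Arg: 3), (Op: opAdd; Arg: 0),
        (Op: opPush; Arg: 4), (Op: opMul; Arg: 0), (Op: opPrint; Arg: 0));
    var
      Stack: array[0..63] of Double;
      SP: Integer = 0;         // stack pointer
      PC: Integer = 0;         // program counter
    begin
      while PC <= High(Prog) do
      begin
        case Prog[PC].Op of
          opPush:  begin Stack[SP] := Prog[PC].Arg; Inc(SP); end;
          opAdd:   begin Dec(SP); Stack[SP - 1] := Stack[SP - 1] + Stack[SP]; end;
          opMul:   begin Dec(SP); Stack[SP - 1] := Stack[SP - 1] * Stack[SP]; end;
          opPrint: begin Dec(SP); WriteLn(Stack[SP]:0:4); end;
          opHalt:  Break;
        end;
        Inc(PC);
      end;
    end.
    [/code]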
    Last edited by deathshadow; 16-04-2012 at 09:59 AM.
    The accessibility of a website from time to time must be refreshed with the blood of designers and owners. It is its natural manure

  4. #4
    Quote Originally Posted by deathshadow View Post
    Remember this is an interpreter, not a compiler
    Pardon me, it seems I misunderstood your intention. I got the feeling that you intended to make a completely new programming language from scratch.

    BTW, how come you decided not to include support for pointers? I do understand that for a beginner they might seem very complicated, but once you learn to work with them, you realize they can be very powerful if used correctly.

  5. #5
    Quote Originally Posted by SilverWarior View Post
    Pardon me, it seems I misunderstood your intention. I got the feeling that you intended to make a completely new programming language from scratch.
    It is... but it's an interpreted one like PHP, Python, Perl, ROM BASIC, JavaScript (and, if you don't buy into this virtual machine BS, Java), etc. -- instead of a compiled one. At this point, being 'truly original' is a joke (given 90% of programming languages are just C rehashed -- see Java and PHP), but there are some ways to make things simpler.

    A lot of the choices come from Python and Ruby, minus the parts that to me are either needlessly complex, pointless, or annoying. (Annoying? Elif much?!?)

    Quote Originally Posted by SilverWarior View Post
    BTW, how come you decided not to include support for pointers? I do understand that for a beginner they might seem very complicated, but once you learn to work with them, you realize they can be very powerful if used correctly.
    First, as you said, it's a complicated concept; one I really wasn't able to grasp until junior high, and I was pretty quick on the uptake for this sort of stuff (given I'd written my first business app in DIBOL at 13).

    But more importantly, given the other stuff I've omitted, what purpose would they serve? There's no direct memory access, no complex data structures like records or userland objects, and arrays are (hopefully) going to be automatically dynamic, since I'll be using "array of" with SetLength on the back end, as sketched below... Pointers for the sake of having pointers is... pointless.
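    A minimal sketch of that back end, assuming a grow-on-write policy (my illustration of the idea, not settled behaviour):

    [code]
    program AutoGrow;
    {$mode objfpc}

    type
      TValueArray = array of Double;   // tReal in the real codebase

    // Auto-grow on write, so interpreted programs never declare bounds.
    procedure StoreAt(var Arr: TValueArray; Index: LongInt; const Value: Double);
    begin
      if Index >= Length(Arr) then
        SetLength(Arr, Index + 1);     // new slots are zero-filled by FPC
      Arr[Index] := Value;
    end;

    var
      V: TValueArray;
    begin
      StoreAt(V, 9, 3.14);             // no prior dimensioning needed
      WriteLn(Length(V), ' slots, V[9]=', V[9]:0:2);
    end.
    [/code]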

    Plenty of real-world deployment languages get along just fine without them. PHP doesn't have pointers... Java doesn't have them either; it has handles, which will work much akin to my 'createxxxxx' methods...

    They're an advanced concept best left to an intermediate or higher-level language, not an elementary one.

    Though I've been beating myself up over including/not including all sorts of things like pointers for the past week and a half, ever since I started this project on a cocktail napkin during a "brainstorming luncheon". I keep having to remind myself: "don't make it any more complicated than a 1980s ROM BASIC, just let it do more."

    Which means no scope, no pointers, no user functions, no typecasting of numerics, no complex data structures.
    Last edited by deathshadow; 16-04-2012 at 12:25 PM.
    The accessibility of a website from time to time must be refreshed with the blood of designers and owners. It is its natural manure

  6. #6
    Don't beat yourself up, stick to your guns!

  7. #7
    I have to be honest, though: pointers are somewhat of a double-edged sword. Used correctly they can create some absolutely marvelous and innovative code, especially in an OOP context. The downside, however, is that they can also create absolute debugging hell. In my case, I often find myself using what I refer to as 'runlevels': an array of procedures, sometimes 2D, which the program can modify on the fly to change its behaviour when conditions are met. However, due to the generic context of many of these procedures, an error at a memory address is not so useful -- it's not the actual procedure that was at fault, since the code there was working just a minute ago -- rather, a pointer has moved to the wrong procedure/variable and fed the wrong array into the wrong procedure, or something similar.
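    For anyone who hasn't met the pattern, a minimal sketch of that 'runlevels' idea (names illustrative):

    [code]
    program RunLevels;
    {$mode objfpc}

    type
      TRunProc = procedure;

    procedure Idle;   begin WriteLn('idle');   end;
    procedure Combat; begin WriteLn('combat'); end;

    var
      RunLevel: array[0..1] of TRunProc;
    begin
      RunLevel[0] := @Idle;
      RunLevel[1] := @Combat;
      RunLevel[0]();           // dispatch through the table
      RunLevel[0] := @Combat;  // rewired on the fly -- which is exactly where
      RunLevel[0]();           // the wrong-procedure bugs described above creep in
    end.
    [/code]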

    I guess what I'm getting at is that although GDB can point to your code being faulty, pointers are more of a higher-level 'logic' problem. I can definitely see why it's a good idea to keep people who are new to code away from such nasties; however, I would recommend something similar to pointers, as when they eventually become more adept at programming, they will inevitably want to create more complex programs and experiment with that concept -- something all too present in the real world. Perhaps implementing something similar to pointers, but in a more structured context, would be a good idea in that case?
    Just my 2 cents... Great work anyhow.
    I once tried to change the world. But they wouldn't give me the source code. Damned evil cunning.
