Page 1 of 2 12 LastLast
Results 1 to 10 of 11

Thread: Expression to ASM

  1. #1

    Expression to ASM

    hi there, this is not related to game development, .. i am writing a compiler, so far the compiler supports many features, but is missing 1 vital thing, the ability to accept expressions, i have tried to come up with a way to do this but i always run into problems, the language allows for writing in assembly and higher level in the same code scope so what i need is a way to output assembly code (my assembler will assemble the code afterwards)

    eax = myvar2 + (edx + ecx + (5 * edx)) - 5 xor myvar1;

    this sort of thing, i need a way to output equiv assembly opcodes/operations. can anyone help me with this problem? Thanks in advance
    Last edited by Colin; 16-05-2011 at 01:00 PM.

  2. #2
    This is an advanced topic. The best is to build a recursive analyzer. May be Crenshaw's paper helps you.
    No signature provided yet.

  3. #3
    PGD Staff code_glitch's Avatar
    Join Date
    Oct 2009
    Location
    UK (England, the bigger bit)
    Posts
    933
    Blog Entries
    45
    Hmm.. That is some very interesting stuff there Nuno... If I have time I might look into the possibility of a compiler designed to do graphics in very high speed or something to that effect. Ie. No bloat from libs as they are run at HW speed which would be nice indeed. Good luck on your compiler. If I was to evaluate expressions I would do something like:

    1) find the number of brackets
    2) for each bracket 'level' or sub brackets work those out and feed that back into the processor until the bracket level reaches 0
    3) In each process I would identify what is a number and what is not (ie numeric first char or otherwise) and sub in values where required.
    4) Process each part in BODMAS order (of if statements or similar)

    Don't know if it helps but if nicely implemented it might run well. I'm working on an interpreter for rapid prototyping and debugging of my APIs and programs so I'm getting some similar challenges. Hence the use of the document Nuno posted: take interpreter and feed into binary. Done.
    I once tried to change the world. But they wouldn't give me the source code. Damned evil cunning.

  4. #4
    One thing to take into account is operator priority. Let's say you have the expression
    1 + 2*4 + 8

    If you analyse from left to right, rirst you encounter 1, then '+', then 2. So you temporarily conclude that it is an addition of 1 and 2 :
    +(1, 2)

    Then you find '*' and 4, which have a higher priority, so multiplication is included into the '+' and combined with the 2 :

    +(1, *(2,4))

    Then you find '+' and 8, which has a lower priority, so it gets out of the multiplication. It has the same proprity as the first '+', and it is generally assumed that operations are made from left to right, so it gets out of the last addition too. Thus, the full expression is combined with the 8 :

    +( +(1, *(2,4)), 8 )

    Let's check :

    *(2,4) = 8
    +(1, 8 ) = 9
    +(9, 8 ) = 17

    Notice that to implement it this way, you need to remember the current position in the expression, which can be a value token, an operation token, or nothing if there is nothing parsed, or if you get to the global level with a low priority.

    A parenthesis is to be treated as a whole, until the corresponding end bracket is found. You can implement it recursively, using the same function to return the subexpression result, treated as a value token.

    Let's try with :
    1+ 2*(4+1) + 3

    tree : none
    current operation : nil
    potential left operand : nil

    input: + 2*(4+1) + 3

    tree : token(1)
    current operation : nil
    potential left operand : token(1)

    input: *(4+1) + 3

    => combine left and right operands

    tree : +(1,2)
    current operation : +(1,2)
    potential left operand : current operation

    => '*' has a higher prority, so get inside

    current operation : +(1,2)
    potential left operand : token(2)

    => after reading '*', do a recursive call that returns +(4,1)

    input : +3

    => combine left and right operands, which gives *(2, +(4,1))

    => replace token in current operation

    tree : +(1, *(2, +(4,1) ) )
    current operation : *(2, +(4,1))
    potential left operand : current operation

    => '+' has a lower priority and is from left to right so

    current operation : nil
    potential left operand : +(1, *(2, +(4,1) ) ) = tree

    => combine left and right operands

    input : none
    tree : +( +(1, *(2, +(4,1) ) ), 3)
    current operation : tree
    potential left operand : tree

    Note that the current operation may not be the whole tree, so you need to loop to search the parent.

    Then there are two cases :
    - either there is no current operation and one potential left operand : you can decide to raise an error or do a function call
    - or there is a current operation, which can have a result or not (an assignment has the lowest priority and has no result)

    To handle parameters of functions, you need to handle the parenthesis differently depending on the situation. If you have a potential left operand and have no operator, then it is a parameter list. If you've just read an operator, then it is not a parameter list.

    Hope this helps
    Last edited by circular; 17-05-2011 at 08:30 AM.

  5. #5
    hi there, thanks for replies, i have decided on first my compiler will convert the expression to an RPN (reverse polish notation), this will remove the problem of () and procedence of operators, my only problem now is the best way to output the opcodes, because i will be limited to registers since the programmer could also use them, maybe an option would be to use sse, but tbh i would rather not.

    btw the problem with recursion is if the programmer has a very large expression, it would use large stack space, and maybe even an overflow. as for the compiler, i am already far beyond the basics, the compiler already has many features and very usable, you can check it out if you like - http://www.codeziron.com
    Last edited by Colin; 17-05-2011 at 09:52 AM.

  6. #6
    good luck !
    Current (and lifetime) project: FAR Colony
    https://www.farcolony.com/

  7. #7
    Nice. Reverse Polish notation would also be my approach; It's really elegant.

    Can't wait to hear more from your compiler project.
    Coders rule nr 1: Face ur bugz.. dont cage them with code, kill'em with ur cursor.

  8. #8
    thanks farcodev,

    @chronozphere:
    i finished with the rpn converter, i have also came up with a way to assemble the RPN to machine code without it interupting the programmers code (altho i have not implemented this yet and will not for a while - since it needs alot of work), using 5 registers and some stack space for any other vars needed (not using push and pop), check the link back in other post if you want to download and check it out, it contains a massive changelog, i have not added much documentation yet, but what i have written is on the website under docuementation, there is also some samples posted, syntax and changes are open for sugestion and ideas too.

    At some point i wish to write a small 3d engine that can be suplied with the compiler, i will be writing a newton header and few other things i have in mind, overall the language is moving forward very good - i converted one of my c algorithms to ziron and it went down from 1048ns to 276ns (average), also one of my delphi apps went from 2000ms down to around 500ms (dont remember exactly this), basically the language adds much less garbage, and since most of it is just plain assembly language you get what you write, but even after i add more high level syntax, i plan to keep it adding just a minimum amount of code.

    the language has an inline macro system, class support, structs, dll, exe output (32bit for now only, i plan to add 64bit at a later date), anyways thanks for the interest and help.

  9. #9
    About expressions the hardest part (imo) is parsing and generating code for conditionals like "if(a || b && c)" etc where you need to generate compare-and-branch code. There are many good books on the topic of compiler construction, I found lots of good examples in this one.

    Generating native code with register allocation is another topic that is very advanced. There is the alternate path that Adobe has chosen which is to work with LLVM instead and let it generate high quality code for several different architectures.

    You should publish examples of the timings you talk about because it can gather interest to your compiler if it actually does perform better than a C-compiler. Frankly I doubt it but I hold my judgment until I see it

    Looks like an interesting project, good luck with it!
    ZGameEditor - Develop 64kb games for Windows.
    Thrust for Vectrex - ROM-file and 6809 source code.

  10. #10
    Wow .. again, this is really interesting. If I had more time, I could indulge into this field of expertise aswell.

    You are using a separate assembler right? Because writing plain IA-32 machine code is sheer hell. Also, do you allready use a linker or does it only generate a single module with all the right addresses directly in it? If this project goes any deeper, you could write a nice article about it. Any plans of showing us the source?

    Also your timings are very suprising to me. I also doubt that it will be significantly faster than C, but I am very curious about it's performance anyway (and the things you have to give up in order to squeeze more code into the same nanoseconds).
    Coders rule nr 1: Face ur bugz.. dont cage them with code, kill'em with ur cursor.

Page 1 of 2 12 LastLast

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •