
Thread: Optimizing in Free Pascal.

  1. #1

    Optimizing in Free Pascal.

    Now that we have fixed the last known bug in Allegro, I wonder if I can optimize it before the next release.

    First, I've read that the INLINE keyword only has an effect in the unit where it's used. That would mean I can't use it to avoid the extra call at all. Is that true? Most functions and procedures are just wrappers around the actual call. For example, the sprite drawing procedure:
    Code:
    PROCEDURE al_draw_sprite_ex (bmp, sprite: AL_BITMAPptr; x, y, mode, flip: LONGINT);
    BEGIN
     bmp^.vtable^.draw_sprite_ex (bmp, sprite, x, y, mode, flip);
    END;
    Another group of procedures uses wrappers to make it more Pascal-like and/or use Pascal data types:
    Code:
    (* Function for messages. *)
     PROCEDURE _allegro_message (CONST msg: PCHAR); CDECL;
      EXTERNAL ALLEGRO_SHARED_LIBRARY_NAME NAME 'allegro_message';
    
    (* Outputs a message. *)
     PROCEDURE al_message (CONST msg: STRING);
     BEGIN
      _allegro_message (PCHAR (msg));
     END;
    I think that just "inlining" those calls (and there are a lot of them) should raise performance a lot, because almost all sprite and polygon drawing routines are like the previous ones. Actually, they're implemented as macros in C (you know, "#define ...").
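
    A minimal sketch of what an inlined wrapper could look like, assuming FPC with inlining enabled. Everything here (TBitmap, TVTable, FakeDraw, draw_sprite) is a hypothetical stand-in and not the real Allegro declarations; it only shows where the INLINE directive goes on a wrapper around an indirect call. Whether the compiler actually inlines it is its own decision.

    ```pascal
    PROGRAM InlineWrapperSketch;
    {$MODE OBJFPC}
    {$INLINE ON}  (* Directive form of the -Si command-line option. *)

    TYPE
      TBitmapPtr = ^TBitmap;
      TDrawProc = PROCEDURE (bmp, sprite: TBitmapPtr; x, y: LONGINT);
      TVTable = RECORD
        draw_sprite: TDrawProc;
      END;
      TBitmap = RECORD
        vtable: ^TVTable;
        calls: LONGINT;  (* Counts how often the driver routine ran. *)
      END;

    (* Hypothetical driver routine that the vtable points to. *)
    PROCEDURE FakeDraw (bmp, sprite: TBitmapPtr; x, y: LONGINT);
    BEGIN
      INC (bmp^.calls);
    END;

    (* The wrapper.  INLINE asks the compiler to expand the body at each
       call site, leaving only the indirect call through the vtable. *)
    PROCEDURE draw_sprite (bmp, sprite: TBitmapPtr; x, y: LONGINT); INLINE;
    BEGIN
      bmp^.vtable^.draw_sprite (bmp, sprite, x, y);
    END;

    VAR
      vt: TVTable;
      screen, sprite: TBitmap;
      i: LONGINT;
    BEGIN
      vt.draw_sprite := @FakeDraw;
      screen.vtable := @vt; screen.calls := 0;
      sprite.vtable := @vt; sprite.calls := 0;
      FOR i := 1 TO 100 DO
        draw_sprite (@screen, @sprite, i, i);
      WriteLn ('Driver calls: ', screen.calls);
    END.
    ```

    The behaviour is the same with or without inlining; only the number of call instructions executed per sprite should change.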

    Another question: how can I profile FPC programs? I used gprof some years ago but I don't remember how it works. Can I use it with FPC?

  2. #2

    Re: Optimizing in Free Pascal.

    I've been looking for a good Profiler for FPC for a while now to little or no avail. You can use Valgrind on Linux (http://lazarusroad.blogspot.com/2007...ofile-fpc.html shows a start) but for Windows there really isn't an answer. Lots of people have talked about porting DelphiTools' Sampling Profiler (http://delphitools.info/samplingprofiler/) but I've not seen a single one complete the task (don't know how hard it would really be).

    This page (http://www.freepascal.org/docs-html/user/userse56.html) seems to allude to using gprof with the -pg compiler flag, but again this seems to be a Linux-only solution.

    I'll be interested to see if anyone else finds anything beyond this.

    UPDATE:
    Looks like built-in profiling is broken (http://wiki.freepascal.org/Profiling...ofiler_support) except in trunk 251

    - Jeremy

  3. #3

    Re: Optimizing in Free Pascal.

    Quote Originally Posted by Ñuño Martínez
    I think that just "inlining" those calls (and there are a lot of them) should raise performance a lot, because almost all sprite and polygon drawing routines are like the previous ones. Actually, they're implemented as macros in C (you know, "#define ...").
    How much is "a lot"? I can hardly imagine optimizing this would give any noticeable performance increase.

    But I try to avoid wrapping single functions as much as possible. If I can call it directly from a unit, like the OpenGL header, I do that.

  4. #4

    Re: Optimizing in Free Pascal.

    Quote Originally Posted by jdarling
    UPDATE:
    Looks like built-in profiling is broken (http://wiki.freepascal.org/Profiling...ofiler_support) except in trunk 251
    It says it's broken only for gprof. I'll try with Valgrind.

    Thanks for the suggestion.

    Quote Originally Posted by User137
    How much is "a lot"? I can hardly imagine optimizing this would give any noticeable performance increase.
    "A lot" is "more than a little"... or something. Actually I don't know how much, but it should matter in some cases. Any game will draw/blit more than 100 bitmaps per frame, so currently that means 200 calls (with parameter passing): one to the wrapper and one to the actual routine for each bitmap. Using INLINE would reduce that to 100 calls. Add the calls for input testing (mouse, joystick...) and sound. Yesterday I was rewriting a voxel renderer I wrote some years ago in C. It tests several thousand voxels each frame, at 30 frames per second. That is a lot of calls.

    Quote Originally Posted by User137
    But I try to avoid wrapping single functions as much as possible. If I can call it directly from a unit, like the OpenGL header, I do that.
    My fault. I tried to make the API Pascal-like and use Pascal types (i.e. STRING, ARRAY OF SOMETHING) instead of the C ones (pointers, pointers, and more pointers...), which would force typecasting in a lot of calls.

  5. #5

    Re: Optimizing in Free Pascal.

    Inline those, not only faster, but smaller binary
    From brazil (:

    Pascal pownz!

  6. #6
    de_jean_7777 (PGDCE Developer)

    Re: Optimizing in Free Pascal.

    Quote Originally Posted by arthurprs
    Inline those, not only faster, but smaller binary
    While the above statement might be true in some cases, it is usually wrong. The entire code of an inline routine is placed wherever the routine is called, so if you have sufficiently complex routines, this results in bigger code, not smaller.

    Inline routines are inlined throughout the entire program, not only in the units that contain them; this is at least true for FPC (version 2.0.2 or greater). However, this depends on the compiler, which may decide that some routines cannot be inlined, in which case they are executed as regular routine calls.

    Depending on the nature of a routine, an inline routine may be up to 2x faster than a non-inline routine (because the overhead of calling the routine is non-existent). This can be verified with a simple check: write an inline routine in a unit that performs some mathematical operation (e.g. normalization of a vector) and call it from a program many times (1,000,000 or more). Measure the times for the inlined and the non-inlined versions of the routine. The difference can be seen.
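
    A rough sketch of such a check, under a few assumptions: it uses a squared vector length instead of a full normalization, it keeps both routines in the program file for brevity (so it does not exercise the cross-unit case), and it relies on GetTickCount64 from SysUtils, which needs a reasonably recent FPC. The names TVec, LenSqPlain and LenSqInline are made up for the example.

    ```pascal
    PROGRAM InlineTiming;
    {$MODE OBJFPC}
    {$INLINE ON}  (* Directive form of the -Si command-line option. *)

    USES
      SysUtils;

    TYPE
      TVec = RECORD
        x, y: DOUBLE;
      END;

    (* Regular routine: every use is a real call. *)
    FUNCTION LenSqPlain (CONST v: TVec): DOUBLE;
    BEGIN
      Result := v.x * v.x + v.y * v.y;
    END;

    (* Same body, but the compiler is allowed to expand it in place. *)
    FUNCTION LenSqInline (CONST v: TVec): DOUBLE; INLINE;
    BEGIN
      Result := v.x * v.x + v.y * v.y;
    END;

    CONST
      Iterations = 10000000;
    VAR
      v: TVec;
      i: LONGINT;
      sumPlain, sumInline: DOUBLE;
      t0, tPlain, tInline: QWORD;
    BEGIN
      v.x := 3; v.y := 4;
      sumPlain := 0; sumInline := 0;

      (* Time the non-inlined version. *)
      t0 := GetTickCount64;
      FOR i := 1 TO Iterations DO
        sumPlain := sumPlain + LenSqPlain (v);
      tPlain := GetTickCount64 - t0;

      (* Time the inlined version. *)
      t0 := GetTickCount64;
      FOR i := 1 TO Iterations DO
        sumInline := sumInline + LenSqInline (v);
      tInline := GetTickCount64 - t0;

      WriteLn ('plain:  ', tPlain, ' ms');
      WriteLn ('inline: ', tInline, ' ms');
      (* Both loops must compute the same value. *)
      WriteLn ('same result: ', sumPlain = sumInline);
    END.
    ```

    The timings depend heavily on the compiler version, the optimization options and the machine, so treat the numbers as a comparison between the two loops only.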
    Existence is pain

  7. #7

    Re: Optimizing in Free Pascal.

    Quote Originally Posted by de_jean_7777
    Quote Originally Posted by arthurprs
    Inline those, not only faster, but smaller binary
    While the above statement might be true in some cases, it is usually wrong. The entire code of an inline routine is placed wherever the routine is called, so if you have sufficiently complex routines, this results in bigger code, not smaller.

    Inline routines are inlined throughout the entire program, not only in the units that contain them; this is at least true for FPC (version 2.0.2 or greater). However, this depends on the compiler, which may decide that some routines cannot be inlined, in which case they are executed as regular routine calls.

    Depending on the nature of a routine, an inline routine may be up to 2x faster than a non-inline routine (because the overhead of calling the routine is non-existent). This can be verified with a simple check: write an inline routine in a unit that performs some mathematical operation (e.g. normalization of a vector) and call it from a program many times (1,000,000 or more). Measure the times for the inlined and the non-inlined versions of the routine. The difference can be seen.
    In this case it will probably save a few bytes: instead of calling the wrapper, which then calls the target, the code just calls the target.

    You should inline those very small functions that are called a lot of times.
    From brazil (:

    Pascal pownz!

  8. #8

    Re: Optimizing in Free Pascal.

    After adding "INLINE" to a lot of procedures and functions, I did some tests compiling with and without the "-Si" option, but I can see almost no difference. I'm not sure whether the compiler is deciding my code can't be "inlined".

  9. #9

    Re: Optimizing in Free Pascal.

    I don't know if it works for freepascal and/or lazarus (probably not), but there is a free Delphi profiler program you can find here:

    http://delphitools.info/

    cheers,
    Paul

  10. #10
    phibermon (PGD Staff / News Reporter)

    Re: Optimizing in Free Pascal.

    I'm guilty of not profiling my code. I just do my best: I write each new processor-intensive task in a separate test app, set some arbitrary usage pattern for the test that stays fixed from then on, and get it running as quickly as I can before I get fed up with optimizing.

    That way if I ever decide that I need to optimize some more I can just go back to any suspiciously expensive test app and poke and prod it a bit more.

    Obviously this won't work for any task with multiple steps that can't be divided out into separate test apps due to mutual dependence, and it's not a true test of a typical usage pattern in the system as a whole.

    But it works for me and encourages a good modular design.

    Edit: I'm informed that this technique is similar to extreme programming (http://www.extremeprogramming.org/).

    An interesting idea.
    When the moon hits your eye like a big pizza pie - that's an extinction level impact event.
