Page 1 of 2
Results 1 to 10 of 14

Thread: Benchmarks! FPC vs Delphi vs C++

  1. #1

    Benchmarks! FPC vs Delphi vs C++

    Use
    http://babelfish.altavista.com/babelfish/tr
    to translate this forum page
    http://www.gamedev.ru/flame/forum/?id=78283&page=22
    from Russian - and you'll find some interesting stuff (you can also download the sources and binaries there).

    There's an ongoing benchmark there that builds a 3000x4000 pixel Mandelbrot fractal image.

    On my machines the results are (milliseconds):

    Code:
    The test time in milliseconds. Both CPUs run at 1.6 GHz.

    Version               Intel Dual Core   AMD Sempron 2400
    MSVC8 (single, sse)   3100              crashed
    turbo delphi-double   7150              8280
    turbo delphi-single   5400              5157
    fpc-double            12050             8734
    fpc-double-sse2       4800              crashed
    fpc-single            4970              4625
    fpc-single-sse2       4460              4875
    All versions are required to save their fractal as a bmp, to avoid mistakes.

    At the very least, this shows that
    1) it's unwise to use the Double type in game development.
    2) Free Pascal 2.2.0 has the best code optimizer among all things Pascal

  2. #2
    farcodev_
    Guest

    Interesting, I'll use Single only in my project now :shock:

  3. #3

    Uhm, so Double floating-point operations are not very well optimized :?
    From Brazil (:

    Pascal pownz!

  4. #4

    To be honest, I was using the Single type from the very beginning.

  5. #5

    This topic is still alive.

    I've downloaded some early tests (100x100, double). Results (on P3 celeron):
    Cpp - ~1100ms
    D7 - ~1700ms
    FPC 2.2.0 - same as D7.
    On a C2 Duo the results are simply faster for all the compilers.

    The optimization flag in the options doesn't affect performance directly. Whether Deep and Scale are declared as constants or as variables doesn't matter either. But the order of declaration is important (alignment?).

    FPC's performance is the same as Delphi's with the Double type and slightly higher with the Single type. This is good news, because earlier versions of FPC were slower than Delphi.

    What I think about it:
    This test consists of FPU computations only. No random memory access (you can comment out writing to the array with almost no boost, everything is in cache), no API calls. Pure computations. Not a real case (unless you are writing a physics engine).
    Delphi does not optimize FPU code at all (FWAITs, caring about FPU exceptions, use of slow instructions like SAHF, etc.).
    On the other hand, this case is the easiest one for a code optimizer (where one is present).
    Nevertheless, the slowdown is about 50-60%, which is not critical for such an ideal case.

    I see the later benchmarks have some means of checking the result.
    I used a more accurate test:
    Code:
    Digest := 0;
      for dy := 0 to height -1 do for dx := 0 to width - 1 do Digest := Digest + pix[dy, dx];
    Digest should be the same for all versions. I won't be surprised if some heavily optimized versions give wrong results or lose performance once the computation results are actually used.
    UPD: Digest is wrong for FPC with the "OG2" or "OG3" options.
    But that's good too, because this test can serve as a bug report.
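
    For anyone who wants to repeat the Digest check, here is a minimal sketch of a standalone program, not the exact code I ran. It uses the same loop as the benchmark on a small grid; the names digestcheck, BuildFractal and ComputeDigest and the 300x400 size with Scale = 0.008 are my own assumptions for illustration. The accumulator is an Int64 because on the full 3000x4000 image a LongInt sum could overflow. Build it with each compiler and option set and compare the printed number. Builds using the same Float type should agree; a difference between a Single and a Double build may just be rounding near the fractal boundary.

    ```pascal
    program digestcheck;

    {$mode objfpc}

    const
      Width  = 300;     // assumption: small grid so the check runs quickly
      Height = 400;
      Scale  = 0.008;   // assumption: covers the same region as 0.0008 at 3000x4000
      Deep   = 100;

    type
      Float = Double;   // switch to Single to check the other variant

    var
      Pix: array[0..Height - 1, 0..Width - 1] of LongInt;

    // Same iteration scheme as the benchmark's build_fractal.
    procedure BuildFractal;
    var
      Color, dx, dy: Integer;
      cx, cy, zx, zy, zxt: Float;
    begin
      cy := (Height div 2) * Scale;
      for dy := Height - 1 downto 0 do
      begin
        cy := cy - Scale;
        cx := (Width div 2) * Scale;
        for dx := Width - 1 downto 0 do
        begin
          cx := cx - Scale;
          zx := cx;
          zy := cy;
          Color := 0;
          while zx * zx + zy * zy < 4 do
          begin
            zxt := zx * zx - zy * zy + cx;
            zy := 2 * zx * zy + cy;
            zx := zxt;
            Inc(Color);
            if Color > Deep then Break;
          end;
          Pix[dy, dx] := 4 * Color;
        end;
      end;
    end;

    // Sum of all pixel values; Int64 so the sum cannot overflow.
    function ComputeDigest: Int64;
    var
      dx, dy: Integer;
    begin
      Result := 0;
      for dy := 0 to Height - 1 do
        for dx := 0 to Width - 1 do
          Result := Result + Pix[dy, dx];
    end;

    begin
      BuildFractal;
      WriteLn('Digest: ', ComputeDigest);
    end.
    ```

    Because the digest depends on every pixel, the compiler cannot throw the computation away, and a miscompiled loop shows up as a different number.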

  6. #6

    "I've downloaded some early tests (100x100, double)."
    These were incorrect, use the new ones below (they save their fractal as a bmp in the root of drive c:\)

    And don't forget: the {$fputype sse2} directive can do wonders for your code speed!

    The full battery of benchmarks, including sources, is here (500K):
    http://217.70.20.10/_share/_004/fpc_...benchmarks.zip

    The benchmark source (fpc):

    [pascal]program fpctest1;

    {$apptype console}
    {$mode objfpc}
    {$asmmode intel}
    {$fputype sse2}

    uses
      SysUtils, Windows;

    const
      height = 4000;
      width = 3000;
      scale = 0.0008;
      deep = 100;

    type
      float = double;
      //float = single;

    var
      pix: array [0..height-1, 0..width-1] of longint;
      time: longint;
      f: file;
      fh: TBitmapFileHeader;
      bh: TBitmapInfoHeader;


    procedure build_fractal(scale: float; deep: longint);
    var
      color, dx, dy: Integer;
      cx, cy, zx, zy, zxt: float;
    begin
      cy := (height div 2) * scale;
      for dy := height - 1 downto 0 do
      begin
        cy := cy - scale;
        cx := (width div 2) * scale;
        for dx := width - 1 downto 0 do
        begin
          color := 0;
          // Calculate color: count iterations until the point escapes
          cx := cx - scale;
          zx := cx;
          zy := cy;
          while zx * zx + zy * zy < 4 do
          begin
            zxt := zx * zx - zy * zy + cx;
            zy := 2 * zx * zy + cy;
            zx := zxt;
            inc(color);
            if color > deep then break;
          end;
          pix[dy, dx] := 4 * color;
        end;
      end;
    end;

    var
      cw: word;

    begin
      cw := $033F; // x87 control word: all FPU exceptions masked, extended precision, round to nearest
      asm
        fldcw [cw]
      end;
      FillChar(pix, sizeof(pix), 0);
      time := GetTickCount();

      build_fractal(scale, deep);

      time := GetTickCount() - time;
      WriteLn(time);

      // Bitmap file header
      fh.bfType := WORD('B') + WORD('M') shl 8;
      fh.bfReserved1 := 0;
      fh.bfReserved2 := 0;
      fh.bfOffBits := SizeOf(TBitmapFileHeader) + SizeOf(TBitmapInfoHeader);
      fh.bfSize := fh.bfOffBits + SizeOf(pix); // total file size: both headers plus the pixel data

      // Bitmap info header: 32 bits per pixel, uncompressed
      FillChar(bh, SizeOf(TBitmapInfoHeader), 0);
      bh.biSize := SizeOf(TBitmapInfoHeader);
      bh.biWidth := width;
      bh.biHeight := height;
      bh.biPlanes := 1;
      bh.biBitCount := 32;

      // Save the image next to the root of drive c:, named after the executable
      Assign(f, 'c:\' + ChangeFileExt(ExtractFileName(ParamStr(0)), '.bmp'));
      Rewrite(f, 1);
      BlockWrite(f, fh, SizeOf(TBitmapFileHeader));
      BlockWrite(f, bh, SizeOf(TBitmapInfoHeader));
      BlockWrite(f, pix, width * height * 4);
      Close(f); // *
      ReadLn;
    end.[/pascal]

  7. #7

    Speeds with an AMD Athlon 4800 X2 (from your download, Chebs):
    Delphi - 3782
    Turbo Delphi, Single - 3609
    Turbo Delphi, Double - 5422
    MSVC Single SSE - 2156
    FPC Double - 6047
    FPC Double SSE2 - 2922 (!!!)
    FPC Single - 3234
    FPC Single SSE - 3515 (??)

    It looks to me like the best overall speeds go to the SSE2-optimized code, especially the Double code with SSE2. The boost for doubles is kind of surprising, but pleasing. Only about 750ms behind C++.

    My only question: if you enable SSE2 optimization for single/double, how do you guarantee that the program will still run on a system without SSE2? I'm thinking that you'd need a whole separate executable compiled without SSE2 optimizations.
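
    One possible approach, as a rough sketch only: a small launcher compiled WITHOUT {$fputype sse2} asks the CPU whether SSE2 is present and then picks which build to run. The HasSSE2 function and the program name are hypothetical names of mine, not something from the benchmark download. The sketch assumes a CPU new enough to have the CPUID instruction and uses FPC's Intel-syntax inline assembler on 32-bit x86; SSE2 is reported in bit 26 of EDX for CPUID leaf 1.

    ```pascal
    program sse2check;

    {$mode objfpc}
    {$asmmode intel}

    // Hypothetical helper: True when the CPU reports SSE2 support.
    function HasSSE2: Boolean;
    {$if defined(CPUI386)}
    var
      Features: LongWord;
    begin
      Features := 0;
      asm
        push ebx            { CPUID overwrites EBX, so preserve it }
        mov  eax, 1         { leaf 1: processor feature flags }
        cpuid
        pop  ebx
        mov  Features, edx
      end ['eax', 'ecx', 'edx'];
      // EDX bit 26 of CPUID leaf 1 is the SSE2 flag
      Result := (Features and (LongWord(1) shl 26)) <> 0;
    end;
    {$elseif defined(CPUX86_64)}
    begin
      // SSE2 is part of the x86-64 baseline
      Result := True;
    end;
    {$else}
    begin
      // Assumption: on any other target, do not use the SSE2 build
      Result := False;
    end;
    {$endif}

    begin
      if HasSSE2 then
        WriteLn('SSE2 available: the SSE2 build can be used')
      else
        WriteLn('No SSE2: fall back to the x87 build');
    end.
    ```

    The check itself must not be compiled with SSE2 code generation, otherwise it could crash on an old CPU before it gets a chance to report anything.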

  8. #8

    Quote Originally Posted by Brainer
    To be honest, I was using Single type from the very beginning.
    Not when you call functions that pass float parameters as "extended"...
    This is my game project - Top Down City:
    http://www.pascalgamedevelopment.com...y-Topic-Reboot

    My OpenAL audio wrapper with Intelligent Source Manager to use unlimited:
    http://www.pascalgamedevelopment.com...source+manager

  9. #9

    FPC Single - 3234
    FPC Single SSE - 3515 (??)
    It seems that AMD has a better general FPU than SSE1/2 unit. It was theorized that the Athlon's FPU is part of their own 3DNow! technology while SSE is a child of Intel, and thus an Athlon may run SSE-optimized code slower than code that uses the FPU only.

    The same situation held for my Sempron 1.6GHz (in fact just an older Athlon XP)

  10. #10

    Ah! Good to know, thanks Chebmaster.
