Results 1 to 10 of 39

Thread: OpenGL GLSL - Text rendering query

Threaded View

Previous Post Previous Post   Next Post Next Post
  1. #21
    PGD Staff / News Reporter phibermon's Avatar
    Join Date
    Sep 2009
    Location
    England
    Posts
    524
    glFlush and glFinish are the correct calls to make to handle the GL command buffer. The swap operation essentially calls these, schedules the swap on v-sync and then blocks until that's complete.

    GL commands might be buffered across a network, or to a display server (think remote GL on X11) or in the GL implementation you're using (mesa, nvidia drivers etc)

    To correctly handle a v-synced GL application you should be setting up your frame so you spend the minimum amount of time possible on the blocking swap call.

    You can use GL timers to get the 'server' side timing, calculate the command buffer overhead by comparing that against your 'client' side GL calls followed by flush and finish and then you know that you can call swap and meet the v-sync 'deadline' with the minimum amount of time blocked in the swap call.

    If you stream resources to graphics memory? such a setup is crucial for getting the maximum bandwidth without dropping frames.

    As far as legacy GL is concerned, or to give it the correct name : "immediate mode"? the clue is in the name. GL3+ (in the context of performance) is all about batching - minimizing the API calls and efficiently aligning data boundaries.

    It's not about fixed function or programmable hardware - it's about how the data is batched and sent to that hardware, regardless of the actual functionality.

    Display lists in older GL versions were a herald of things to come - they allow you to bunch together all your commands and allow the GL implementation to store the entire display list in whatever native format it pipes to the actual hardware, effectively storing the commands 'server' side so they don't have to be sent over the pipe every time. How they are handled is vendor specific, some display list implementations are no faster than immediate calls (some old ATI + intel drivers long ago were known for it)

    So yes - IMMEDIATE mode GL, that is, versions before 3.x which are referred to here as 'legacy' - will always be slower than the modern APIs - it's not about vendor implementation regardless of how optimal or not their code is.

    Other than display lists and any driver specific optimizations the vendor may of made - there's no 'batching' in legacy GL. This is why we have VBOs, instancing - all server side optimizations - this is why Vulkan and newer APIs are capable of better performance in various situations.

    The bottleneck is call overhead, 'instruction latency and bandwidth' - call it what you will. It's not what you move or what is done with it when it gets there - it's how you move it.

    GL2 is moving a ton of dirt with a teaspoon. GL3+ is moving a ton of dirt with a pneumatic digger. (Vulkan is moving a ton of dirt with a bespoke digger that has been built right next to the pile of dirt)

    If you're only moving a teaspoon of dirt? use a teaspoon, it's easier to use than a digger - but don't defend your teaspoon when you need to move a ton of dirt.

    Or something like that. Something something teaspoon. I think we can all agree : teaspoon.
    Last edited by phibermon; 23-01-2018 at 05:55 PM.
    When the moon hits your eye like a big pizza pie - that's an extinction level impact event.

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •