Quote Originally Posted by Chebmaster
I have not, but Open Arena takes up to 60% of my gaming time.
That they do, and I believe glBegin is there to stay forever, BUT with a nasty catch: these technologies are only kept good enough to keep old games running. There is no need to make them efficient.
I would put it slightly differently: yes, they (Nvidia, AMD, etc.) need to keep the old API around, and for that they most likely provide a fixed-function pipeline wrapper on top of the programmable pipeline. However, since they know their own architecture very well, they are very likely using the most efficient approach to implement such a wrapper. In contrast, when you jump directly to the programmable pipeline and try to build an engine of your own, it is much more difficult to reach even *the same* level of performance as the legacy fixed-function pipeline built as a wrapper, because you have to optimize for many architectures, vendors, drivers, OS versions, etc.
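
To make the contrast concrete, here is a minimal sketch (assuming a current GL context and, for the second function, a loader such as GLEW or glad providing the GL 3+ entry points; the function names are my own):

Code:
#include <GL/gl.h>

/* Fixed-function, immediate mode: the driver's compatibility layer
   batches these calls into buffers and internal shaders for you. */
void draw_triangle_ffp(void)
{
    glBegin(GL_TRIANGLES);
        glColor3f(1.0f, 0.0f, 0.0f); glVertex2f(-0.5f, -0.5f);
        glColor3f(0.0f, 1.0f, 0.0f); glVertex2f( 0.5f, -0.5f);
        glColor3f(0.0f, 0.0f, 1.0f); glVertex2f( 0.0f,  0.5f);
    glEnd();
}

/* Roughly what your own "modern" path must do instead; creating the
   shader program, VAO and VBO (omitted here) is on you, and doing
   that efficiently across vendors is the hard part. */
void draw_triangle_modern(GLuint program, GLuint vao)
{
    glUseProgram(program);
    glBindVertexArray(vao);
    glDrawArrays(GL_TRIANGLES, 0, 3);
}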

Granted, if you use the programmable pipeline properly, you can do much more than you could with the FFP: in a new framework that we're in the process of publishing, you can use compute shaders to produce a 3D volume surface, which is then processed by geometry/tessellation shaders; the whole beautiful 3D scene is built and rendered without sending a single vertex to the GPU! Nevertheless, that doesn't mean you can't start learning with the FFP; it is just as good as any other solution when all you need is to render something simple in 2D or 3D on the GPU.
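
Our framework itself is beyond the scope of this post, but the basic trick of rendering without uploading any vertex data can be shown in isolation. A minimal sketch, assuming GL 3.3+, a loader for the entry points, and the usual shader compilation boilerplate (omitted): the vertex shader derives its positions purely from gl_VertexID, so the draw call can reference an empty VAO.

Code:
#include <GL/gl.h>

/* Vertex shader with no vertex attributes: corners are computed
   from gl_VertexID, producing one triangle covering the screen. */
static const char *vs_src =
    "#version 330 core\n"
    "void main() {\n"
    "    vec2 p = vec2((gl_VertexID << 1) & 2, gl_VertexID & 2);\n"
    "    gl_Position = vec4(p * 2.0 - 1.0, 0.0, 1.0);\n"
    "}\n";

void draw_without_vertices(GLuint program, GLuint empty_vao)
{
    glUseProgram(program);
    glBindVertexArray(empty_vao);        /* VAO with no buffers attached */
    glDrawArrays(GL_TRIANGLES, 0, 3);    /* no vertex data was ever uploaded */
}

In the same spirit, a compute shader can write generated geometry into a buffer that a later glDrawArraysIndirect call consumes, still without the CPU ever touching a vertex.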

Besides, you never know: some day the whole of OpenGL might be implemented as a wrapper on top of the Vulkan API...

Quote Originally Posted by Chebmaster
Were you measuring overall frame time, or just the time taken by the function calls?
Either use external profiling tools (Visual Studio has a GPU profiler, and Nvidia/AMD provide their own tools), or at least measure the average frame latency (not the frame rate).
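
For instance, a rough sketch of measuring average frame latency (render_frame and swap_buffers are hypothetical stand-ins for your drawing code and the platform's swap call; POSIX clock_gettime is assumed):

Code:
#include <time.h>

extern void render_frame(void);   /* your draw calls */
extern void swap_buffers(void);   /* SwapBuffers / glXSwapBuffers / eglSwapBuffers */

/* Average frame latency in milliseconds over a number of frames,
   measured with vsync disabled. */
double average_frame_ms(int frames)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < frames; i++) {
        render_frame();
        swap_buffers();
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double total_ms = (t1.tv_sec - t0.tv_sec) * 1000.0
                    + (t1.tv_nsec - t0.tv_nsec) / 1.0e6;
    return total_ms / frames;
}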

Quote Originally Posted by Chebmaster
You see, I found experimentally that the driver sort of stores the commands you issue somewhere inside itself, and the real work (usually) begins only after you call SwapBuffers (e.g. wglSwapBuffers, glXSwapBuffers, eglSwapBuffers). So to see what is really going on, you have to measure how long the SwapBuffers call takes; with vSync disabled, of course.
This is driver-dependent, so drivers are free to choose whichever approach is more efficient, but commonly you can expect the work to begin immediately when you issue a draw call (the GPU runs in parallel with the CPU). SwapBuffers has to wait for all outstanding GPU work to finish before it can swap the surfaces, which is why it appears to take most of the time.
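
If you want to separate CPU submission time from actual GPU execution time, an OpenGL timer query can help. A minimal sketch, assuming GL 3.3+ (ARB_timer_query), a loader for the entry points, and a query object previously created with glGenQueries:

Code:
#include <stdio.h>
#include <GL/gl.h>

void measure_gpu_time(GLuint query)
{
    glBeginQuery(GL_TIME_ELAPSED, query);
    /* ... issue your draw calls here ... */
    glEndQuery(GL_TIME_ELAPSED);

    /* This read-back blocks until the GPU has finished the measured
       commands, so the result reflects real GPU time, not just the
       cost of queuing the calls into the driver. */
    GLuint64 ns = 0;
    glGetQueryObjectui64v(query, GL_QUERY_RESULT, &ns);
    printf("GPU time: %.3f ms\n", ns / 1.0e6);
}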