
Thread: Jengine - OpenGL 3.3/4.x engine & JUI progress

  1. #11
    PGD Staff / News Reporter phibermon's Avatar
    Join Date
    Sep 2009
    Location
    England
    Posts
    524
    Hmm, well, having given it some thought, the one feature I just can't justify losing is Uniform Buffer Objects (change the data once and it applies to all shaders that use it, as opposed to setting uniforms on each individual shader; think of them like a single instance of a customizable record that you can share across multiple shaders), which were introduced in GL 3.1, so I'll do some damage control and see how much work it would take to make 3.1 the lower dependency. You made an excellent point about Sandy Bridge: I wasn't aware that the on-chip graphics were 3.1/3.2, I assumed 2.1. For that reason I'll have to look into it; I can sit pretty knowing that cheap 4.x cards will soon dominate, but that on-die Intel monstrosity is going to be the only option a lot of laptop users have for the next few years.
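
    (For anyone who hasn't used them, this is roughly all the sharing amounts to; a minimal Free Pascal sketch, assuming a GL 3.1+ context and a header unit such as dglOpenGL, with the 'Camera' block and the procedure names made up for the example.)

    Code:
    uses dglOpenGL;

    const
      CAMERA_BINDING = 0; // one binding point shared by every program

    var
      CameraUBO: GLuint;
      ViewProj: array[0..15] of GLfloat; // std140 layout: a single column-major mat4

    procedure CreateCameraUBO;
    begin
      glGenBuffers(1, @CameraUBO);
      glBindBuffer(GL_UNIFORM_BUFFER, CameraUBO);
      glBufferData(GL_UNIFORM_BUFFER, SizeOf(ViewProj), nil, GL_DYNAMIC_DRAW);
      glBindBufferBase(GL_UNIFORM_BUFFER, CAMERA_BINDING, CameraUBO); // attach once
    end;

    // Called once per shader program; the GLSL side declares:
    //   layout(std140) uniform Camera { mat4 ViewProj; };
    procedure AttachCameraBlock(Prog: GLuint);
    begin
      glUniformBlockBinding(Prog, glGetUniformBlockIndex(Prog, PAnsiChar('Camera')), CAMERA_BINDING);
    end;

    // One upload per frame and every attached program sees the new matrix.
    procedure UpdateCamera;
    begin
      glBindBuffer(GL_UNIFORM_BUFFER, CameraUBO);
      glBufferSubData(GL_UNIFORM_BUFFER, 0, SizeOf(ViewProj), @ViewProj);
    end;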

    Carver: I wouldn't want to offend those pursuing ES, as their route to GL3/4 will be a lot easier than for those coding in immediate-mode 2.x, but yes, I'd agree with that statement. It's not just the performance gains; it's the usability too.

    My terrain engine was nearly effortless with GL4.0.

    LOD, low-level culling and so on are all done on the GPU, and as a result they can sit exactly where they need to for the simplest approach. Older CLOD systems (ROAM etc.) are far more complex, doing all they can to minimize the bottleneck of constantly transferring vertices from the system to the card. That's just not an issue with tessellation; you just send a sparse patch mesh and tessellation plus displacement does the rest, with more or less free seamless welding of patch edges.
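
    (The host side of that really is tiny; a rough Free Pascal sketch assuming GL 4.0+, quad patches and a VAO that already holds the sparse patch grid. The LOD itself lives in the tessellation control shader, which sets gl_TessLevelOuter/Inner from camera distance, so nothing below needs to change per frame.)

    Code:
    // PatchVAO / PatchVertexCount are placeholders for however the grid is stored.
    procedure DrawTerrainPatches(PatchVAO: GLuint; PatchVertexCount: GLsizei);
    begin
      glPatchParameteri(GL_PATCH_VERTICES, 4);       // 4 control points per quad patch
      glBindVertexArray(PatchVAO);
      glDrawArrays(GL_PATCHES, 0, PatchVertexCount); // tessellator + displacement do the rest
    end;
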
    Last edited by phibermon; 14-06-2011 at 02:21 PM.
    When the moon hits your eye like a big pizza pie - that's an extinction level impact event.

  2. #12
    PGD Staff code_glitch's Avatar
    Join Date
    Oct 2009
    Location
    UK (England, the bigger bit)
    Posts
    933
    Blog Entries
    45
    Or you could have a bump at Intel's market share with another strategy: do it all in 4.x, add a 'crapo' mode with a very basic, quickly implemented set of shaders for 2.x/3.x, and make a really good game. That way either Intel gets some serious OpenGL oomph, or ATI/NVidia get a market boost. Either way everybody wins.
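
    (Picking between the two at startup would be something like this; a rough sketch with made-up path names. GL_MAJOR_VERSION is only queryable on GL 3.0+ contexts, so on 2.x you'd parse glGetString(GL_VERSION) instead.)

    Code:
    type
      TRenderPath = (rpBasic, rpFull4x); // 'crapo' mode vs the full 4.x path

    function PickRenderPath: TRenderPath;
    var
      Major: GLint;
    begin
      Major := 0;
      glGetIntegerv(GL_MAJOR_VERSION, @Major); // leaves Major at 0 on a 2.x context
      if Major >= 4 then
        Result := rpFull4x
      else
        Result := rpBasic;
    end;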

    But yes, I was disappointed when Sandy Bridge (the crème de la crème from Intel) came out with only 3.1/3.2 support, when ATI/NVidia cards have had that since... well, the dawn of time. OK, not really, but for a while now.

    Mind you, Sandy Bridge is the only GMA chip that can render something fast enough for it to even be visible to humans. (Sorry, GMA fans, whoever you may be.)

    Anyway, good luck and those features do indeed sound tempting.
    I once tried to change the world. But they wouldn't give me the source code. Damned evil cunning.

  3. #13
    PGD Staff / News Reporter phibermon's Avatar
    Join Date
    Sep 2009
    Location
    England
    Posts
    524
    Hehe, you might have something there. I've been looking at various CLOD techniques that could be used below GL 4.0. The only ones I'd be happy with from a technical standpoint are either a GPU-optimized geometry clipmapping:

    http://research.microsoft.com/~hoppe/gpugcm.pdf

    or this :

    http://vertexasylum.com/2010/07/11/o...ndering-paper/

    The latter looks surprisingly similar in wireframe mode to my GL 4 technique and, by my rough estimates, isn't far off in FPS either. However, it requires extensive pre-processing of the terrain dataset and does *far* more work on the CPU (which is not quite a fair comparison, since the techniques I've employed very nearly don't use the CPU at all).

    And to top it all off, it's complicated to implement, although a port of the provided source would be possible given enough Direct3D research.

    So to support older cards for the terrain, I'll simply brute-force render (with a bit of culling) and drop both the poly count and the draw distance (as I'm doing now) until the FPS matches.
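
    (That fallback is essentially just this; a throwaway Free Pascal sketch where TTerrainChunk and DrawChunk stand in for whatever the engine actually uses.)

    Code:
    type
      TVec3 = record X, Y, Z: Single; end;
      TTerrainChunk = record
        Center: TVec3;
        // mesh / VBO handles omitted
      end;

    procedure DrawChunk(const C: TTerrainChunk);
    begin
      // bind the chunk's VAO and issue a plain indexed draw here (omitted)
    end;

    function DistSq(const A, B: TVec3): Single;
    begin
      Result := Sqr(A.X - B.X) + Sqr(A.Y - B.Y) + Sqr(A.Z - B.Z);
    end;

    // Skip anything beyond the draw distance, render the rest at reduced detail.
    procedure DrawTerrainFallback(const Chunks: array of TTerrainChunk;
                                  const CamPos: TVec3; MaxDist: Single);
    var
      i: Integer;
    begin
      for i := 0 to High(Chunks) do
        if DistSq(Chunks[i].Center, CamPos) <= MaxDist * MaxDist then
          DrawChunk(Chunks[i]);
    end;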

    If I did implement an alternate CLOD for older cards, I would most likely choose geometry clipmapping, as I can use the same dataset I use now without the pre-processing the preferred technique needs. (It really is very impressive though; check it out if you have the time.)
    Last edited by phibermon; 14-06-2011 at 04:47 PM.
    When the moon hits your eye like a big pizza pie - that's an extinction level impact event.

  4. #14
    PGD Staff code_glitch's Avatar
    Join Date
    Oct 2009
    Location
    UK (England, the bigger bit)
    Posts
    933
    Blog Entries
    45
    Hmm... although I'm totally for GPGPU and the GPU over the CPU (just look at those giga/teraflops of an advantage), it raises one problem I've had to contend with a few times, as have many gamers: a top-spec CPU is no longer the bottleneck. You can still make a decent gaming rig out of a C2D since it's all on the GPU; that way you save £100 or so on your CPU, just make sure you buy the extra HD5990 to make up for it.

    It comes down to the old CPU time vs RAM trade-off again, doesn't it? You can put up a loading screen, compute most of it beforehand and store it in RAM at the expense of many MBs (not too much of an issue for the many people who now have >2GB). Or you can compute it on the fly and save RAM (as in the XP days of 512MB). On machines with a weak GPU it's a case of: RAM and wait up front, or CPU and wait a little bit the whole time. Heck, I'm running into this with my libs now, indexing lists for example: just how much do you index? What's a good CPU:RAM usage ratio? Why couldn't it ever be simple, right?
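
    (The whole dilemma in a dozen lines; a toy Free Pascal sketch where ExpensiveHeightAt stands in for whatever the real work is.)

    Code:
    var
      HeightCache: array of Single; // costs RAM, filled behind a loading screen

    function ExpensiveHeightAt(i: Integer): Single;
    begin
      Result := i * 0.01; // placeholder for the real (slow) computation
    end;

    // Option A: pay the CPU cost once up front, every lookup afterwards is O(1).
    procedure PrecomputeHeights(Count: Integer);
    var
      i: Integer;
    begin
      SetLength(HeightCache, Count);
      for i := 0 to Count - 1 do
        HeightCache[i] := ExpensiveHeightAt(i);
    end;

    // Option B: no RAM cost, pay the CPU cost on every single call instead.
    function HeightOnTheFly(i: Integer): Single;
    begin
      Result := ExpensiveHeightAt(i);
    end;
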
    I once tried to change the world. But they wouldn't give me the source code. Damned evil cunning.

  5. #15
    PGD Staff / News Reporter phibermon's Avatar
    Join Date
    Sep 2009
    Location
    England
    Posts
    524
    Again, good point. I suppose all these different techniques are really just making different compromises between CPU/GPU performance and available memory/bandwidth.

    In my case I'm lucky enough to own an i7 950, which is surprisingly fast (after moving from my previous fastest, an Atom 330). The GPU I have is an NVidia GTX 460, which is towards the lower end of the budget spectrum and should represent the average card within, say, a year (in terms of performance and functionality).

    My general goal was to implement GPU-heavy techniques until a bunch of average entries in the scene graph brings the framerate below 200 FPS.

    Then I'll focus on isolating the most CPU-heavy techniques in separate threads; primarily that'll be physics, path-finding and steering.
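
    (The general shape of that with the RTL's TThread, for anyone curious; a minimal sketch where StepPhysics is a stand-in for the real update, and real code would sync with the render loop properly rather than just sleeping.)

    Code:
    uses Classes, SysUtils;

    procedure StepPhysics(Dt: Single);
    begin
      // advance rigid bodies, path-finding jobs, steering, etc. (omitted)
    end;

    type
      TPhysicsThread = class(TThread)
      protected
        procedure Execute; override;
      end;

    procedure TPhysicsThread.Execute;
    begin
      while not Terminated do
      begin
        StepPhysics(1 / 60); // fixed-step update
        Sleep(1);            // yield to the rest of the system
      end;
    end;

    // In the engine:
    //   PhysicsThread := TPhysicsThread.Create(False);  // start immediately
    //   ... render frames while physics runs in parallel ...
    //   PhysicsThread.Terminate; PhysicsThread.WaitFor; PhysicsThread.Free;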

    I've not yet examined OpenCL/CUDA and probably won't. There's enough C syntax in the shaders without polluting the rest of the engine/framework. And again, as you stated, OpenCL/CUDA is yet another compromise, taking compute power away from the shaders. While that's an excellent choice in dual-GPU setups, Mafia 2, for instance, is not as fast as I'd hoped when PhysX is turned up to max (PhysX is just a CUDA program underneath). But GTA 4, while not quite at the same level of physical interaction, is silky smooth with its CPU-side physics and arguably higher poly counts, and that's with the incredible Euphoria engine doing all its funky inverse-kinematics things as well.

    So you've hit the nail on the head there: a good engine is a good balance between the various bottlenecks on an average system. A great engine can use different techniques to balance the bottlenecks across a wide range of hardware (and I think we've seen a big shift in recent years from CPU-side to GPU-side bottlenecks in terms of what is demanded of a game).
    When the moon hits your eye like a big pizza pie - that's an extinction level impact event.

  6. #16
    PGD Staff code_glitch's Avatar
    Join Date
    Oct 2009
    Location
    UK (England, the bigger bit)
    Posts
    933
    Blog Entries
    45
    Hem... I believe Intel set the price according to the
        i7 950 which is surprisingly fast
    part... OK, so it's one of THE fastest processors around (I can't afford one, so the gaming rig build will most likely be an overclocked Phenom II 955 BE) and the GTX 460 might be cheaper than at launch, but let me say this: it still trumps a good portion of the market, and I'd be worried if that's the average card in a year; my 4330 really would be up the wall by then...

    Now OpenCL is interesting in itself. The way I see it, it's a case of GPGPU: do we want that? Anyone? Ah yes, you sir, Mr Higher-End ATI and NVidia. Any others? No? Oh well, a portion of the market can enjoy it. I'm totally for making use of the teraflops of GPU power many people now have (or at the very least the hundreds of megaflops), but only if it fits into every language, is nicer than the first iterations of OpenGL and works on everything (e.g. no NVidia bugs like the early iterations with EXTs and FBOs). But would you like to learn a whole new ideology so that 20% of the market can get better performance than the best available (which they're already getting), and then write it all again the normal way as well? Probably not.

    Besides that: since when has CPU power been the limit in games? Given that many apt gaming rigs still sport C2Ds, and we talk far more about GPUs for games than CPUs, GPGPU is a problem: take load from where there's not enough of it and move it to an area that's already under too much load? Logical, isn't it? How come ATI cards game just fine without NVidia PhysX? For the same reason you mentioned: GPU bottlenecks.

    The approach you're taking is nice indeed, though; one thing at a time and all that, and I'll definitely be taking a look at the code, perhaps even running it (once I get a system that can). Have to say, though, the i7 is future-proofing gone a long way for CPU power. What I'd like to see, though, is a different matter: OpenGCL, an open generalised computing language. That is, a language/set of headers that pools together ALL the resources of a system (audio processor, GPU and CPU) and runs everything on whatever is best suited / has the most available computing power. That way it's a case of: your system has one speed rating, generalised power. No RAM, CPU, GPU etc. Just one figure, one language, and one bottleneck: overall power. Now that would really make my day.
    I once tried to change the world. But they wouldn't give me the source code. Damned evil cunning.
