Multi-core/Multithreading your engine...



savage
12-02-2008, 04:44 PM
So how many people here have switched their engines to take advantage of the extra cores that people have in their machines, and also multiple threads?

I'm looking for a working example of a multi-core/multi-threaded game.

Any links, tips, etc. welcome.

JSoftware
12-02-2008, 04:58 PM
The multicore bit doesn't make that much sense to me. Multithreaded apps already have the possibility of running simultaneously, so there shouldn't be any difference, except that you have to make absolutely sure that you lock shared variables correctly.

savage
12-02-2008, 05:31 PM
Multicore to me means running on more than one processor.
So you could have 3 threads on processor 1 and 3 threads on a second processor. Multicore processors have more than 1 processor on the die. At least that is my understanding of it all.

JSoftware
12-02-2008, 06:40 PM
savage wrote: "Multicore to me means running on more than one processor.
So you could have 3 threads on processor 1 and 3 threads on a second processor. Multicore processors have more than 1 processor on the die. At least that is my understanding of it all."

Well, it still doesn't make any difference to me. If your game is safely multithreaded, then it should automatically be able to run on more processors simultaneously.

The interesting part to me is what the optimal number of threads is. From a slideshow by a Microsoft game developer, I've read that the best setup seems to be a main game logic + render thread, one thread for network communication, one thread for file operations and decompression, and an optional one for physics. According to him, this should suit the widest range of machines (PCs, Xboxes, etc.) most efficiently.

technomage
12-02-2008, 07:26 PM
Dom

The InfinitEngine is in the process of getting multi-core support. I've found that Free Pascal does not support thread affinity "out of the box", but then again Delphi doesn't either. But I'm testing the API for this under Windows, Mac and Linux.

The main reason for adding support for multiple cores is that on most operating systems any threads created in the main process will only run on the CPU on which the process is running. So if you want to spread the load across all CPUs (you might have 4), you have to do some fiddling yourself.

Windows does seem to create threads across CPUs already, but historically it has not been very good at balancing them.

To even things out, the InfinitEngine will have its own Thread Manager, which will just keep track of the number of threads the app has on each CPU and balance them out so each CPU has the same number of threads. Hopefully this will allow the engine to take advantage of multiple CPUs but still work fine on systems with one CPU.
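A minimal sketch of the bookkeeping such a thread manager might do, assuming the simplest possible policy (always pick the CPU with the fewest threads). The class name ThreadBalancer and its methods are hypothetical and are not taken from the InfinitEngine; the sketch only tracks counts and does not pin anything to a CPU.

```python
class ThreadBalancer:
    """Hypothetical helper: tracks how many threads are assigned to each CPU."""

    def __init__(self, cpu_count):
        if cpu_count < 1:
            raise ValueError("cpu_count must be at least 1")
        self.counts = [0] * cpu_count

    def assign(self):
        """Pick the CPU with the fewest threads and record one more thread on it."""
        cpu = min(range(len(self.counts)), key=self.counts.__getitem__)
        self.counts[cpu] += 1
        return cpu

    def release(self, cpu):
        """Record that a thread assigned to `cpu` has finished."""
        if self.counts[cpu] == 0:
            raise ValueError("no threads recorded on this CPU")
        self.counts[cpu] -= 1


# Not thread-safe by itself: call it from the thread that creates the workers.
balancer = ThreadBalancer(cpu_count=4)
assignments = [balancer.assign() for _ in range(6)]
```

With a cpu_count of 1 every thread simply lands on CPU 0, which matches the goal of still working on single-CPU systems.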

Dean

cronodragon
12-02-2008, 10:15 PM
Although I have used AMD processors for many years, I bought an Intel Dual Core 2 for my last computer upgrade, and I have had plenty of problems with it :( When I use certain applications, they get slower and slower over time. I can restart and power off the computer, and the application remains slow. With Sysinternals' Process Explorer I can see how the application's process load climbs to the top. I have downloaded the latest drivers, re-flashed the BIOS, and tested the processor with the Intel diagnostic tools, and I still can't find the damn problem, which I think is the processor itself, that does some sort of buggy thread balancing.

Anyway, I have been able to deal with those nasty problems, and I was able to make my engine multi-threaded. I reserve a thread for the game loop with SetAffinity, but I don't have any demos yet. I have future plans for a multithreading virtual machine for the engine's programming language. :D

By the way, I have had success when synchronizing the VCL, but no luck with the LCL. Does anyone have the same problem? :?

JSoftware
13-02-2008, 12:12 AM
cronodragon wrote: "..., which I think is the processor itself, that does some sort of buggy thread balancing."

That, good sir, is the job of the operating system :P

I haven't actually worked with affinity masks before. Wouldn't you risk a lot of cache misses? I don't know how it actually works on Windows.

arthurprs
13-02-2008, 01:22 AM
single thread :(

Mirage
13-02-2008, 08:32 AM
Today's engines should be multithreaded. There is no choice.
While designing my engine I kept parallelism in mind, and soon I'll implement it without too much effort.

NecroDOME
13-02-2008, 09:48 AM
My whole engine is currently multi-threaded :) (and you can also run it on one single thread as an extra option)
But single core, I think.

A question: doesn't multi-threaded automatically mean multi-core (multi-processor) support?

How can I say, for example, that thread 1 needs to run on core 1 and thread 2 on core 2? This would be a nice feature.

jasonf
13-02-2008, 10:44 AM
My CBCFoundation is single threaded. But it's a 7yr old engine now.

I agree that new stuff should be multi-threaded to take advantage of advances in CPU design without massive code changes. But it's not a simple case of just using threads.. it's a mindset shift in design and debugging techniques.

Luuk van Venrooij
13-02-2008, 11:38 AM
I have been experimenting with multi-threading for the Genesis Device engine for a while now. My current setup has threads for rendering, physics, input and sound.

I haven't had much of a performance increase though. With a single thread, 95% of my CPU time went to the renderer. This is basically the same in my multi-threaded setup, since input and physics have to wait on the render thread and vice versa. I'll probably split the renderer into different threads as well: updating animations, calculating scene visibility and finally, of course, the render calls.

What I have discovered is that XP and Vista do a pretty good job of distributing the threads across my 2 cores. XP puts 2 threads on each core; Vista puts the render thread on one core and the remaining 3 on the other.

Also, I have found an interesting article on how Valve made their engine multi-threaded.

http://techreport.com/articles.x/11237

cronodragon
13-02-2008, 02:48 PM
NecroDOME wrote: "How can I say for example that thread 1 needs to run on core 1 and thread 2 on core 2? this would be a nice feature."

That's what SetAffinity does :D

http://msdn2.microsoft.com/en-us/library/ms686247(VS.85).aspx
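For comparison, a rough sketch of the same idea outside Windows, assuming a Linux-like system where Python exposes os.sched_setaffinity. The helper name pin_current_thread is hypothetical. On platforms without that call (Windows and macOS among them) the function just reports failure, and on Windows you would use the affinity API linked above instead.

```python
import os
import threading


def pin_current_thread(cpu):
    """Try to restrict the calling thread to one CPU; return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # call not available on this platform
    if cpu not in os.sched_getaffinity(0):
        return False  # this CPU is not in the set we are allowed to use
    try:
        # On Linux, pid 0 refers to the calling thread.
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return False
    return True


results = {}


def game_loop():
    # Reserve CPU 0 for the game loop thread, if the platform allows it.
    results["pinned"] = pin_current_thread(0)


worker = threading.Thread(target=game_loop)
worker.start()
worker.join()
```

Whether pinning actually helps is a separate question; as mentioned earlier in the thread, the operating system's scheduler normally does the balancing by itself.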

LP
13-02-2008, 03:24 PM
In our case, Wicked Defense (including the upcoming v1.5 release) and another project we are developing are all single-threaded.

These games aren't CPU intensive, so they can run on a P3 1.12 GHz with an average of 50% CPU usage. The graphics, however, are another question. Do we have multi-core GPUs already (that aren't SLIs)? :D

Another challenge is the Direct3D DrawPrimitive overhead, which even with instancing increases CPU usage and reduces performance, and is not solved by multi-core CPUs. We have yet to see if DX10 reduces this problem at least partially.

cronodragon
13-02-2008, 03:32 PM
Luuk van Venrooij wrote: "Also I have found an interesting article on how Valve made their engine multi-threaded.

http://techreport.com/articles.x/11237"

Very interesting article. I remember reading about lock-free algorithms in one of the Game Programming Gems books, and now I'm reading a little more:

http://en.wikipedia.org/wiki/Lock-free_algorithms#Implementation

Is there a way to access those atomic primitives in Delphi, or even better in FPC, in a multiplatform manner? :?

Mirage
13-02-2008, 04:47 PM
cronodragon: EnterCriticalSection()/TryEnterCriticalSection()?

cronodragon
13-02-2008, 05:00 PM
Mirage wrote: "cronodragon: EnterCriticalSection()/TryEnterCriticalSection()?"

"Waits for ownership of the specified critical section object. The function returns when the calling thread is granted ownership.... The threads of a single process can use a critical section object for mutual-exclusion synchronization"

I'm not sure that is the same as this:

"...atomic primitives that the hardware must provide..."

Critical sections are implemented by the system, while atomic primitives are implemented in the hardware, right? :? It seems they have a similar effect, but the idea is that lock-free is only ONE atomic operation, which means speed.

JSoftware
13-02-2008, 05:20 PM
I've always understood critical sections to be an equivalent to semaphores. I haven't understood the difference between the techniques "atomic operations" and semaphores. Most semaphores are implemented using bus locking anyways, which is what atomic operations use too.

Mirage
13-02-2008, 05:56 PM
cronodragon quoted: "Waits for ownership of the specified critical section object. The function returns when the calling thread is granted ownership.... The threads of a single process can use a critical section object for mutual-exclusion synchronization"

This is about EnterCriticalSection, right? See the TryEnterCriticalSection description. AFAIK it's implemented using the system's InterlockedCompareExchange function, which is implemented via CMPXCHG.
The FastMM sources also use CMPXCHG assembler instructions.
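To make the "wait for ownership" versus "try and return immediately" difference concrete, here is a small sketch using Python's threading.Lock as a stand-in for a critical section. It is only an analogy: it is not the Windows API, and it does not show a hardware atomic such as CMPXCHG. The helper name try_update is hypothetical.

```python
import threading

lock = threading.Lock()  # plays the role of the critical section


def try_update(shared, value):
    """Non-blocking attempt: update and return True, or return False if the lock is busy."""
    if not lock.acquire(blocking=False):
        return False  # someone else owns the lock; the caller can go and do other work
    try:
        shared.append(value)
        return True
    finally:
        lock.release()


shared = []
first = try_update(shared, "a")    # lock is free, so this succeeds

lock.acquire()                     # simulate another owner holding the lock
second = try_update(shared, "b")   # returns immediately instead of waiting
lock.release()
```

A blocking lock.acquire() in place of the non-blocking call would behave like the "waits for ownership" description quoted above.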

savage
13-02-2008, 06:50 PM
What about an Event Based Asynchronous Pattern - http://msdn2.microsoft.com/en-us/library/ms228974.aspx?
Sounds like it could be useful, but I'm not sure how practical it would be in a game.

AthenaOfDelphi
13-02-2008, 11:53 PM
Multi-core and multi-threaded are distinctly different, and multi-core requires more thought and more careful synchronisation/resource protection.

The reason... concurrency, and here's why...



Single Core

Thread 1 XXXX----XXX--------XX
Thread 2 ----XXXX---XXXXXXXX--



Using my really great artwork, what we have above is a simple diagram to illustrate the scheduling of two threads in your app on a single core machine.

As you can see, processing switches between the threads, meaning that at times they will not be executing.


Multi Core

Thread 1 XXXXXXXXXXXXXXXXXXXXXXX
Thread 2 XXXXXXXXXXXXXXXXXXXXXXX



This diagram shows the scheduling of the same two threads, but this time running on a dual core machine. As you can see, both threads are being executed at the same time because thread 1 is on core 1 and thread 2 is on core 2.

Whilst the essence of protecting for multi-core is similar to protecting for multi-threaded, you have to be more mindful of such wonders as deadlock and other timing-induced weirdness, such as AVs (access violations) on variables you know you've initialised.
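As a small illustration of the kind of protection meant here, the sketch below has two threads incrementing one shared counter, with a lock around each read-modify-write so that no update is lost. The names SharedCounter and add_many are made up for the example.

```python
import threading


class SharedCounter:
    """A counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1  # read-modify-write happens while holding the lock


def add_many(counter, times):
    for _ in range(times):
        counter.increment()


counter = SharedCounter()
threads = [threading.Thread(target=add_many, args=(counter, 10_000)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads that really do run at the same time can both read the old value and write back the same result, which is exactly the sort of timing-induced weirdness that only shows up now and then.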

If anyone is new to multithreading and wants some pointers, I did write an article (Multi-threading Part 1 (http://www.pascalgamedevelopment.com/viewarticle.php?a=71&p=1#article)) about the basics of multi-threading.

savage
14-02-2008, 10:44 AM
I just stumbled on this game programming course....
http://users.ece.gatech.edu/~lanterma/mpg/

The great thing is that it has a lot of video material that may be useful or give others ideas. The multi-core stuff is towards the bottom.

savage
14-02-2008, 01:37 PM
I quite like the following separation



----------------------------------
| Core | Thread | Game Process   |
----------------------------------
| 0    | 0      | Game Update    |
|      | 1      | I/O            |
----------------------------------
| 1    | 0      | Game Rendering |
|      | 1      |                |
----------------------------------
| 2    | 0      | Audio          |
|      | 1      |                |
----------------------------------


The idea being that there would be 2 back buffers, BB0 and BB1.
1. The Game Update thread would write to BB0.
2. Once complete, the Game Rendering thread would read BB0 and output to the screen.
3. While 2 is going on, the Game Update thread would start writing to BB1.

So GU (Game Update) is only ever writing to one of the back buffers, while GR (Game Rendering) is only ever reading from the other. This then alleviates any read/write conflicts.
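The steps above can be sketched roughly as follows, assuming two queues are used to hand buffer indices back and forth so that GU never writes a buffer GR is still reading. The names free, ready and rendered, and the fake per-frame state, are assumptions made for the example.

```python
import queue
import threading

FRAMES = 5
buffers = [[], []]        # BB0 and BB1
free = queue.Queue()      # indices of buffers the update thread may write to
ready = queue.Queue()     # indices of buffers the render thread may read
for index in (0, 1):
    free.put(index)

rendered = []             # stands in for "output to the screen"


def game_update():
    for frame in range(FRAMES):
        index = free.get()                    # wait for a buffer nobody is reading
        buffers[index] = [frame, frame * 2]   # hypothetical game state for this frame
        ready.put(index)                      # hand the finished buffer to the renderer
    ready.put(None)                           # sentinel: no more frames


def game_render():
    while True:
        index = ready.get()
        if index is None:
            break
        rendered.append(list(buffers[index]))  # "draw" by copying what was read
        free.put(index)                        # give the buffer back to the updater


gu = threading.Thread(target=game_update)
gr = threading.Thread(target=game_render)
gu.start()
gr.start()
gu.join()
gr.join()
```

The hand-off is what keeps the scheme safe: without some signal that a buffer is finished or free again, GU could lap GR and overwrite a buffer that is still being drawn.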

Obviously GU should be analysed further to see if there is any scope for more parallelism in terms of collision detection, physics or AI that could be split up.

Now, how to actually get this working, and cross-platform, is another matter.

Anyone see any problems with the above?

cronodragon
14-02-2008, 02:53 PM
savage wrote: "Anyone see any problems with the above?"

I would let I/O run on any available core that is not reserved for time-critical engine operations, since I/O is the most time-consuming operation.

JSoftware
14-02-2008, 03:23 PM
With OpenGL you can only reference the window that was opened in the same thread.

This causes a few complications with loading and updating stuff in the game update thread.

NecroDOME
14-02-2008, 04:27 PM
My code works roughly as follows:

One thread for rendering and updating the controls (called the main thread) and several threads for updating the game logic/physics.

So What I basically do in the game loop is:
Main thread - Update input
Main thread - Start worker threads
Worker thread - Update some stuff 1
Worker thread - Update some stuff 2
Worker thread - Update some stuff 3
Main thread - Wait till all worker threads are finished (synchronize)
Main thread - Render
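The loop above could be sketched like this, assuming each worker touches its own part of the game state so no extra locking is needed. The function names and the state dictionary are invented for the example, and in CPython the workers will not run Python code truly in parallel, so this only shows the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def update_physics(state):
    state["physics"] += 1


def update_ai(state):
    state["ai"] += 1


def update_particles(state):
    state["particles"] += 1


def run_frame(pool, state):
    state["input"] += 1  # main thread: update input
    jobs = (update_physics, update_ai, update_particles)
    tasks = [pool.submit(job, state) for job in jobs]  # main thread: start workers
    for task in tasks:
        task.result()  # main thread: wait until all workers are finished
    # main thread: "render" a snapshot of the fully updated state
    state["rendered"].append((state["physics"], state["ai"], state["particles"]))


state = {"input": 0, "physics": 0, "ai": 0, "particles": 0, "rendered": []}
with ThreadPoolExecutor(max_workers=3) as pool:
    for _ in range(3):
        run_frame(pool, state)
```

Because rendering only starts after every worker has finished, each rendered frame sees a consistent state.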

Maybe I can update the input while I'm rendering. I don't think it's a good idea to update the game logic/physics while rendering. This may cause (if you have a low frame rate) some things to update too late or too early. For example: in a racing game, when you are driving behind a car, it may look like the car is "jumpy".
Maybe this can be done for non-important stuff like particles.

NecroDOME
14-02-2008, 04:31 PM
savage wrote: "What about an Event Based Asynchronous Pattern - http://msdn2.microsoft.com/en-us/library/ms228974.aspx?
Sounds like it could be useful, but not sure how practical it would be in game."

Where I work, they are currently implementing this in an application. It can be used in games, but I would say: don't do it with small pieces of code. Instead, for example, take particles, physics and game logic as separate updates. (This would make 3 threads.)

savage
14-02-2008, 06:02 PM
These links talk about multi-core programming in relation to the Xbox 360, but should be applicable to other platforms...

Coding For Multiple Cores on Xbox 360 and Microsoft Windows
http://msdn2.microsoft.com/en-us/library/bb204834(VS.85,printer).aspx

Lockless Programming Considerations for Xbox 360 and Microsoft Windows
http://msdn2.microsoft.com/en-us/library/bb310595(VS.85,printer).aspx