Updated the file twice, second time after a little code cleaning.
Nice demo! A few notes:
- too many ops per star; some reduction helps, for example:
Code:
d := G_FORCE/(d*d*d);
- division is one of the slowest ops you can do (just this one-line change makes it 20% faster)
- no point in using more than 2 calculation threads on a dual-core; more threads than cores actually makes it slower
- Application.ProcessMessages is an invitation for reentrancy problems; at the very least use an interlocked decrement on the thread counter (joining the threads would be better; Sleep really isn't a synchronization primitive)
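To illustrate the division point, here is a minimal Free Pascal sketch comparing a two-division form with the single-division form suggested above. The value of G_FORCE and the test distance are made up for illustration; the demo's real constant isn't shown in this thread.

```pascal
program DivisionDemo;
{$mode objfpc}

const
  G_FORCE = 0.5; // hypothetical value, not taken from the demo

var
  d, twoDivs, oneDiv: Double;
begin
  d := 3.0; // hypothetical distance between two stars

  // Two divisions per star pair.
  twoDivs := G_FORCE / (d * d);
  twoDivs := twoDivs / d;

  // One division and one extra multiplication.
  oneDiv := G_FORCE / (d * d * d);

  WriteLn('two divisions: ', twoDivs:0:10);
  WriteLn('one division:  ', oneDiv:0:10);

  // Both forms compute G_FORCE / d^3; they agree up to rounding.
  if Abs(twoDivs - oneDiv) > 1e-12 then
    Halt(1);
end.
```

How much this saves depends on the CPU and on what else the inner loop does, so measure it in the real loop rather than trusting the sketch.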
my projects https://github.com/dpethes
Nice catch (fixed it in my version).
Code:
d := G_FORCE/(d*d*d);
It means the same thing as:
Code:
f:=G_FORCE/(d*d); d:=f/d;
Also, there is no Application.ProcessMessages in that version (you can remove "forms" from the gameunit uses list). It was replaced by
Code:
while Threads>1 do sleep(1);
in the physics main thread. It still doesn't hang the application, because it runs in a separate thread. The spikes in framerate are simply due to heavy CPU utilization, I think. Trying this seems to work too, but I feel it actually consumes more CPU with a loop like that:
Code:
while Threads>1 do ; // <- Don't do this
So with this I see an increase in physics loop time.
It should be OK to have 4 threads even on a dual-core; remember that each thread gets a different workload, meaning that with only 2 threads one core could sit idle for longer. At the same time, people with quad-cores can test it too. It might even be faster with 8 threads.
It also works better if I comment out SetFrameSkipping(false); it shows more realistic framerate numbers (~40) too, when it's not pushed to draw as much as possible.
Last edited by User137; 14-06-2013 at 06:12 PM.
An empty while loop is worse, for sure: use TThread.WaitFor. You probably want something like this in your TPhysicsMainThread:
Code:
threads: array [0..MAX_THREADS-1] of TPhysicsThread;
...
stars_per_thread := count div MAX_THREADS;
for i := low(threads) to high(threads) do
  threads[i] := TPhysicsThread.Create(FParent, i * stars_per_thread, (i + 1) * stars_per_thread - 1);
for i := low(threads) to high(threads) do begin
  threads[i].WaitFor;
  threads[i].Free;
end;
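For anyone who wants to try the create-then-join pattern outside the demo, here is a self-contained Free Pascal sketch. TWorker is a hypothetical stand-in for TPhysicsThread (it just sums the indices in its range), and MAX_THREADS and COUNT are made-up values.

```pascal
program JoinDemo;
{$mode objfpc}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

const
  MAX_THREADS = 4;  // hypothetical values for illustration
  COUNT = 1000;

type
  // Hypothetical stand-in for TPhysicsThread: sums the indices in its range.
  TWorker = class(TThread)
  private
    FFirst, FLast: Integer;
  protected
    procedure Execute; override;
  public
    Sum: Int64;
    constructor Create(AFirst, ALast: Integer);
  end;

constructor TWorker.Create(AFirst, ALast: Integer);
begin
  // Set the range before the thread can start running.
  FFirst := AFirst;
  FLast := ALast;
  // FreeOnTerminate keeps its default (False), so WaitFor and Free are safe.
  inherited Create(False);
end;

procedure TWorker.Execute;
var
  i: Integer;
begin
  Sum := 0;
  for i := FFirst to FLast do
    Sum := Sum + i;
end;

var
  workers: array[0..MAX_THREADS - 1] of TWorker;
  perThread, i: Integer;
  total: Int64;
begin
  perThread := COUNT div MAX_THREADS;
  for i := Low(workers) to High(workers) do
    workers[i] := TWorker.Create(i * perThread, (i + 1) * perThread - 1);

  total := 0;
  for i := Low(workers) to High(workers) do
  begin
    workers[i].WaitFor;  // blocks until Execute has returned
    total := total + workers[i].Sum;
    workers[i].Free;
  end;
  WriteLn('sum of 0..', COUNT - 1, ' = ', total);
end.
```

The main thread blocks in WaitFor instead of polling, so no shared thread counter and no Sleep loop are needed.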
Last edited by imcold; 14-06-2013 at 06:22 PM.
Wow, this became quite a practice session for threads. You are right again. With this change I could also test what effect 32 threads would have, and as expected, it seemed just as fast as 4 or 8 threads. There is also a minor difference in the creation, to support odd particle counts as well:
Code:
for i:=0 to MAX_THREADS-2 do
  pt[i]:=TPhysicsThread.Create(FParent, i*tcount, (i+1)*tcount-1);
pt[MAX_THREADS-1]:=TPhysicsThread.Create(FParent, (MAX_THREADS-2)*tcount, count-2);
A new version is also uploaded, with an extra test in the rendering loop commented out. I tried what it looks like when the full-screen clear is replaced with a slow fade to black. Particles can leave a kind of trail when they move. I'm not sure if it looks better than the original though, so that's why it's commented out.
Also (a bit akin to what Dan suggested) you can play with the condition that decides when to apply the force to a particle:
Code:
if (d > 0) and (d < DISTANCE_TO_IGNORE)
Setting the distance to 15 or so gives a nice speedup in later stages, when the particles spread out.
Smaller distances (try 5) will give you more speedup in early stages but have a more noticeable effect on particle behavior (less tight clusters).
Larger distances (100) cause a slowdown in early stages.
Dynamically tuning the "ignore distance" would probably lead to the best results (2x faster execution shouldn't be a problem).
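A rough Free Pascal sketch of how such a cutoff might sit inside a pair loop. The star layout, the G_FORCE value and the way the force is applied (dx * f) are assumptions for illustration, not the demo's actual code; only the condition itself comes from the post above.

```pascal
program CutoffDemo;
{$mode objfpc}

const
  G_FORCE = 0.5;              // hypothetical value
  DISTANCE_TO_IGNORE = 15.0;  // the cutoff suggested above
  N = 4;

type
  TStar = record
    x, y, ax, ay: Double;
  end;

var
  stars: array[0..N - 1] of TStar;
  i, j, applied, skipped: Integer;
  dx, dy, d, f: Double;
begin
  // Hypothetical layout: three stars close together, one far away.
  for i := 0 to N - 1 do
  begin
    stars[i].x := i * 2.0;
    stars[i].y := 0;
    stars[i].ax := 0;
    stars[i].ay := 0;
  end;
  stars[N - 1].x := 100.0;

  applied := 0;
  skipped := 0;
  for i := 0 to N - 2 do
    for j := i + 1 to N - 1 do
    begin
      dx := stars[j].x - stars[i].x;
      dy := stars[j].y - stars[i].y;
      d := Sqrt(dx * dx + dy * dy);
      if (d > 0) and (d < DISTANCE_TO_IGNORE) then
      begin
        // Assumed force model: inverse-square pull along the line between stars.
        f := G_FORCE / (d * d * d);
        stars[i].ax := stars[i].ax + dx * f;
        stars[i].ay := stars[i].ay + dy * f;
        stars[j].ax := stars[j].ax - dx * f;
        stars[j].ay := stars[j].ay - dy * f;
        Inc(applied);
      end
      else
        Inc(skipped);  // too far away (or coincident): no force applied
    end;

  WriteLn('pairs with force applied: ', applied);
  WriteLn('pairs skipped by cutoff:  ', skipped);
end.
```

With this layout, three of the six pairs fall inside the cutoff and three are skipped. The distance itself still has to be computed for every pair, so the saving is only the work inside the if-branch.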
As for more calculation threads than cores: if threads with similar workloads start competing for resources, you'll get slowdowns from the overheads associated with threads, scheduling, context switching, cache thrashing and so on.
Edit: you have an error in the last statement: (MAX_THREADS-2) should be (MAX_THREADS-1), otherwise the last thread will get more stars than it should (including the same ones as the previous thread), not to mention it won't work with one thread.
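A small Free Pascal sketch of a split where the last thread takes the remainder, printing the ranges instead of creating threads. COUNT and MAX_THREADS are made-up values, and the sketch assumes every index from 0 to COUNT-1 should be covered (the posted code passes count-2 as the final upper bound, which is left as it is there).

```pascal
program SplitDemo;
{$mode objfpc}

const
  MAX_THREADS = 4;  // hypothetical
  COUNT = 10;       // hypothetical; deliberately not divisible by MAX_THREADS

var
  i, tcount, first, last: Integer;
begin
  tcount := COUNT div MAX_THREADS;
  for i := 0 to MAX_THREADS - 1 do
  begin
    first := i * tcount;
    if i = MAX_THREADS - 1 then
      last := COUNT - 1            // last thread also takes the remainder
    else
      last := (i + 1) * tcount - 1;
    WriteLn('thread ', i, ': stars ', first, '..', last);
  end;
end.
```

Here the ranges come out as 0..1, 2..3, 4..5 and 6..9, with no overlap, and the same loop still works when MAX_THREADS is 1.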
Last edited by imcold; 14-06-2013 at 08:25 PM. Reason: code review
Just noticed that myself. I was testing with randseed:=1 just to see if I get the same results with 1 thread or many. It crashed the moment I tried to set it to 1 thread. Also, 1 thread didn't result in the same kind of universe as 2 or more threads.
But that's fixed now. I added more debug info, such as how many physics ticks are calculated and how many threads are used. Changed the visuals slightly, so that blue particles aren't add-blended, and there is the slow fade-to-black only when it is unpaused.
The visuals are really cool. Final note on the threads: the load isn't distributed evenly between them; given the same number of stars, the one with the lower starting index does more loop iterations. That's most likely why you see a speedup with more threads: the bigger the discrepancy, the more difference the thread count makes. Also, call physicsthread.WaitFor in formClose; there's no need for a thread counter at all.
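To put rough numbers on the imbalance, here is a Free Pascal sketch that counts inner-loop iterations per thread. It assumes each star i is paired only with stars i+1..COUNT-1 (which would explain why a lower starting index means more work); the thread doesn't show the demo's actual loop, and COUNT and MAX_THREADS are made-up values.

```pascal
program LoadDemo;
{$mode objfpc}

const
  COUNT = 1000;     // hypothetical star count
  MAX_THREADS = 4;  // hypothetical

var
  t, i, perThread, first, last: Integer;
  work: Int64;
begin
  perThread := COUNT div MAX_THREADS;
  for t := 0 to MAX_THREADS - 1 do
  begin
    first := t * perThread;
    last := first + perThread - 1;
    work := 0;
    // Assumed inner loop: j runs from i+1 to COUNT-1 for each star i.
    for i := first to last do
      work := work + (COUNT - 1 - i);
    WriteLn('thread ', t, ' (stars ', first, '..', last, '): ',
      work, ' pair iterations');
  end;
end.
```

Under that assumption, thread 0 gets 218625 iterations and thread 3 only 31125, about seven times fewer, although both own 250 stars. Giving the low-index threads fewer stars, or interleaving indices between threads, would even this out.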