Nice demo! A few notes:
- too many ops per star; cutting some of them helps, for example
Code:
d := G_FORCE/(d*d*d);
- division is one of the slowest ops you can do (just this one-line change makes it 20% faster)
- there's no point in using more than 2 calculation threads on a dual-core; more threads than cores actually makes it slower
- Application.ProcessMessages is an invitation for reentrancy problems; at the very least use an interlocked decrement on the thread counter (joining the threads would be better; Sleep really isn't a synchronization primitive)
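To make the division note concrete, here is a minimal sketch of what I mean. It assumes d starts out as the distance between two stars and that the result is then multiplied into the x/y offsets; the Accel procedure, the mass parameter and the G_FORCE value are made up for illustration, not taken from the demo.

```pascal
program ForceSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  G_FORCE = 1.0; // placeholder value, not the demo's constant

// Hypothetical helper: acceleration on a star caused by another star
// of the given mass at offset (dx, dy).
procedure Accel(dx, dy, mass: Double; var ax, ay: Double);
var
  d, k: Double;
begin
  d := Sqrt(dx * dx + dy * dy);
  // The only division per star pair: fold G, the mass and 1/d^3
  // into one factor, then use multiplications from here on.
  k := G_FORCE * mass / (d * d * d);
  ax := dx * k;
  ay := dy * k;
end;

var
  ax, ay: Double;
begin
  ax := 0;
  ay := 0;
  Accel(3.0, 4.0, 1.0, ax, ay);
  Writeln(ax:0:3, ' ', ay:0:3);
end.
```

With the offsets (3, 4) the distance is 5, so the factor is 1/125 and the program prints 0.024 0.032. The point is only that normalising the direction and applying the inverse-square law collapse into a single divide.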
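For the threading notes, something along these lines is what I would try instead of a shared counter plus Sleep. It is only a sketch: TCalcThread, THREAD_COUNT and STAR_COUNT are hypothetical names, the Execute body is a placeholder, and newer Delphi versions may need System.Classes in the uses clause.

```pascal
program JoinSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF} // only needed for Free Pascal on Unix
  Classes;

const
  THREAD_COUNT = 2;    // assumption: one calculation thread per core
  STAR_COUNT   = 1000; // hypothetical number of stars

type
  TCalcThread = class(TThread)
  private
    FFirst, FLast: Integer; // range of stars this thread owns
  protected
    procedure Execute; override;
  public
    constructor Create(AFirst, ALast: Integer);
  end;

constructor TCalcThread.Create(AFirst, ALast: Integer);
begin
  // Set the fields before the thread can start running.
  FFirst := AFirst;
  FLast := ALast;
  inherited Create(False);
end;

procedure TCalcThread.Execute;
begin
  // Placeholder: update stars FFirst..FLast here.
end;

var
  Threads: array[0..THREAD_COUNT - 1] of TCalcThread;
  i, Chunk: Integer;
begin
  Chunk := STAR_COUNT div THREAD_COUNT;
  for i := 0 to THREAD_COUNT - 1 do
    Threads[i] := TCalcThread.Create(i * Chunk, (i + 1) * Chunk - 1);

  // Join: WaitFor blocks until the thread's Execute has returned,
  // so no counter, no Sleep and no ProcessMessages loop is needed.
  for i := 0 to THREAD_COUNT - 1 do
  begin
    Threads[i].WaitFor;
    Threads[i].Free;
  end;
  Writeln('all calculation threads finished');
end.
```

In a GUI app you would not want to block the main thread in WaitFor for long, so there the join would probably live in a coordinating thread that notifies the form when the step is done.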