Fast Normalisation



Nitrogen
15-01-2007, 09:55 AM
I'm busy writing my particle engine and it's nearly done but there is one thing I'm hoping to eliminate:

You know when you emit particles all with a random velocity, if you are using normal X,Y,Z coordinates they tend to emit outwards in a cube-like fashion?
Those particles at the corners of this 'cube' have the most extreme X, Y and Z values and so their overall magnitudes are a little higher than the rest... I'm looking for a nice spherical emission pattern...

Now, the one way to solve this is to normalise the velocity vector every time you calculate or change it, and then figure out some magnitude to apply to the vector, which is a bit of a drag.

So can anyone either think up a better way around this problem without changing my data structures? Everything works off X,Y,Z coords, not Direction and Magnitude for instance..

Or better yet, think up an insanely fast normalisation algorithm? (I declare the 3rd small-fast-optimisation-competition open!)
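
For reference, the normalise-then-apply-a-magnitude approach described above can be sketched in plain Pascal as follows. This is only a baseline under stated assumptions: TVec3, MakeVec3, Length3 and WithSpeed are hypothetical names, not part of the particle engine in question.

```pascal
{$ifdef FPC}{$mode objfpc}{$endif}
type
  TVec3 = array[0..2] of Single; // hypothetical X,Y,Z velocity type

function MakeVec3(x, y, z: Single): TVec3;
begin
  Result[0] := x; Result[1] := y; Result[2] := z;
end;

function Length3(const v: TVec3): Single;
begin
  Result := Sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
end;

// Keeps the direction of v but forces its magnitude to speed.
// A zero-length vector has no direction, so it is returned unchanged.
function WithSpeed(const v: TVec3; speed: Single): TVec3;
var
  len, k: Single;
begin
  Result := v;
  len := Length3(v);
  if len > 0 then
  begin
    k := speed / len; // normalise and rescale with a single divide
    Result[0] := v[0] * k;
    Result[1] := v[1] * k;
    Result[2] := v[2] * k;
  end;
end;
```

Folding the normalisation and the new magnitude into one factor means only one Sqrt and one divide per vector, and the data stays in X,Y,Z form.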

dmantione
15-01-2007, 10:25 AM
Well, when normalizing you divide the vector by its length. Instead of calculating the length over and over again, you might cache it. For example, when you scale a normalized vector, you know the size changes by the scale factor. So instead of calculating the length, you simply divide your vector by the scale factor, resulting again in a normalized vector.
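
A minimal sketch of that caching idea, assuming the velocity record can carry one extra field for the known length. TVel, MakeVel, Scaled and Normalised are hypothetical names used only for illustration.

```pascal
{$ifdef FPC}{$mode objfpc}{$endif}
type
  TVel = record
    x, y, z: Single;
    len: Single; // cached length, kept in step with x, y, z
  end;

// The only place a square root is taken.
function MakeVel(x, y, z: Single): TVel;
begin
  Result.x := x; Result.y := y; Result.z := z;
  Result.len := Sqrt(x*x + y*y + z*z);
end;

// Scaling changes the length by the same factor, so the cache is updated
// with one multiply instead of being recomputed.
function Scaled(const v: TVel; s: Single): TVel;
begin
  Result.x := v.x * s;
  Result.y := v.y * s;
  Result.z := v.z * s;
  Result.len := v.len * Abs(s);
end;

// Normalises by dividing by the cached length; no Sqrt needed.
function Normalised(const v: TVel): TVel;
var
  inv: Single;
begin
  Result := v;
  if v.len > 0 then
  begin
    inv := 1.0 / v.len;
    Result.x := v.x * inv;
    Result.y := v.y * inv;
    Result.z := v.z * inv;
    Result.len := 1.0;
  end;
end;
```

The cache only stays valid if every operation that changes x, y or z also updates len. Operations that change the length unpredictably, such as adding an acceleration, would still need a fresh Sqrt.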

JSoftware
15-01-2007, 10:56 AM
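type
  tvector4f = record
    x, y, z, w: single;
  end;

function Normalize(vec: tvector4f): tvector4f;
asm
  movups xmm0, [vec]      // xmm0 = (x, y, z, w)
  movaps xmm1, xmm0
  mulps xmm1, xmm1        // xmm1 = (x*x, y*y, z*z, w*w)
  movaps xmm2, xmm1
  shufps xmm2,xmm2, $39   // rotate: lane 0 of xmm2 = y*y
  addss xmm1, xmm2        // lane 0 = x*x + y*y
  shufps xmm2,xmm2, $39   // rotate again: lane 0 of xmm2 = z*z
  addss xmm1, xmm2        // lane 0 = x*x + y*y + z*z (w is ignored)
  sqrtss xmm1, xmm1       // lane 0 = length
  shufps xmm1, xmm1, $00  // broadcast the length to all lanes
  divps xmm0, xmm1        // divide every component by the length
  movups [result], xmm0
end;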

I cheated with SSE 8)

Setharian
15-01-2007, 07:04 PM
slightly faster probably...

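function Normalize(vec: tvector4f): tvector4f;
asm
  movups xmm0, [vec]      // xmm0 = (x, y, z, w)
  movaps xmm2, xmm0       // keep a copy of the original vector
  mulps xmm0, xmm0        // xmm0 = (x*x, y*y, z*z, w*w)
  movaps xmm1, xmm0
  shufps xmm0, xmm1, $4E  // swap halves: (z*z, w*w, x*x, y*y)
  addps xmm0, xmm1        // (x*x+z*z, y*y+w*w, x*x+z*z, y*y+w*w)
  movaps xmm1, xmm0
  shufps xmm1, xmm1, $11  // swap the neighbouring sums
  addps xmm0, xmm1        // every lane = x*x + y*y + z*z + w*w
  rsqrtps xmm0, xmm0      // approximate 1/length in every lane (w is included)
  mulps xmm2, xmm0        // scale the original vector
  movups [result], xmm2
end;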

Nitrogen
15-01-2007, 07:45 PM
Does it have to be a 4D vector?
What do I have to do to make it work with this?

type vector = array[0..2] of single;

JSoftware
15-01-2007, 08:01 PM
Hmm, good question. I just tried three components and SSE seems to be pretty slow with the code above.

I thought that SSE would check boundaries when you used movups, but it seems it doesn't.

@Setharian, my initial benchmarks show you beat me by 7% :P
Seems I need to redesign some of the other SSE functions in my vector library.

Edit: wait a minute. What's going on in my code...
Edit2: Further optimizing got me this superfast code :D

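function Normalize3(vec: tvector4f): tvector4f;
asm
  movups xmm0, [vec]      // xmm0 = (x, y, z, w)
  movaps xmm3, xmm0       // keep a copy of the original vector
  mulps xmm0, xmm0        // xmm0 = (x*x, y*y, z*z, w*w)
  shufps xmm1, xmm0, $00  // lane 2 of xmm1 = x*x (lanes 0-1 come from the old xmm1 contents)
  shufps xmm2, xmm0, $10  // lane 2 of xmm2 = y*y (lanes 0-1 come from the old xmm2 contents)
  addps xmm1, xmm0        // lane 2 = x*x + z*z
  addps xmm2, xmm1        // lane 2 = x*x + y*y + z*z
  rsqrtps xmm2, xmm2      // approximate reciprocal square roots
  shufps xmm2, xmm2, $AA  // broadcast lane 2 (1/length) to all lanes
  mulps xmm3, xmm2        // scale the original vector
  movups [result], xmm3
end;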

You will need a fourth component to use SSE. If you use Turbo Delphi (or FPC, or any Pascal dialect with operator overloading), you could create an implicit conversion overload on a record that transparently creates a four-component vector from a three-component one, and back again.
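
A minimal sketch of such an implicit conversion, written with Free Pascal's assignment-operator overloading (Delphi would instead use class operator Implicit inside the record). TVector3f, Vec3 and Widen are hypothetical names, and a record is assumed for the three-component type rather than the array[0..2] of single asked about.

```pascal
{$mode objfpc}
type
  TVector3f = record x, y, z: Single; end;    // hypothetical 3-component type
  TVector4f = record x, y, z, w: Single; end; // same layout as tvector4f above

// Implicit 3 -> 4 conversion. w is padded with 0 so it cannot
// contribute to a length computed over all four lanes.
operator := (const v: TVector3f) r: TVector4f;
begin
  r.x := v.x; r.y := v.y; r.z := v.z; r.w := 0;
end;

// Implicit 4 -> 3 conversion simply drops w.
operator := (const v: TVector4f) r: TVector3f;
begin
  r.x := v.x; r.y := v.y; r.z := v.z;
end;

function Vec3(x, y, z: Single): TVector3f;
begin
  Result.x := x; Result.y := y; Result.z := z;
end;

// The plain assignment below goes through the overloaded operator.
function Widen(const v: TVector3f): TVector4f;
begin
  Result := v;
end;
```

The conversion copies the vector on every call, so part of the gain from the SSE routine is spent on the padding. Storing particles as four floats in the first place would avoid that, at the cost of changing the data structure.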

Mirage
16-01-2007, 04:44 PM
Nitrogen wrote: "Now, the one way to solve this is to normalise the velocity vector every time you calculate or change it,"

I think there is no need to normalise on each change.
Emit all the particles with the same speed once and they will fly nicely.

If you don't want to use SSE, consider the inverse square root (1/SQRT(x)):
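function InvSqrt(x: Single): Single;
var tmp: LongWord;
begin
  asm
    mov eax, OneAsInt   // OneAsInt and OneAsInt2 are integer constants not shown in this post
    sub eax, x          // subtract the raw bit pattern of x
    add eax, OneAsInt2
    shr eax, 1          // halve to get the bit pattern of a first estimate
    mov tmp, eax
  end;
  // Reinterpret tmp as a Single and refine the estimate with one correction step.
  Result := Single((@tmp)^) * (1.47 - 0.47 * x * Single((@tmp)^) * Single((@tmp)^));
end;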
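
Since OneAsInt and OneAsInt2 are not defined in the post, the routine above cannot be compiled as given. As a stand-in, here is a sketch of the widely known fast inverse square root that uses the magic constant $5F3759DF and one Newton-Raphson step. It is a different variant of the same bit trick, not necessarily what the constants above produce, and FastInvSqrt is a hypothetical name.

```pascal
{$ifdef FPC}{$mode objfpc}{$endif}
// Approximates 1/Sqrt(x) for positive, finite x.
function FastInvSqrt(x: Single): Single;
var
  i: LongWord;
  y: Single;
begin
  i := PLongWord(@x)^;                   // reinterpret the float's bits as an integer
  i := $5F3759DF - (i shr 1);            // magic-constant first estimate
  y := PSingle(@i)^;                     // reinterpret the bits back as a float
  Result := y * (1.5 - 0.5 * x * y * y); // one Newton-Raphson refinement
end;
```

To normalise a vector with it, multiply each component by FastInvSqrt(x*x + y*y + z*z). The result is only approximately unit length (the error after one refinement step is a fraction of a percent), which is usually acceptable for particle velocities.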


Nitrogen wrote: "So can anyone either think up a better way around this problem without changing my data structures? Everything works off X,Y,Z coords, not Direction and Magnitude for instance.."

Also, you can set the velocity in spherical coordinates. Just generate two random angles, phi and theta, and construct a velocity vector:
Phi := Random*Pi*2; Theta := Random*Pi;
Velocity := GetVector3s(Cos(Phi)*Sin(Theta)*Speed, Cos(Theta)*Speed, Sin(Phi)*Sin(Theta)*Speed);

Sin/Cos values can be looked up in a precalculated table.
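
A self-contained variation on the spherical-coordinate idea above, as a sketch only. It picks cos(theta) uniformly in [-1, 1] instead of picking theta uniformly, which should spread directions evenly over the sphere rather than bunching them towards the poles. TVec3, SpeedOf and RandomVelocity are hypothetical names.

```pascal
{$ifdef FPC}{$mode objfpc}{$endif}
type
  TVec3 = array[0..2] of Single; // hypothetical X,Y,Z velocity type

function SpeedOf(const v: TVec3): Single;
begin
  Result := Sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
end;

// Returns a velocity of the given speed in a random direction.
function RandomVelocity(speed: Single): TVec3;
var
  phi, cosT, sinT: Single;
begin
  phi := Random * 2 * Pi;         // angle around the Y axis
  cosT := 2 * Random - 1;         // uniform in [-1, 1]
  sinT := Sqrt(1 - cosT * cosT);  // matching sine, never negative
  Result[0] := Cos(phi) * sinT * speed;
  Result[1] := cosT * speed;
  Result[2] := Sin(phi) * sinT * speed;
end;
```

Every vector produced this way already has magnitude speed, so no normalisation is needed at emission time. It costs one Sqrt, one Sin and one Cos per particle, and the Sin/Cos pair could still come from a lookup table.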

Nitrogen
16-01-2007, 09:54 PM
Ok cool, I'll try it...

Thanks guys.

This is what the particle system is looking like so far:
I've just got reemitters working which spawn sub-particles off of the main particles.

http://www.nitrogen.za.org/gallery/Particles4.jpg