Fast Normalisation

Nitrogen

15-01-2007, 09:55 AM

I'm busy writing my particle engine and it's nearly done but there is one thing I'm hoping to eliminate:

You know when you emit particles all with a random velocity, if you are using normal X,Y,Z coordinates they tend to emit outwards in a cube-like fashion?

Those particles at the corners of this 'cube' have the most extreme X, Y and Z values and so their overall magnitudes are a little higher than the rest... I'm looking for a nice spherical emission pattern...

Now, the one way to solve this is to normalise the velocity vector every time you calculate or change it, and then figure out some magnitude to apply to the vector, which is a bit of a drag.

So can anyone either think up a better way around this problem without changing my data structures? Everything works off X,Y,Z coords, not Direction and Magnitude for instance..

Or better yet, think up an insanely fast normalisation algorithm? (I declare the 3rd small-fast-optimisation-competition open!)

dmantione

15-01-2007, 10:25 AM

Well, when normalizing you divide the vector by its length. Instead of calculating the length over and over again, you might cache it. For example, when you scale a normalized vector, you know the size changes by the scale factor. So instead of calculating the length, you simply divide your vector by the scale factor, resulting again in a normalized vector.
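
A minimal sketch of the caching idea above, assuming the speed is kept in a separate variable next to the vector. The names Scale, Dir and Speed are made up for illustration and do not come from anyone's engine.

```pascal
program CachedLengthDemo;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TVector3 = array[0..2] of Single;

// Multiply every component by s.
procedure Scale(var v: TVector3; s: Single);
var
  i: Integer;
begin
  for i := 0 to 2 do
    v[i] := v[i] * s;
end;

var
  Dir: TVector3;
  Speed: Single;   // cached length, stored alongside the vector
begin
  Dir[0] := 0.6; Dir[1] := 0.0; Dir[2] := 0.8;   // already unit length
  Speed := 5.0;

  Scale(Dir, Speed);         // length is now Speed; no Sqrt needed to know that
  Scale(Dir, 1.0 / Speed);   // back to unit length using the cached value

  Writeln(Dir[0]:0:3, ' ', Dir[1]:0:3, ' ', Dir[2]:0:3);
end.
```

This only works while every change to the vector is a pure scale. As soon as something like gravity adds to the velocity, the cached length is stale and has to be recomputed.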

JSoftware

15-01-2007, 10:56 AM

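type
  tvector4f = record
    x, y, z, w: single;
  end;

function Normalize(vec: tvector4f): tvector4f;
asm
  movups xmm0, [vec]        // xmm0 = (x, y, z, w)
  movaps xmm1, xmm0
  mulps  xmm1, xmm1         // xmm1 = (x*x, y*y, z*z, w*w)
  movaps xmm2, xmm1
  shufps xmm2, xmm2, $39    // rotate lanes: lane 0 now holds y*y
  addss  xmm1, xmm2         // lane 0 = x*x + y*y
  shufps xmm2, xmm2, $39    // rotate again: lane 0 now holds z*z
  addss  xmm1, xmm2         // lane 0 = x*x + y*y + z*z (w is ignored)
  sqrtss xmm1, xmm1         // lane 0 = length
  shufps xmm1, xmm1, $00    // broadcast the length to all four lanes
  divps  xmm0, xmm1         // divide every component, including w, by the length
  movups [result], xmm0
end;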

I cheated with SSE 8)

Setharian

15-01-2007, 07:04 PM

slightly faster probably...

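function Normalize(vec: tvector4f): tvector4f;
asm
  movups  xmm0, [vec]
  movaps  xmm2, xmm0        // keep a copy of the original vector
  mulps   xmm0, xmm0        // xmm0 = (x*x, y*y, z*z, w*w)
  movaps  xmm1, xmm0
  shufps  xmm0, xmm1, $4E   // swap halves: xmm0 = (z*z, w*w, x*x, y*y)
  addps   xmm0, xmm1        // (x*x+z*z, y*y+w*w, x*x+z*z, y*y+w*w)
  movaps  xmm1, xmm0
  shufps  xmm1, xmm1, $11   // swap neighbouring lanes
  addps   xmm0, xmm1        // every lane = x*x + y*y + z*z + w*w, so w must be 0 for a 3D vector
  rsqrtps xmm0, xmm0        // approximate 1/sqrt (about 12 bits of precision)
  mulps   xmm2, xmm0        // scale the original vector
  movups  [result], xmm2
end;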

Nitrogen

15-01-2007, 07:45 PM

Does it have to be a 4D vector?

What do I have to do to make it work with

type vector = array[0..2] of single;

JSoftware

15-01-2007, 08:01 PM

Hmm, good question. I just tried three components, and SSE seems to be pretty slow with the code above.

I thought that SSE would check boundaries when you used movups, but it seems it doesn't.

@Setharian, my initial benchmarks show you beat me by 7% :P

Seems I need to redesign some of the other SSE functions in my vector library.

Edit: wait a minute. What's going on in my code...

Edit2: Further optimizing got me this superfast code :D

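function Normalize3(vec: tvector4f): tvector4f;
asm
  movups  xmm0, [vec]
  movaps  xmm3, xmm0        // keep a copy of the original vector
  mulps   xmm0, xmm0        // xmm0 = (x*x, y*y, z*z, w*w)
  shufps  xmm1, xmm0, $00   // lanes 2 and 3 = x*x; lanes 0 and 1 come from the old xmm1
  shufps  xmm2, xmm0, $10   // lane 2 = y*y; lanes 0 and 1 come from the old xmm2
  addps   xmm1, xmm0        // lane 2 = x*x + z*z
  addps   xmm2, xmm1        // lane 2 = x*x + y*y + z*z
  rsqrtps xmm2, xmm2        // approximate 1/sqrt in every lane
  shufps  xmm2, xmm2, $AA   // broadcast lane 2, the only lane that holds the 3D sum
  mulps   xmm3, xmm2        // scale all four components
  movups  [result], xmm3
end;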

You will need a fourth component to use SSE. If you use Turbo Delphi (or FPC, or any Pascal dialect with operator overloading), you could declare implicit conversion operators on a record that transparently turn a three-component vector into a four-component one, and back again.
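
A rough sketch of what such implicit conversions could look like, assuming a Delphi version with record operators or FPC in Delphi mode. TVector3 and the plain Pascal Normalize below are stand-ins invented for this example; the SSE routines from the thread would take the place of Normalize.

```pascal
program Vec3PaddingDemo;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TVector3 = array[0..2] of Single;

  // Padded vector: w only exists to fill the fourth SSE lane.
  TVector4f = record
    x, y, z, w: Single;
    class operator Implicit(const v: TVector3): TVector4f;
    class operator Implicit(const v: TVector4f): TVector3;
  end;

class operator TVector4f.Implicit(const v: TVector3): TVector4f;
begin
  Result.x := v[0];
  Result.y := v[1];
  Result.z := v[2];
  Result.w := 0.0;   // zero padding, so a four-lane sum of squares equals the 3D one
end;

class operator TVector4f.Implicit(const v: TVector4f): TVector3;
begin
  Result[0] := v.x;
  Result[1] := v.y;
  Result[2] := v.z;  // w is dropped
end;

// Plain Pascal stand-in for the SSE code, so the sketch runs anywhere.
function Normalize(const vec: TVector4f): TVector4f;
var
  InvLen: Single;
begin
  InvLen := 1.0 / Sqrt(vec.x * vec.x + vec.y * vec.y + vec.z * vec.z);
  Result.x := vec.x * InvLen;
  Result.y := vec.y * InvLen;
  Result.z := vec.z * InvLen;
  Result.w := vec.w * InvLen;
end;

var
  v: TVector3;
begin
  v[0] := 3.0; v[1] := 0.0; v[2] := 4.0;
  v := Normalize(v);   // TVector3 -> TVector4f going in, TVector4f -> TVector3 coming out
  Writeln(v[0]:0:3, ' ', v[1]:0:3, ' ', v[2]:0:3);
end.
```

Each call copies the vector twice, so for a particle loop the conversion cost may well outweigh what SSE saves. Measuring both variants is the only way to know.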

Mirage

16-01-2007, 04:44 PM

Nitrogen wrote: "Now, the one way to solve this is to normalise the velocity vector every time you calculate or change it,"

I don't think there is any need to normalise on each change.

Emit all the particles with the same speed once, and they will fly nicely.
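
A small sketch of that idea: normalise once when the particle is emitted, then leave the velocity alone in the update loop. RandomVelocity, the time step and the frame count are assumptions made up for this example.

```pascal
program EmitOnceDemo;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TVector3 = array[0..2] of Single;

const
  TimeStep = 0.016;   // hypothetical frame time in seconds

// Hypothetical helper: random direction, rescaled once so its length equals Speed.
function RandomVelocity(Speed: Single): TVector3;
var
  i: Integer;
  Len: Single;
begin
  repeat
    for i := 0 to 2 do
      Result[i] := Random * 2.0 - 1.0;   // random point in the cube [-1, 1)
    Len := Sqrt(Sqr(Result[0]) + Sqr(Result[1]) + Sqr(Result[2]));
  until Len > 1e-6;                      // avoid dividing by (almost) zero
  for i := 0 to 2 do
    Result[i] := Result[i] * (Speed / Len);
end;

var
  Position, Velocity: TVector3;
  Frame, i: Integer;
begin
  Randomize;
  for i := 0 to 2 do
    Position[i] := 0.0;
  Velocity := RandomVelocity(2.0);       // the only normalisation in the program

  for Frame := 1 to 100 do
    for i := 0 to 2 do
      Position[i] := Position[i] + Velocity[i] * TimeStep;   // no normalisation per frame

  // The speed has not changed, because nothing in the loop touches Velocity.
  Writeln(Sqrt(Sqr(Velocity[0]) + Sqr(Velocity[1]) + Sqr(Velocity[2])):0:3);
end.
```

All particles now start on a sphere of radius Speed. The directions are still slightly denser towards the cube's corners, because the random point is drawn from a cube before it is rescaled.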

If you would rather not use SSE, consider a fast inverse square root (1/Sqrt(x)):

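function InvSqrt(x: Single): Single;
var tmp: LongWord;
begin
  // OneAsInt and OneAsInt2 are integer constants that are not defined in this post.
  asm
    mov eax, OneAsInt
    sub eax, x            // the bits of x are treated as an integer here
    add eax, OneAsInt2
    shr eax, 1            // halving gives the bit pattern of an initial guess
    mov tmp, eax
  end;
  // One refinement step on the initial guess, reading tmp back as a Single.
  Result := Single((@tmp)^) * (1.47 - 0.47 * x * Single((@tmp)^) * Single((@tmp)^));
end;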
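
For comparison, here is a pure Pascal sketch of the widely known variant of this trick, the one usually credited to Quake III. It uses that variant's magic constant $5F3759DF and a standard Newton step, not the OneAsInt constants or the 1.47/0.47 coefficients from the routine above, so treat it as a related technique rather than the same function. FastInvSqrt is a name chosen for this example.

```pascal
program InvSqrtDemo;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

// Approximate 1/Sqrt(x) for x > 0 without calling Sqrt.
function FastInvSqrt(x: Single): Single;
var
  i: LongWord;
  y: Single;
begin
  Move(x, i, SizeOf(i));           // copy the raw bits of x into an integer
  i := $5F3759DF - (i shr 1);      // integer trick: initial guess for 1/Sqrt(x)
  Move(i, y, SizeOf(y));           // copy the bits back into a float
  Result := y * (1.5 - 0.5 * x * y * y);   // one Newton-Raphson refinement step
end;

var
  x: Single;
begin
  x := 4.0;
  // Print the approximation next to the exact value for comparison.
  Writeln(FastInvSqrt(x):0:5, '  ', (1.0 / Sqrt(x)):0:5);
end.
```

With one refinement step the relative error stays well below one percent, which is normally plenty for particle velocities. On current CPUs a plain Sqrt or rsqrtss may be just as fast, so it is worth benchmarking before adopting it.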

Nitrogen wrote: "So can anyone either think up a better way around this problem without changing my data structures? Everything works off X,Y,Z coords, not Direction and Magnitude for instance.."

You can also set the velocity in spherical coordinates. Just generate two random angles, Phi and Theta, and construct a velocity vector:

Phi := Random*Pi*2; Theta := Random*Pi;

Velocity := GetVector3s(Cos(Phi)*Sin(Theta)*Speed, Cos(Theta)*Speed, Sin(Phi)*Sin(Theta)*Speed);

Sin/Cos values can be looked up in a precalculated table.
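
One caveat with drawing Theta uniformly: the directions bunch up around the two poles of the Y axis. A possible variation, sketched below, draws the Y component (the cosine of Theta) uniformly instead, which spreads directions evenly over the sphere. RandomOnSphere is a hypothetical name, and the plain array type stands in for whatever GetVector3s returns.

```pascal
program SphereEmitDemo;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TVector3 = array[0..2] of Single;

// Random velocity of length Speed, with directions spread evenly over the sphere.
function RandomOnSphere(Speed: Single): TVector3;
var
  Phi, Y, R: Single;
begin
  Phi := Random * 2.0 * Pi;    // angle around the Y axis
  Y := Random * 2.0 - 1.0;     // Cos(Theta), drawn uniformly from [-1, 1]
  R := Sqrt(1.0 - Y * Y);      // Sin(Theta), radius of the circle at height Y
  Result[0] := Cos(Phi) * R * Speed;
  Result[1] := Y * Speed;
  Result[2] := Sin(Phi) * R * Speed;
end;

var
  v: TVector3;
begin
  Randomize;
  v := RandomOnSphere(3.0);
  // The length equals Speed up to rounding, whatever the direction is.
  Writeln(Sqrt(Sqr(v[0]) + Sqr(v[1]) + Sqr(v[2])):0:3);
end.
```

This costs one Sqrt, one Sin and one Cos per emitted particle and needs no normalisation afterwards. The Sin and Cos calls can still come from a lookup table, as suggested above.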

Nitrogen

16-01-2007, 09:54 PM

Ok cool, I'll try it...

Thanks guys.

This is what the particle system is looking like so far:

I've just got reemitters working which spawn sub-particles off of the main particles.

http://www.nitrogen.za.org/gallery/Particles4.jpg
