Quote: Originally posted by Chebmaster
Just don't forget: on modern hardware this vintage method is no faster (x86) or even three times *slower* (Raspberry Pi) than an honest 1/sqrt(x), while the dedicated SSE instruction RSQRTPS leaves it in the dust, being more than 8 times faster (although it is not deterministic), because it does that trick *in hardware*, operating on 4 floats in parallel.

See my research results @ http://openarena.ws/board/index.php?topic=5379.0

So, be wary of stale methodologies and remember: modern hardware demands different optimizations!
This is gold, I didn't even consider this. And yeah, we have entered the era of 8- and 16-core processors with massive IPC and internal optimizations, so I guess it is time to put some of these old methods to rest.