One thing you should definitely try is to embed the code that accesses the surface memory directly into your own routine instead of calling GetPixel. The big problem with GetPixel is that it has to accept arbitrary X and Y coordinates, even though your routine only works on a small part of the surface. It therefore computes the pixel address with multiplications on every call, which wouldn't be necessary if you embedded the code in your own routine. (You only need a multiplication once per line. To move from pixel to pixel in a 16 bit mode, all you need to do is add 2 to your pointer.)
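To illustrate the pointer-walking idea, here is a rough sketch in C rather than in the library's own language. The function name `invert_rect16` and its parameters are made up for the example. It assumes you have already locked the surface and obtained its base address and its pitch (bytes per scanline), and the inversion is only a placeholder for whatever you really do per pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: invert every pixel in a w x h rectangle of a
   locked 16-bit surface. `bits` is the address of the surface's first
   pixel and `pitch` is the length of one scanline in bytes. */
static void invert_rect16(void *bits, size_t pitch, int x0, int y0, int w, int h)
{
    assert(bits != NULL && x0 >= 0 && y0 >= 0 && w >= 0 && h >= 0);

    /* The only multiplications: locate the first pixel of the rectangle. */
    uint8_t *row = (uint8_t *)bits + (size_t)y0 * pitch
                 + (size_t)x0 * sizeof(uint16_t);

    for (int y = 0; y < h; ++y) {
        uint16_t *p = (uint16_t *)row;
        for (int x = 0; x < w; ++x) {
            *p = (uint16_t)~*p;  /* stand-in for the real per-pixel work */
            ++p;                 /* next pixel: pointer moves by 2 bytes */
        }
        row += pitch;            /* next line: one addition, no multiply */
    }
}
```

In this form even the per-line multiplication disappears, because moving to the next line is just adding the pitch to the row pointer.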
What I believe is more costly than GetPixel, though, are the calculations you do to shade the pixels. You might be able to optimize those a bit by using shifts instead of divisions, or by writing the inner loop in assembler.
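As an example of the shift trick, here is a sketch in C that darkens a pixel to half brightness. It assumes the 5-6-5 layout (red in the top 5 bits, green in the middle 6, blue in the low 5). Some cards use 5-5-5 instead, in which case the mask would be different. Both function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward version: unpack, divide each channel by 2, repack. */
static uint16_t half_div565(uint16_t c)
{
    unsigned r = (c >> 11) & 0x1F;
    unsigned g = (c >> 5) & 0x3F;
    unsigned b = c & 0x1F;
    return (uint16_t)(((r / 2) << 11) | ((g / 2) << 5) | (b / 2));
}

/* Shift version: shift the whole pixel right by one, then clear the
   bits that slipped from one channel into the top of the next one
   (bit 10 from red, bit 4 from green). */
static uint16_t half_shift565(uint16_t c)
{
    return (uint16_t)((c >> 1) & 0x7BEF);
}
```

A current compiler will usually turn a division by a constant power of two into a shift by itself, so most of the gain here comes from handling all three channels in one operation instead of three.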
Another idea would be to use a 32 bit color mode instead of a 16 bit one. In 16 bit mode the color channels are packed into a single word, so modifying them generally requires unpacking and repacking with shifts and masks, whereas in 32 bit mode each channel has its own byte and that extra processing isn't necessary.
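To show the difference, here is a sketch in C that scales a pixel's brightness by `level`/256 in both modes. The names `shade565` and `shade8888` are made up, the 16 bit version assumes the 5-6-5 layout, and the 32 bit version assumes three color bytes followed by an unused or alpha byte.

```c
#include <assert.h>
#include <stdint.h>

/* 16 bit: every channel has to be extracted, scaled and packed again. */
static uint16_t shade565(uint16_t c, unsigned level)
{
    assert(level <= 256);
    unsigned r = (c >> 11) & 0x1F;
    unsigned g = (c >> 5) & 0x3F;
    unsigned b = c & 0x1F;
    r = (r * level) >> 8;
    g = (g * level) >> 8;
    b = (b * level) >> 8;
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* 32 bit: each channel is a byte of its own, so it can be scaled in
   place. `px` points at the 4 bytes of one pixel. */
static void shade8888(uint8_t *px, unsigned level)
{
    assert(px != 0 && level <= 256);
    px[0] = (uint8_t)((px[0] * level) >> 8);
    px[1] = (uint8_t)((px[1] * level) >> 8);
    px[2] = (uint8_t)((px[2] * level) >> 8);
    /* px[3] (unused or alpha) is left untouched */
}
```

Since all three color bytes are treated the same way, the sketch does not care whether the order in memory is B, G, R or R, G, B.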

As far as I know, PowerDraw uses Direct3D for drawing graphics. This will make many things faster but in general, surface pixel access does not benefit from 3D acceleration.