I was able to incorporate (4) + (5) without any trouble at all. (3) caused a small issue with the offset of the image's dst orientation until I factored RX and RY into it. Works fine now.
Great!

(1) I didn't fully understand... I think I need it explained to me in detail.
This is closely connected to your remark that you need to speed up SDL_AddPixel (more about this later). My idea was to call SDL_GetPixels, so that you get all the pixels of the specified rectangle as DWORDs and do your calculations in 32-bit format. After that, you call SDL_AddPixels, which transforms the pixels back from 32-bit to whatever format you use.
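To make the idea a bit more concrete, here is a minimal sketch of the "convert once, calculate in 32-bit, convert back once" pattern. It assumes a 16-bit RGB565 source, and Unpack565 / Pack565 are hypothetical stand-ins for the per-format work that SDL_GetPixels and SDL_AddPixels would do; it uses plain arrays instead of a real surface.

```pascal
program BatchPixelSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical helper: expand one RGB565 pixel to a $00RRGGBB DWORD.
function Unpack565(p: Word): Cardinal;
var
  r, g, b: Cardinal;
begin
  r := (p shr 11) and $1F;
  g := (p shr 5) and $3F;
  b := p and $1F;
  // Replicate the top bits into the low bits so full intensity maps to $FF.
  r := (r shl 3) or (r shr 2);
  g := (g shl 2) or (g shr 4);
  b := (b shl 3) or (b shr 2);
  Result := (r shl 16) or (g shl 8) or b;
end;

// Hypothetical helper: pack a $00RRGGBB DWORD back into RGB565
// by dropping the low bits of each channel.
function Pack565(c: Cardinal): Word;
begin
  Result := Word((((c shr 19) and $1F) shl 11) or
                 (((c shr 10) and $3F) shl 5) or
                 ((c shr 3) and $1F));
end;

var
  Src: array[0..3] of Word = ($F800, $07E0, $001F, $FFFF);
  Tmp: array[0..3] of Cardinal;
  i: Integer;
begin
  // Step 1: one conversion pass into 32-bit (the SDL_GetPixels idea).
  for i := 0 to 3 do
    Tmp[i] := Unpack565(Src[i]);
  // Step 2: all the math in a single format, here halving each channel.
  for i := 0 to 3 do
    Tmp[i] := (Tmp[i] shr 1) and $7F7F7F;
  // Step 3: one conversion pass back (the SDL_AddPixels idea).
  for i := 0 to 3 do
    Src[i] := Pack565(Tmp[i]);
  for i := 0 to 3 do
    WriteLn(Src[i]);
end.
```

The point is that the format-specific code runs once per pixel on the way in and once on the way out, while the calculation in the middle never has to care about the surface format.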

And (2) I understand, but haven't been able to properly incorporate it without major issues (the function stops working altogether so far).
Hm, try using more bits for precision. Here's a more refactored version:

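const
  PRECISION_BITS = 10;                          // fractional bits of the fixed-point format
  PRECISION_ONE  = 1 shl PRECISION_BITS;        // 1.0 in fixed point
  PRECISION_HALF = 1 shl (PRECISION_BITS - 1);  // 0.5 in fixed point, added for rounding

// Pre-scale the table values to fixed point once, outside the pixel loop.
aCos := Round(degCOS[Angle] * PRECISION_ONE);
aSin := Round(degSIN[Angle] * PRECISION_ONE);

// Rotate in integers, then shift back down. Note: shr is a logical shift,
// so this is only correct while the bracketed sum is non-negative.
NX := mx + (RX * aSin + RY * aCos + PRECISION_HALF) shr PRECISION_BITS;
NY := my + (RY * aSin - RX * aCos + PRECISION_HALF) shr PRECISION_BITS;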

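Now, try using 15 or 16 bits, but keep in mind that every intermediate result has to fit inside a signed 32-bit integer: the more bits you spend on precision, the smaller RX and RY are allowed to be before the products overflow.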
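Two things are easy to get wrong here, so here is a small sketch that checks both. FixedRound and MaxCoordSum are hypothetical helpers, not part of your code. The first rounds a fixed-point value back to an integer and treats negative values separately, since shr on a signed integer is a logical shift in Delphi and Free Pascal. The second gives a conservative limit for |RX| + |RY| at a given precision, assuming |aSin| and |aCos| never exceed PRECISION_ONE.

```pascal
program FixedPointSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

const
  PRECISION_BITS = 10;
  PRECISION_ONE  = 1 shl PRECISION_BITS;
  PRECISION_HALF = 1 shl (PRECISION_BITS - 1);

// Hypothetical helper: round a fixed-point value to the nearest integer.
// Negative values are negated first, so shr only ever sees a
// non-negative operand.
function FixedRound(v: LongInt): LongInt;
begin
  if v >= 0 then
    Result := (v + PRECISION_HALF) shr PRECISION_BITS
  else
    Result := -((-v + PRECISION_HALF) shr PRECISION_BITS);
end;

// Hypothetical helper: largest |RX| + |RY| for which
// RX * aSin + RY * aCos + HALF still fits in a signed 32-bit integer.
function MaxCoordSum(Bits: Integer): LongInt;
begin
  Result := (High(LongInt) - (1 shl (Bits - 1))) shr Bits;
end;

var
  aSin, aCos, RX, RY: LongInt;
begin
  // 30 degrees, computed directly instead of using the degSIN/degCOS tables.
  aSin := Round(Sin(30 * Pi / 180) * PRECISION_ONE);
  aCos := Round(Cos(30 * Pi / 180) * PRECISION_ONE);
  RX := -100;
  RY := 50;
  // The sum is negative here, which is the case a plain shr gets wrong.
  WriteLn('offset:        ', FixedRound(RX * aSin + RY * aCos));
  WriteLn('limit 10 bits: ', MaxCoordSum(10));
  WriteLn('limit 16 bits: ', MaxCoordSum(16));
end.
```

With 16 bits the limit works out to 32767, so it is still fine for normal image sizes, but there is not much headroom left beyond that.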

There is one thing I'd like to try... and that's to replace SDL_AddPixel from SDL_RotateDeg_AddAlpha() to try and optimize the calculation of the RGB values. Something I noticed was that the most intensive part of both of these functions is the calculation of the new color. And here I'm trying to do 2 sets of calculations instead of just one.
One thing you should definitely do is remove the case statement inside your SDL_AddPixel. One way to do this is to have

type TAddPixelProc = procedure( DstSurface : PSDL_Surface; x : cardinal; y : cardinal; Color : cardinal );

and inside the format structure, you have

AddPixelProc: TAddPixelProc;

So, instead of calling SDL_AddPixel(...), you would call

DstSurface.format.AddPixelProc(...)

This way, you eliminate the format check that is otherwise performed for every pixel. Of course, you'll need format-specific versions: SDL_AddPixel8, SDL_AddPixel15, ...
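Here is a rough, self-contained sketch of that dispatch. TMockSurface is a made-up stand-in for the real surface and format records, AddPixel8 and AddPixel16 are simplified saturating adds (8-bit treated as a single channel, 16-bit assumed to be RGB565), and InitSurface is a hypothetical setup routine. The only part that matters is that the case statement runs once at setup instead of once per pixel.

```pascal
program AddPixelDispatchSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  PMockSurface = ^TMockSurface;
  TAddPixelProc = procedure(Dst: PMockSurface; x, y, Color: Cardinal);
  // Made-up stand-in for the surface plus its format structure.
  TMockSurface = record
    Pitch: Cardinal;              // bytes per row
    Pixels: array of Byte;
    AddPixelProc: TAddPixelProc;  // chosen once, when the surface is set up
  end;

// Add two channel values and clamp the result to Max.
function SatAdd(a, b, Max: Cardinal): Cardinal;
begin
  Result := a + b;
  if Result > Max then
    Result := Max;
end;

procedure AddPixel8(Dst: PMockSurface; x, y, Color: Cardinal);
var
  idx: Cardinal;
begin
  idx := y * Dst^.Pitch + x;
  Dst^.Pixels[idx] := Byte(SatAdd(Dst^.Pixels[idx], Color and $FF, $FF));
end;

procedure AddPixel16(Dst: PMockSurface; x, y, Color: Cardinal);
var
  idx, old, r, g, b: Cardinal;
begin
  idx := y * Dst^.Pitch + x * 2;
  old := Dst^.Pixels[idx] or (Cardinal(Dst^.Pixels[idx + 1]) shl 8);
  r := SatAdd((old shr 11) and $1F, (Color shr 11) and $1F, $1F);
  g := SatAdd((old shr 5) and $3F, (Color shr 5) and $3F, $3F);
  b := SatAdd(old and $1F, Color and $1F, $1F);
  old := (r shl 11) or (g shl 5) or b;
  Dst^.Pixels[idx] := Byte(old and $FF);
  Dst^.Pixels[idx + 1] := Byte(old shr 8);
end;

// The only place where the format is inspected.
procedure InitSurface(var S: TMockSurface; W, H, Bits: Integer);
begin
  S.Pitch := W * (Bits div 8);
  SetLength(S.Pixels, S.Pitch * H);
  case Bits of
    8:  S.AddPixelProc := AddPixel8;
    16: S.AddPixelProc := AddPixel16;
  else
    S.AddPixelProc := nil;
  end;
end;

var
  Surf: TMockSurface;
  i: Integer;
begin
  InitSurface(Surf, 4, 1, 16);
  // The inner loop calls through the pointer: no case, no format check.
  for i := 0 to 2 do
    Surf.AddPixelProc(@Surf, 0, 0, $0821);  // adds 1 to each of R, G and B
  WriteLn(Surf.Pixels[0] or (Surf.Pixels[1] shl 8));
end.
```

An indirect call is not free either, so it is worth timing this against the case version; the bigger win usually comes when the whole inner loop is specialized per format, not just the single-pixel routine.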

Hope this will help a bit