Ah well I'm pretty pathetic at graphics, so I prefer to get the computer to do as much work for me as possible.

I rolled back to the previous version of unDelphiX that I was using. That fixed the surface rendering problem, but I was still taking major speed hits when rendering to a surface. It seems the point I made in my original post was correct: hardware acceleration is not available on any surface other than the DXDraw's own:

Code:
TDirectDrawSurface.DrawRotate Function:
...

  If Assigned(D2D) Then Begin
    If D2D.CanUseD2D Then Begin
      If D2D.FDDraw.Surface = Self Then Begin
        D2D.D2DRenderRotateDDS(Source, X, Y, Width, Height, CenterX, CenterY, Angle, Transparent);
        Exit;
      End;
    End;
  End;

  // Else render using software
The line "If D2D.FDDraw.Surface = Self Then Begin" checks whether the surface doing the rendering (Self) is the DXDraw's own surface. If it isn't, the graphic is drawn to the surface in software, which is ludicrously slow.
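
To make the effect of that check easier to see, here is a small stand-alone console sketch of the same dispatch logic. It does not use unDelphiX at all: TMockSurface, TMockRenderer, Primary, CanUseHardware and PathFor are names I made up for illustration, with Primary standing in for D2D.FDDraw.Surface and CanUseHardware for D2D.CanUseD2D.

```pascal
program DispatchSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for a DirectDraw surface (not the unDelphiX class).
  TMockSurface = class
  public
    Name: string;
    constructor Create(const AName: string);
  end;

  // Hypothetical stand-in for the D2D helper object.
  TMockRenderer = class
  public
    Primary: TMockSurface;      // plays the role of D2D.FDDraw.Surface
    CanUseHardware: Boolean;    // plays the role of D2D.CanUseD2D
    function PathFor(Target: TMockSurface): string;
  end;

constructor TMockSurface.Create(const AName: string);
begin
  inherited Create;
  Name := AName;
end;

function TMockRenderer.PathFor(Target: TMockSurface): string;
begin
  // Same shape as the quoted check: the hardware path is taken only
  // when the target surface is the primary one.
  if CanUseHardware and (Target = Primary) then
    Result := 'hardware'
  else
    Result := 'software';
end;

var
  Renderer: TMockRenderer;
  MainSurface, OffScreen: TMockSurface;
begin
  MainSurface := TMockSurface.Create('DXDraw.Surface');
  OffScreen := TMockSurface.Create('off-screen');
  Renderer := TMockRenderer.Create;
  try
    Renderer.Primary := MainSurface;
    Renderer.CanUseHardware := True;
    WriteLn(MainSurface.Name, ': ', Renderer.PathFor(MainSurface)); // hardware
    WriteLn(OffScreen.Name, ': ', Renderer.PathFor(OffScreen));     // software
  finally
    Renderer.Free;
    OffScreen.Free;
    MainSurface.Free;
  end;
end.
```

Even with hardware support switched on, the off-screen surface falls through to the software branch, which matches the slowdown I was seeing.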

I've found a solution, but it's a bit of a hack. I'm now running two scenes per frame. In the first, I run all the collision detection using the DXDraw's surface, but never flip anything to the main display. It's not exactly the ideal solution I was looking for, but it's much faster than software rendering to surfaces.
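
Roughly, the frame loop now has the shape sketched below. This is a stand-alone mock, not my actual code and not the unDelphiX API: TMockDisplay, Clear, Flip, DrawCollisionScene, DrawVisibleScene and RunFrame are all hypothetical names, and I'm assuming the second scene is the normal visible one that does get flipped.

```pascal
program TwoPassSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for the DXDraw component and its surface.
  TMockDisplay = class
  public
    FlipCount: Integer;   // how many times a frame was presented
    procedure Clear;
    procedure Flip;
  end;

procedure TMockDisplay.Clear;
begin
  // Real code would wipe the back buffer here.
end;

procedure TMockDisplay.Flip;
begin
  // Real code would present the back buffer; the mock only counts.
  Inc(FlipCount);
end;

procedure DrawCollisionScene(Display: TMockDisplay);
begin
  // Placeholder: draw the collision shapes to the accelerated surface
  // and read the results back.
end;

procedure DrawVisibleScene(Display: TMockDisplay);
begin
  // Placeholder: draw what the player should actually see.
end;

procedure RunFrame(Display: TMockDisplay);
begin
  // Scene 1: collision detection on the DXDraw surface, never flipped.
  Display.Clear;
  DrawCollisionScene(Display);

  // Scene 2 (assumed): the visible scene, drawn over scene 1 and flipped.
  Display.Clear;
  DrawVisibleScene(Display);
  Display.Flip;
end;

var
  Display: TMockDisplay;
begin
  Display := TMockDisplay.Create;
  try
    RunFrame(Display);
    WriteLn('Flips per frame: ', Display.FlipCount);
  finally
    Display.Free;
  end;
end.
```

The point is that both scenes go through the one surface that gets the hardware path, and only the second one ever reaches the screen.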