The key to doing this is to observe that the pixmap compiler which is driven by the put_sprite_scaled call (OS_SpriteOp 52) can do several clever things:
a) omit colour remapping
b) omit x and/or y size remapping
c) omit itself entirely and call put_sprite (OS_SpriteOp 34) instead (only when a) and b) are both omitted completely)
BUT it can only do these things if given some helpful hints.
In order to provoke action a), you need to tell it that the pixel translation table should not be used - this can be done in the 1-1 map situation by using code such as:
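REM Start from the identity map; ColourTrans then overwrites the entries it uses
FOR Q%=0 TO 255:pixtrans%?Q%=Q%:NEXT
SYS "ColourTrans_SelectTable",m,palptr%,-1,-1,pixtrans%
REM Keep the table only if some entry differs from the identity map
spx%=-1:FOR Q%=0 TO 255:IF pixtrans%?Q%<>Q% spx%=pixtrans%
NEXT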
and then using spx% as the pixel translation table (r7) value for op 52. When spx% is -1 the pixel translation step will be omitted and this at least doubles the speed of the sprite plot.
Action b) comes from correctly specified scale factors: reduce each multiplier/divisor pair to its lowest terms (1:1 rather than, say, 2:2) and it's happy.
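As a rough sketch of that reduction, assuming the usual four-word scale block passed in r6 for op 52 (x multiplier, y multiplier, x divisor, y divisor), something like the following could be used. PROCreduce, FNgcd and the pointer scale% are names made up for this illustration, not anything from the original code:
REM scale% is assumed to point at the 4-word scale block for op 52
DEF PROCreduce(scale%)
LOCAL g%
g%=FNgcd(scale%!0,scale%!8):REM x multiplier and x divisor
IF g%>1 scale%!0=scale%!0 DIV g%:scale%!8=scale%!8 DIV g%
g%=FNgcd(scale%!4,scale%!12):REM y multiplier and y divisor
IF g%>1 scale%!4=scale%!4 DIV g%:scale%!12=scale%!12 DIV g%
ENDPROC
REM Euclid's algorithm; returns 0 only if both arguments are 0
DEF FNgcd(a%,b%)
LOCAL t%
WHILE b%<>0:t%=a% MOD b%:a%=b%:b%=t%:ENDWHILE
=a%
Calling PROCreduce(scale%) on a block holding 2,2,2,2 should leave it holding 1,1,1,1, which is the form in which the unscaled case can be recognised.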
Then action c) happens "as if by magic" and your sprites are plotted faster when the wind is behind them. And in any mode when it isn't. Action c) can be sped up further by calling op 34 yourself once you have recognised that op 52 would end up doing so anyway - this avoids op 52 making several calls to read mode variables.
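A minimal sketch of that shortcut follows. It assumes that spx% has been set up as described earlier, that scale% points at the four-word scale block for op 52 (x multiplier, y multiplier, x divisor, y divisor), and that a missing translation table plus a 1:1 scale on both axes is exactly the case in which op 52 would fall through to op 34; that last point is an assumption to check, not a guarantee. area%, name$, x%, y%, action% and scale% are illustrative names:
REM area%, name$, x%, y%, action% and scale% are illustrative names
REM +256 means r1 is a user sprite area and r2 points to the sprite name
unscaled%=(scale%!0=scale%!8) AND (scale%!4=scale%!12)
IF spx%=-1 AND unscaled% THEN
SYS "OS_SpriteOp",34+256,area%,name$,x%,y%,action%
ELSE
SYS "OS_SpriteOp",52+256,area%,name$,x%,y%,action%,scale%,spx%
ENDIF
The test only needs doing once per sprite and mode change, so the result can be cached rather than recomputed on every plot.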