Yesterday I finally got MaskFills working:
It took me quite some time to figure out how to upload the mask tiles using the lock/getrasinfo/unlock functions, and I guess my code is full of wrong assumptions ;) ... but it's working quite well.
Performance with XAA (no acceleration) is most of the time on par with the existing implementation (there seem to be some performance bugs; sometimes it's fast and sometimes not), while performance on EXA is far better. That is the big benefit of this approach for accelerating drivers (EXA), where the old approach paid a large penalty because of the readbacks it used.
Although intermediate mask images are still used to render AA shapes and fills (instead of using XRender's AA capabilities), there's now no download from the X server, and the upload is down to almost 1/3. Furthermore, there is currently no EXA driver that is able to accelerate geometry (they use an approach similar to Java's to render AA traps).
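For anyone wondering what a mask tile actually does, here is a rough sketch in plain Java of the idea behind a MaskFill: a solid color is blended into the destination, weighted per pixel by an 8-bit coverage value from the tile. This is only an illustration and not the pipeline's code. The class and method names (`MaskFillSketch`, `maskFill`, `blend`) are made up, the destination is assumed to be an opaque RGB `int[]`, and the real thing does this blending on the X server via XRender rather than in a Java loop.

```java
public class MaskFillSketch {
    /** Blends one 8-bit channel: src over dst with coverage a in [0, 255]. */
    static int blend(int src, int dst, int a) {
        return (src * a + dst * (255 - a) + 127) / 255;
    }

    /**
     * Fills a w x h tile at (x, y) of an opaque RGB destination with a solid
     * color, weighting each pixel by the matching 8-bit coverage value.
     * Hypothetical helper for illustration only.
     */
    static void maskFill(int[] dst, int dstStride, int x, int y, int w, int h,
                         int rgb, byte[] mask, int maskOff, int maskStride) {
        int sr = (rgb >> 16) & 0xff, sg = (rgb >> 8) & 0xff, sb = rgb & 0xff;
        for (int row = 0; row < h; row++) {
            for (int col = 0; col < w; col++) {
                // Coverage is stored as an unsigned byte: 0 = untouched, 255 = fully covered.
                int a = mask[maskOff + row * maskStride + col] & 0xff;
                int i = (y + row) * dstStride + (x + col);
                int d = dst[i];
                int r = blend(sr, (d >> 16) & 0xff, a);
                int g = blend(sg, (d >> 8) & 0xff, a);
                int b = blend(sb, d & 0xff, a);
                dst[i] = 0xff000000 | (r << 16) | (g << 8) | b;
            }
        }
    }

    public static void main(String[] args) {
        int[] dst = new int[4 * 4];
        java.util.Arrays.fill(dst, 0xff000000);          // opaque black 4x4 surface
        byte[] mask = { 0, (byte) 128, (byte) 255, 0 };  // 2x2 coverage tile
        maskFill(dst, 4, 1, 1, 2, 2, 0xffffff, mask, 0, 2);
        System.out.printf("%08x %08x %08x %08x%n", dst[5], dst[6], dst[9], dst[10]);
    }
}
```

The point of the sketch is that only the small coverage tile has to travel to wherever the blending happens. Nothing needs to be read back from the destination on the client side, which is where the old approach lost its time on EXA.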
Today I'll play with LCD text again and have a look at MaskBlit.