Today I profiled fillRect a bit:
Protocol generation: 120 cycles (40%). Protocol generation means writing the X11 protocol data out via sun.misc.Unsafe.
Pipeline overhead : 90 cycles (30%)
Locking/synchronization: 90 cycles (30%)
Total: 300 cycles with the server compiler (480 cycles with the client compiler; ~20,000 cycles interpreter-only)
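To give an idea of what "protocol generation" looks like, here is a rough sketch of encoding a single RENDER FillRectangles request. It is not the pipeline's actual code: I use a java.nio.ByteBuffer instead of sun.misc.Unsafe to keep it portable, the class and method names are made up, and RENDER_MAJOR_OPCODE is a placeholder (the real value is assigned by the X server and has to be queried per connection).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class FillRectProtocolSketch {
    // Placeholder (assumption): the real major opcode of the RENDER extension
    // is handed out by the X server when the extension is queried.
    static final int RENDER_MAJOR_OPCODE = 0;
    // Minor opcode of FillRectangles in the RENDER protocol.
    static final int RENDER_FILL_RECTANGLES = 26;
    // PictOpSrc compositing operator.
    static final int PICT_OP_SRC = 1;

    /**
     * Appends one FillRectangles request with a single rectangle to buf.
     * rgba holds four 16-bit color components (red, green, blue, alpha).
     * Returns the number of bytes written.
     */
    static int encodeFillRect(ByteBuffer buf, int picture, int[] rgba,
                              int x, int y, int width, int height) {
        int start = buf.position();
        buf.put((byte) RENDER_MAJOR_OPCODE);
        buf.put((byte) RENDER_FILL_RECTANGLES);
        buf.putShort((short) 7);        // request length in 4-byte units: (20 + 8) / 4
        buf.put((byte) PICT_OP_SRC);
        buf.put((byte) 0);              // padding
        buf.putShort((short) 0);        // padding
        buf.putInt(picture);            // destination picture id
        for (int i = 0; i < 4; i++) {
            buf.putShort((short) rgba[i]);
        }
        buf.putShort((short) x);
        buf.putShort((short) y);
        buf.putShort((short) width);
        buf.putShort((short) height);
        return buf.position() - start;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(4096).order(ByteOrder.nativeOrder());
        int written = encodeFillRect(buf, 1, new int[] {0xFFFF, 0, 0, 0xFFFF}, 10, 20, 100, 50);
        System.out.println("bytes written: " + written);
    }
}
```

Even in this simplified form one rectangle means a dozen small stores plus bounds checks, which is roughly where the cycles in this bucket go.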
Pipeline overhead is all the work done to validate pipeline/surface state and decide which code path to use for the current operation, as well as all the abstraction layers between Graphics2D and our XRender surface.
Locking means acquiring/releasing a ReentrantLock, which guards AWT access.
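For illustration, the locking pattern looks roughly like the sketch below. AWT_LOCK and the fillRect methods are hypothetical stand-ins, not the real AWT code, and the timing loop is only a crude comparison in nanoseconds (not cycles), so treat whatever it prints as a ballpark figure at best.

```java
import java.util.concurrent.locks.ReentrantLock;

final class AwtLockSketch {
    // Hypothetical stand-in for the lock that guards AWT access.
    private static final ReentrantLock AWT_LOCK = new ReentrantLock();
    private static long rectsIssued = 0;

    // Stand-in for a fillRect call that takes the lock around its work.
    static void fillRectLocked() {
        AWT_LOCK.lock();
        try {
            rectsIssued++;
        } finally {
            AWT_LOCK.unlock();
        }
    }

    // Same work without the lock, for comparison only (not thread-safe).
    static void fillRectUnlocked() {
        rectsIssued++;
    }

    static long rectsIssued() {
        return rectsIssued;
    }

    static boolean isLocked() {
        return AWT_LOCK.isLocked();
    }

    // Average wall-clock time per call in nanoseconds.
    static double nanosPerCall(Runnable op, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            op.run();
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        // Warm up so the JIT has compiled both paths before measuring.
        nanosPerCall(AwtLockSketch::fillRectLocked, n);
        nanosPerCall(AwtLockSketch::fillRectUnlocked, n);
        System.out.printf("locked:   %.2f ns/call%n", nanosPerCall(AwtLockSketch::fillRectLocked, n));
        System.out.printf("unlocked: %.2f ns/call%n", nanosPerCall(AwtLockSketch::fillRectUnlocked, n));
    }
}
```

The lock is uncontended here, so the difference between the two numbers is the bare cost of the acquire/release pair.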
- 300 cycles is not that great; however, we are probably generating rectangles faster than the X server can process them :). With specific optimizations as well as biased locking, I guess 175 cycles is realistic, which is not that bad.
- The server compiler does pretty well; hopefully tiered compilation will be implemented for JDK7. In this case the client compiler produces 60% slower code :-/
- Locking is expensive, especially on older multi-core processors (like my Core2Duo). Biased locking could really help here; unfortunately it has a limitation which makes it hard to use for the pipeline.
Furthermore, it seems some optimizations don't have any effect when locking is done, but show e.g. a 10-cycle improvement when no locking is done.
- The pipeline overhead could be lower, but it's not bad.
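As a sanity check on the numbers above, the percentages can be recomputed with a few lines of Java. The class and method names are just for illustration; the cycle counts are the ones measured above, and 175 is my guessed target, not a measurement.

```java
final class FillRectBudget {
    // Share of one cost bucket in the total, in percent.
    static double sharePercent(int part, int total) {
        return 100.0 * part / total;
    }

    // How much slower (in percent) the slower variant is than the faster one.
    static double slowdownPercent(int slower, int faster) {
        return 100.0 * (slower - faster) / faster;
    }

    public static void main(String[] args) {
        int protocol = 120, pipeline = 90, locking = 90;
        int total = protocol + pipeline + locking;   // server compiler
        int client = 480, interpreter = 20_000, target = 175;

        System.out.println("total cycles: " + total);
        System.out.println("protocol share %: " + sharePercent(protocol, total));
        System.out.println("pipeline share %: " + sharePercent(pipeline, total));
        System.out.println("locking share %: " + sharePercent(locking, total));
        System.out.println("client vs server slowdown %: " + slowdownPercent(client, total));
        System.out.println("interpreter vs server factor: " + (double) interpreter / total);
        System.out.println("cycles to shave off for target: " + (total - target));
    }
}
```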