Currently host_amd64_defs.* list RAX, RCX and RDX as not available to the register allocator. I think this is a mistake and they should be made available: the more Int64 registers the register allocator can use, the better the final produced code. Running the Valgrind regression test suite shows that %rcx and %rdx can be enabled with no problems. For %rax there is some unimplemented functionality (search for FIXME in host_amd64_defs.c).
Created attachment 107840 [details] patch
Unfortunately performance measurements do not confirm this as a good idea.

Measuring Memcheck on perf/bz2, instruction count:
v3 baseline: 45,110 M total; 168 M register allocator
v3 patched:  45,123 M total; 176 M register allocator
v2 baseline: 45,190 M total; 209 M register allocator
v2 patched:  45,266 M total; 220 M register allocator

Measuring Memcheck on tinycc, instruction count:
v3 baseline: 4,155,471 k total; 86,438 k register allocator
v3 patched:  4,155,207 k total; 91,193 k register allocator

I'd conclude that although two more registers are available to the register allocator, it does not help, because:
- they are caller-saved registers
- the register allocator needs to iterate over more registers
Yes, that sounds pretty similar to what I found too. Maybe we should re-evaluate this when the control-flow-diamond stuff comes to life, because it might allow longer sequences of instructions between helper calls, and hence more demand for registers.
Even after bug 384987 has been fixed, the situation did not improve much.

Running Memcheck on perf/bz2 [number of instructions]:
baseline: total: 44,936,961,809; reg alloc: 168,848,160; ratio: 15.3
patched:  total: 44,933,693,157; reg alloc: 177,659,739; ratio: 15.3

Running Memcheck on perf/tinycc:
baseline: total: 3,938,404,232; reg alloc: 121,447,931; ratio: 16.2
patched:  total: 3,939,903,162; reg alloc: 127,930,047; ratio: 16.2

With perf/bz2, there is only a very minor improvement. With perf/tinycc there is actually a very minor deterioration. I am abandoning this bug for good.