Bug 384676 - VEX AMD64 backend should list more real registers as available for the register allocator
Status: RESOLVED INTENTIONAL
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.14 SVN
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Ivo Raisr
Reported: 2017-09-13 20:41 UTC by Ivo Raisr
Modified: 2017-10-19 18:20 UTC


Attachments
patch (6.77 KB, patch)
2017-09-13 21:06 UTC, Ivo Raisr

Description Ivo Raisr 2017-09-13 20:41:38 UTC
Currently, host_amd64_defs.* lists RAX, RCX and RDX as not available to the register allocator. I think this is a mistake and they should be made available.

The more Int64 registers are available to the register allocator, the better the final generated code.

Running the Valgrind regression test suite shows that %rcx and %rdx can be enabled with no problems. For %rax, some functionality is still unimplemented (search for FIXME in host_amd64_defs.c).
Comment 1 Ivo Raisr 2017-09-13 21:06:59 UTC
Created attachment 107840
patch
Comment 2 Ivo Raisr 2017-09-14 11:26:47 UTC
Unfortunately, performance measurements do not confirm that this is a good idea.

Measuring Memcheck on perf/bz2, instruction count:
v3 baseline: 45,110 M total; 168 M register allocator
v3 patched:  45,123 M total; 176 M register allocator
v2 baseline: 45,190 M total; 209 M register allocator
v2 patched:  45,266 M total; 220 M register allocator

Measuring Memcheck on tinycc, instruction count:
v3 baseline: 4,155,471 k total; 86,438 k register allocator
v3 patched:  4,155,207 k total; 91,193 k register allocator

I'd conclude that although two more registers are available to the register allocator, this does not help because:
- they are caller-saved registers
- the register allocator needs to iterate over more registers
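
A toy model of the two points above, as a sketch only. It is not VEX code: the RegisterSet type, both helper functions and the register counts are hypothetical and chosen purely for illustration. The assumption is that a value live across a helper call can only stay in a callee-saved register, and that the allocator looks at every allocatable register once per decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterSet:
    """Hypothetical description of the registers offered to an allocator."""
    callee_saved: int  # preserved across a helper call
    caller_saved: int  # clobbered by a helper call

    @property
    def total(self) -> int:
        return self.callee_saved + self.caller_saved


def kept_across_call(regs: RegisterSet, live_values: int) -> int:
    """Values that can stay in a register across a helper call.

    Assumption: only callee-saved registers survive the call, so extra
    caller-saved registers do not raise this number.
    """
    return min(live_values, regs.callee_saved)


def scan_steps(regs: RegisterSet, decisions: int) -> int:
    """Registers examined in total, assuming one full scan per decision."""
    return regs.total * decisions


# Hypothetical counts, not the real AMD64 backend configuration.
baseline = RegisterSet(callee_saved=5, caller_saved=5)
patched = RegisterSet(callee_saved=5, caller_saved=7)  # two more caller-saved

for name, regs in (("baseline", baseline), ("patched", patched)):
    print(name,
          "kept across call:", kept_across_call(regs, live_values=8),
          "scan steps:", scan_steps(regs, decisions=100))
```

In this model the patched set keeps no more values alive across a call than the baseline, while every allocation decision scans two more registers.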
Comment 3 Julian Seward 2017-09-14 13:11:04 UTC
Yes, that sounds pretty similar to what I found too.

We should perhaps re-evaluate this when the control-flow-diamond
stuff comes to life, because that might allow us to have longer
sequences of instructions between helper calls, and so there
would be more demand for registers.
Comment 4 Ivo Raisr 2017-10-19 18:20:57 UTC
Even after bug 384987 was fixed, the situation has not improved much.

Running Memcheck on perf/bz2 [number of instructions]:
baseline: total: 44,936,961,809; reg alloc: 168,848,160; ratio: 15.3
patched:  total: 44,933,693,157; reg alloc: 177,659,739; ratio: 15.3

Running Memcheck on perf/tinycc:
baseline: total: 3,938,404,232; reg alloc: 121,447,931; ratio: 16.2
patched:  total: 3,939,903,162; reg alloc: 127,930,047; ratio: 16.2
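
For reference, the relative changes can be worked out from the instruction counts listed in this comment. This is a small sketch: the relative_change helper and the counts dictionary are names introduced here, and only the numbers come from the measurements.

```python
def relative_change(baseline: int, patched: int) -> float:
    """Change from baseline to patched, in percent of the baseline."""
    return (patched - baseline) / baseline * 100.0


# (baseline, patched) instruction counts copied from the measurements.
counts = {
    "perf/bz2": {
        "total": (44_936_961_809, 44_933_693_157),
        "reg alloc": (168_848_160, 177_659_739),
    },
    "perf/tinycc": {
        "total": (3_938_404_232, 3_939_903_162),
        "reg alloc": (121_447_931, 127_930_047),
    },
}

for bench, metrics in counts.items():
    for name, (base, new) in metrics.items():
        print(f"{bench:12s} {name:10s} {relative_change(base, new):+.3f}%")
```

The total instruction count moves by well under 0.1% in both benchmarks (down for perf/bz2, up for perf/tinycc), while the register allocator's own count grows by roughly 5% in each.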

With perf/bz2, the total instruction count improves only very slightly. With perf/tinycc, it actually gets very slightly worse.

I am abandoning this bug for good.