| Summary: | VEX AMD64 backend should list more real registers as available for the register allocator | ||
|---|---|---|---|
| Product: | [Developer tools] valgrind | Reporter: | Ivo Raisr <ivosh> |
| Component: | vex | Assignee: | Ivo Raisr <ivosh> |
| Status: | RESOLVED INTENTIONAL | ||
| Severity: | normal | CC: | ivosh, jseward |
| Priority: | NOR | ||
| Version First Reported In: | 3.14 SVN | ||
| Target Milestone: | --- | ||
| Platform: | Compiled Sources | ||
| OS: | Linux | ||
| See Also: | https://bugs.kde.org/show_bug.cgi?id=384584 | ||
| Attachments: | patch | ||
Description
Ivo Raisr
2017-09-13 20:41:38 UTC
Created attachment 107840 [details]
patch

Unfortunately performance measurements do not confirm this as a good idea.

Measuring Memcheck on perf/bz2, instruction count:

- v3 baseline: 45,110 M total; 168 M register allocator
- v3 patched: 45,123 M total; 176 M register allocator
- v2 baseline: 45,190 M total; 209 M register allocator
- v2 patched: 45,266 M total; 220 M register allocator

Measuring Memcheck on tinycc, instruction count:

- v3 baseline: 4,155,471 k total; 86,438 k register allocator
- v3 patched: 4,155,207 k total; 91,193 k register allocator

I'd conclude that although two more registers are available to the register allocator, it does not help because:

- they are caller saved registers
- register allocator needs to iterate over more registers

Yes, that sounds pretty similar to what I found too. We maybe should re-evaluate this when the control-flow-diamond stuff comes to life, because it might be the case that that allows us to have longer sequences of instructions between helper calls and so there is more demand for registers.

Even after bug 384987 has been fixed, the situation did not improve much.

Running Memcheck on perf/bz2 [number of instructions]:

- baseline: total: 44,936,961,809; reg alloc: 168,848,160; ratio: 15.3
- patched: total: 44,933,693,157; reg alloc: 177,659,739; ratio: 15.3

Running Memcheck on perf/tinycc:

- baseline: total: 3,938,404,232; reg alloc: 121,447,931; ratio: 16.2
- patched: total: 3,939,903,162; reg alloc: 127,930,047; ratio: 16.2

With perf/bz2, there is only a very minor improvement. With perf/tinycc there is actually a very minor deterioration.

I am abandoning this bug for good.
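For readers who want to check the "very minor" changes against the raw figures, the relative differences can be recomputed with a short script. This is only a sketch: the instruction counts are copied from the final set of measurements in this report, while the names `RUNS`, `percent_change` and `summarize` are made up for illustration and are not part of Valgrind or its perf tooling. The reported "ratio" column is left out because the report does not say how it is derived.

```python
# Instruction counts copied from the last measurements in this report
# (Memcheck runs after bug 384987 was fixed). The structure and the
# helper names below are hypothetical, chosen only for this sketch.
RUNS = {
    "perf/bz2": {
        "baseline": {"total": 44_936_961_809, "reg_alloc": 168_848_160},
        "patched": {"total": 44_933_693_157, "reg_alloc": 177_659_739},
    },
    "perf/tinycc": {
        "baseline": {"total": 3_938_404_232, "reg_alloc": 121_447_931},
        "patched": {"total": 3_939_903_162, "reg_alloc": 127_930_047},
    },
}


def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def summarize(runs):
    """Return {benchmark: {metric: percent change baseline -> patched}}."""
    return {
        name: {
            metric: percent_change(r["baseline"][metric], r["patched"][metric])
            for metric in ("total", "reg_alloc")
        }
        for name, r in runs.items()
    }


if __name__ == "__main__":
    for name, changes in summarize(RUNS).items():
        print(f"{name}: total {changes['total']:+.4f}%, "
              f"reg alloc {changes['reg_alloc']:+.2f}%")
```

A negative percentage for `total` means the patched build executed fewer instructions overall; a positive percentage for `reg_alloc` means the register allocator itself did more work with the extra registers listed.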