405782 – "VEX temporary storage exhausted" when attempting to debug slic3r-pe

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	memcheck
Version:	3.14.0
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:

Reported:	2019-03-23 14:10 UTC by wavexx
Modified:	2019-04-01 12:49 UTC
CC List:	2 users

See Also:
Latest Commit:
Version Fixed In:

Attachments
valgrind trace (2.12 MB, application/gzip) 2019-03-25 19:05 UTC, wavexx
valgrind trace (current master) (2.05 MB, application/gzip) 2019-03-30 15:02 UTC, wavexx

Description wavexx 2019-03-23 14:10:05 UTC

SUMMARY

When attempting to run memcheck (valgrind 3.14.0 on debian unstable) on slic3r-pe (https://github.com/prusa3d/Slic3r/) freshly compiled with gcc 8.3 I get the following:

> $ valgrind ./src/slic3r-gui
> ==19253== Memcheck, a memory error detector
> ==19253== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==19253== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
> ==19253== Command: ./src/slic3r-gui
> ==19253==
> VEX temporary storage exhausted.
> Pool = TEMP,  start 0x59640548 curr 0x59b04c90 end 0x59b05087 (size 5000000)
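The pool line in the message above can be read as simple address arithmetic. The following is a minimal sketch, not part of valgrind: the regular expression and the `pool_usage` helper are hypothetical, and it assumes that `end` is the address of the last valid byte of the pool (which matches the numbers, since `end - start + 1` equals the reported size).

```python
import re

# Hypothetical pattern for the pool status line printed with the error.
POOL_LINE = re.compile(
    r"Pool = (?P<name>\w+),\s+start (?P<start>0x[0-9a-fA-F]+) "
    r"curr (?P<curr>0x[0-9a-fA-F]+) end (?P<end>0x[0-9a-fA-F]+) "
    r"\(size (?P<size>\d+)\)"
)


def pool_usage(line):
    """Return how many bytes of the pool are used and how many are free."""
    m = POOL_LINE.search(line)
    if m is None:
        raise ValueError("not a pool status line")
    start = int(m["start"], 16)
    curr = int(m["curr"], 16)
    end = int(m["end"], 16)
    return {
        "name": m["name"],
        "size": int(m["size"]),
        "used": curr - start,
        # Assumption: 'end' is the last valid byte, so free space is inclusive.
        "free": end - curr + 1,
    }


line = ("Pool = TEMP,  start 0x59640548 curr 0x59b04c90 "
        "end 0x59b05087 (size 5000000)")
usage = pool_usage(line)
print(usage)
```

Under that assumption, 4998984 of the 5000000 bytes were in use and only 1016 bytes were left when the allocation failed.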

At the suggestion of Philippe Waroquiers, I bumped the following buffers:

- N_TEMPORARY_BYTES 10x
- N_PERMANENT_BYTES 10x
- N_TMPBUF (and related hard-coded size assertions!) 10x

After that, memcheck runs correctly on the executable.
All three buffers had to be increased in order to run memcheck.

Please see the discussion thread on valgrind-users for further details:

  https://sourceforge.net/p/valgrind/mailman/message/36617210/

Slic3r is a large program depending on wx 3.1 (also self-compiled). It's not a program I wrote; I just encountered the problem.

SOFTWARE/OS VERSIONS

Linux 4.19 (debian unstable) on amd64.

Comment 1 Philippe Waroquiers 2019-03-24 06:51:00 UTC

Thanks for the bug.

Could you attach the VEX debug trace obtained by following the steps below?
Thanks

-------------------------------------------------------------------
Use the unpatched valgrind (so as to reproduce the problem/crash).
Run it a first time:
  valgrind --trace-flags=11111111 <yourprogram>

This will output a bunch of lines such as:
...
==== SB 1789 (evchecks 8650) [tid 1] 0x4f833a7 free_mem+231 UNKNOWN_OBJECT+0x0
==== SB 1790 (evchecks 8651) [tid 1] 0x4f832ae free_slotinfo+110 UNKNOWN_OBJECT+0x0
...

Then rerun with
valgrind --trace-flags=11111111 --trace-notbelow=XXXXX <yourprogram>
where XXXXX is one or two numbers before the SB that causes the crash.
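Picking XXXXX by hand means scrolling to the end of a large log. A minimal sketch of automating that step follows; the helper names are hypothetical, and it assumes the crashing SB is the last `==== SB` header printed before the error message. The sample lines are the ones quoted above, with the error line appended for illustration.

```python
import re

# Matches the superblock header lines shown in the trace output.
SB_HEADER = re.compile(r"^==== SB (\d+) \(evchecks (\d+)\)")


def last_sb_number(trace_lines):
    """Return the number of the last superblock header seen, or None."""
    last = None
    for line in trace_lines:
        m = SB_HEADER.match(line)
        if m:
            last = int(m.group(1))
    return last


def notbelow_for(last_sb, margin=2):
    """Start tracing 'margin' superblocks before the last one seen."""
    return max(0, last_sb - margin)


sample = [
    "==== SB 1789 (evchecks 8650) [tid 1] 0x4f833a7 free_mem+231 UNKNOWN_OBJECT+0x0",
    "==== SB 1790 (evchecks 8651) [tid 1] 0x4f832ae free_slotinfo+110 UNKNOWN_OBJECT+0x0",
    "VEX temporary storage exhausted.",  # assumed position of the crash message
]
last = last_sb_number(sample)
print(f"--trace-notbelow={notbelow_for(last)}")
```

In practice the lines would come from the captured output of the first run rather than from a hard-coded list.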

Comment 2 wavexx 2019-03-25 19:05:08 UTC

Created attachment 119032
valgrind trace

Comment 3 wavexx 2019-03-25 19:05:29 UTC

Done, sorry for the delay.

Comment 4 Philippe Waroquiers 2019-03-30 10:58:51 UTC

I have taken a quick look at the trace, and indeed,
the generated code is huge.
The code looks related to xmm/ymm registers and instructions.
In 3.15, Julian has made a bunch of improvements for the code
generation in this area.
See e.g. 
 git log 3af8e12b0d49dc87cd26258131ebd60c9b587c74..3b2f8bf69ea11f13357468d28cebc88d41be9199

Could you try compiling the latest git version and see if it works better?

Thanks

Philippe

Comment 5 wavexx 2019-03-30 11:55:07 UTC

Indeed, the current master can run it through without any tweak.
Is there anything you want me to try?

Comment 6 Philippe Waroquiers 2019-03-30 12:09:51 UTC

(In reply to wavexx from comment #5)
> Indeed, the current master can run it through without any tweak.
That is good news.
> Is there anything you want me to try?
I think the problem should be properly solved.

But to get a better idea of how much this was improved,
if you are feeling courageous, it would be nice to redo the tracing with master
for the block that caused the crash, so that we can evaluate the code improvement.

As the new version might not use exactly the same SB number as 3.14,
you should find the line that looks like:
==== SB 97263 (evchecks 68367534) [tid 1] 0x541a124 (anonymous namespace)::wxPNGImageData::DoLoadPNGFile(wxImage*, (anonymous namespace)::wxPNGInfoStruct&) [clone .constprop.45]+2228 /usr/local/stow/wxWidgets-3.1.2/lib/libwx_gtk3u_core-3.1.so.2.0.0+0x519124

and then do the trace with --trace-notbelow=XXXXX   --trace-notabove=YYYYY
and use XXXXX and YYYYY to have 1 or 2 SB before/after the [clone .constprop.45]+2228
address giving the problem.
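Since the SB number can differ between versions, the header has to be located by its symbol rather than by its number. A minimal sketch under that assumption follows; `trace_window` is a hypothetical helper, and the sample header is the line quoted above.

```python
import re

SB_HEADER = re.compile(r"^==== SB (\d+) ")


def trace_window(trace_lines, needle, margin=2):
    """Find the first SB header containing 'needle' and return the
    (notbelow, notabove) pair surrounding it, or None if not found."""
    for line in trace_lines:
        m = SB_HEADER.match(line)
        if m and needle in line:
            sb = int(m.group(1))
            return max(0, sb - margin), sb + margin
    return None


header = (
    "==== SB 97263 (evchecks 68367534) [tid 1] 0x541a124 "
    "(anonymous namespace)::wxPNGImageData::DoLoadPNGFile(wxImage*, "
    "(anonymous namespace)::wxPNGInfoStruct&) [clone .constprop.45]+2228 "
    "/usr/local/stow/wxWidgets-3.1.2/lib/libwx_gtk3u_core-3.1.so.2.0.0+0x519124"
)
window = trace_window([header], "[clone .constprop.45]+2228")
if window:
    print(f"--trace-notbelow={window[0]} --trace-notabove={window[1]}")
```

The same function may be called more than once per run, so the first match is not necessarily the only one; checking the address in the header as well would narrow it down.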

Thanks
Philippe

Comment 7 wavexx 2019-03-30 15:02:56 UTC

Created attachment 119159
valgrind trace (current master)

Comment 8 wavexx 2019-03-30 15:04:35 UTC

I truncated the log manually after the "notabove" part to avoid sending megs of useless traces.

Comment 9 Philippe Waroquiers 2019-03-30 17:49:52 UTC

(In reply to wavexx from comment #7)
> Created attachment 119159
> valgrind trace (current master)

Thanks for the quick reply.

Looking at the difference, the number of front-end temporaries has been divided by 3
(from 1854 to 594).
After instrumentation, divided by >4:
6503 -> 1495
and later on, the generated code is also much smaller.
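For reference, the reduction factors quoted above work out as follows. This is only arithmetic on the two pairs of numbers given in this comment; the dictionary labels are descriptive names chosen here, not VEX terminology.

```python
# (before, after) counts taken from the two traces compared above.
counts = {
    "front end": (1854, 594),
    "after instrumentation": (6503, 1495),
}

# Reduction factor for each stage.
ratios = {stage: before / after for stage, (before, after) in counts.items()}

for stage, ratio in ratios.items():
    print(f"{stage}: divided by {ratio:.2f}")
```

Both factors agree with the rounded figures above: a little over 3 for the front end and a little over 4 after instrumentation.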

So, Julian did a very good job :).

Comment 10 wavexx 2019-03-30 18:52:42 UTC

Do you still think the buffer sizes should be hard-coded though?

I know you can recompile and all, and theoretically this should never happen, but I do expect debugging tools to never fail on crappy input ;)

Comment 11 Philippe Waroquiers 2019-03-31 10:22:48 UTC

(In reply to wavexx from comment #10)
> Do you still think the buffer sizes should be hard-coded though?
> 
> I know you can recompile and all, and theoretically this should never
> happen, but I do expect debugging tools to never fail on crappy input ;)

There are advantages and disadvantages to the current approach:
As I understand it, in terms of software layers, the VEX lib does not have any
dependencies on the valgrind memory management layer/address space manager.
Having memory sized at startup would break this.
Also, when these maximums are exceeded, this is really an (efficiency) bug.
Sizing memory at startup might also slightly impact the performance.

But for sure, it is not nice to have valgrind crashing on valid programs.

Comment 12 Julian Seward 2019-04-01 12:49:28 UTC

Sorry to be slow getting to this, and thanks to Philippe for chasing it.

Yes .. it looks like the problem was caused by a very verbose translation
for the VPSHUFB instruction, applied to YMM registers.  As Philippe says,
that's something I fixed a few months back.

> Do you still think the buffer sizes should be hard-coded though?

A good question.  The VEX compilation pipeline is "protected" by
the fact that it will only include up to 50 instructions (with default
settings) into a superblock.  So even an infinitely long input basic
block will not cause infinite memory use in the JIT, since it will be
compiled in 50-instruction sections.  It's just unfortunate that the
translation of VPSHUFB in this case was so bad that the JIT overran the
fixed working space.
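The bounding argument can be illustrated with a toy model. This is a sketch only, not how VEX is implemented: real superblocks are also delimited by control flow, and the per-instruction byte costs below are invented for illustration. The 50-instruction limit and the 5000000-byte pool size are the figures given in this report.

```python
def split_into_superblocks(n_insns, max_insns=50):
    """Yield the sizes of consecutive chunks of at most max_insns instructions."""
    while n_insns > 0:
        chunk = min(n_insns, max_insns)
        yield chunk
        n_insns -= chunk


def peak_working_bytes(n_insns, bytes_per_insn, max_insns=50):
    """Peak working space if each chunk is compiled and then freed.

    bytes_per_insn is a hypothetical average cost per guest instruction.
    """
    sizes = split_into_superblocks(n_insns, max_insns)
    return max((size * bytes_per_insn for size in sizes), default=0)


POOL_SIZE = 5_000_000  # size of the TEMP pool from the error message

# The peak does not grow with the length of the input block...
print(peak_working_bytes(10_000, 1_000))
print(peak_working_bytes(10_000_000, 1_000))

# ...but a sufficiently verbose translation per instruction still overruns it.
print(peak_working_bytes(10_000, 200_000) > POOL_SIZE)
```

In this model the fixed pool is safe as long as the average cost per instruction stays below the pool size divided by 50, which is the bound that the verbose VPSHUFB translation broke.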