Bug 390893 - Crash while profiling my app
Status: RESOLVED UPSTREAM
Alias: None
Product: Heaptrack
Classification: Applications
Component: general
Version: unspecified
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Milian Wolff
Reported: 2018-02-22 09:27 UTC by jeremy.coulon.jrmc
Modified: 2018-02-23 08:50 UTC



Attachments
thread apply all bt (57.77 KB, text/plain)
2018-02-22 16:59 UTC, jeremy.coulon.jrmc

Description jeremy.coulon.jrmc 2018-02-22 09:27:09 UTC
Hello!

I have a complex (proprietary) and long running application that I am trying to profile with heaptrack. I am afraid I can't make a simpler open source example to reproduce the crash.

I often get the following crash after a few minutes of profiling. It is triggered from random places in my application, so I don't think the bug is on my side. I am profiling my app from the start (no live attaching).

I am running Ubuntu 16.04 LTS.

I compiled almost everything from source.

I tried different combinations of:
- heaptrack v1.0 tag
- heaptrack 1.0 branch
- heaptrack master branch
- libunwind debian/1.1-4 tag
- libunwind v1.2.1 tag
- libunwind v1.2-stable branch
- libunwind v1.3-stable branch (this one fails to compile for me)

With libunwind v1.2.1 and heaptrack master, all unit tests from both libunwind and heaptrack pass.

<SIGABRT>
access_mem (/data/homes/jcoulon/git/libunwind/src/x86_64/Ginit.c:175) [/data/homes/jcoulon/lib/libunwind.so.8:0x3084]
_ULx86_64_step (/data/homes/jcoulon/git/libunwind/src/x86_64/Gstep.c:176) [/data/homes/jcoulon/lib/libunwind.so.8:0x3ebe]
trace_init_addr (/data/homes/jcoulon/git/libunwind/src/x86_64/Gtrace.c:248) [/data/homes/jcoulon/lib/libunwind.so.8:0x4937]
unw_backtrace (/data/homes/jcoulon/git/libunwind/src/mi/backtrace.c:69) [/data/homes/jcoulon/lib/libunwind.so.8:0x26a2]
Trace::fill(int) (/data/homes/jcoulon/git/heaptrack/src/track/trace.h:64) [/data/homes/jcoulon/lib/heaptrack/libheaptrack_preload.so:0x6b8c]
malloc (/data/homes/jcoulon/git/heaptrack/src/track/heaptrack_preload.cpp:178) [/data/homes/jcoulon/lib/heaptrack/libheaptrack_preload.so:0x3568]
operator new(unsigned long) (??:-1) [/usr/lib/x86_64-linux-gnu/libstdc++.so.6:0x8de78]

Please tell me if I can give you more debug information about this crash.

Thank you for your help.
Comment 1 Milian Wolff 2018-02-22 11:48:56 UTC
Try the following:

heaptrack -d <yourapp>

This should start your app in GDB, preload heaptrack, and run it as usual. Once you get the crash, run `thread apply all bt` and paste the output here. Sanitize it first if you need to hide information about your application. Then I can check whether I can spot anything in it.

But without a way for me to reproduce it, I'm afraid there isn't a high chance of getting this fixed. If it only happens with your proprietary application, you will have to dig in and try to fix it yourself if the backtrace isn't enough...
Comment 2 jeremy.coulon.jrmc 2018-02-22 16:59:51 UTC
Created attachment 110902
thread apply all bt
Comment 3 jeremy.coulon.jrmc 2018-02-22 17:02:12 UTC
I attached the result of 'thread apply all bt'.
I have 70 threads and multiple threads are doing malloc/free at the same time.
The problematic thread is thread 56.
Comment 4 jeremy.coulon.jrmc 2018-02-22 17:05:34 UTC
GDB was run with:
* heaptrack version v1.0.0-119-g6e31841
* libunwind version v1.2-3-gac02808
Comment 5 Milian Wolff 2018-02-22 20:22:56 UTC
A Java application - interesting, I have never used heaptrack on one. The crash itself happens within libunwind:

Thread 56 (Thread 0x7ffeff661700 (LWP 5926)):
#0  0x00007ffff71d9428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff71db02a in __GI_abort () at abort.c:89
#2  0x00007ffff55eeed5 in os::abort(bool) () from /path/to/mysdk/build/java/1.8.152-0/amd64-linux/lib/amd64/server/libjvm.so
#3  0x00007ffff5792a33 in VMError::report_and_die() () from /path/to/mysdk/build/java/1.8.152-0/amd64-linux/lib/amd64/server/libjvm.so
#4  0x00007ffff55f4def in JVM_handle_linux_signal () from /path/to/mysdk/build/java/1.8.152-0/amd64-linux/lib/amd64/server/libjvm.so
#5  0x00007ffff55eaea3 in signalHandler(int, siginfo*, void*) () from /path/to/mysdk/build/java/1.8.152-0/amd64-linux/lib/amd64/server/libjvm.so
#6  0x0000xxxxxxxxxxxx in handleCustom (sc_=<optimized out>, si=<optimized out>, code=11, handlerCode=11) at ??
#7  mySignalHandler (code=11, si=<optimized out>, sc_=<optimized out>) at ??
#8  <signal handler called>
#9  access_mem (as=<optimized out>, addr=29930553589, val=0x7ffeff65eb90, write=<optimized out>, arg=<optimized out>) at x86_64/Ginit.c:175
#10 0x00007ffff6f8829c in dwarf_get (c=0x7ffeff65f090, c=0x7ffeff65f090, val=0x7ffeff65eb90, loc=...) at ../include/tdep-x86_64/libunwind_i.h:167
#11 _ULx86_64_step (cursor=cursor@entry=0x7ffeff65f090) at x86_64/Gstep.c:166
#12 0x00007ffff6f8942c in trace_init_addr (rsp=<optimized out>, rbp=<optimized out>, rip=<optimized out>, cfa=31901497176, cursor=0x7ffeff65f090, f=0x7fff4bfdc940) at x86_64/Gtrace.c:248
#13 trace_lookup (rsp=<optimized out>, rbp=<optimized out>, rip=<optimized out>, cfa=31901497176, cache=<optimized out>, cursor=0x7ffeff65f090) at x86_64/Gtrace.c:330
#14 _ULx86_64_tdep_trace (cursor=cursor@entry=0x7ffeff65f090, buffer=buffer@entry=0x7ffeff65f928, size=size@entry=0x7ffeff65ecd4) at x86_64/Gtrace.c:447
#15 0x00007ffff6f85db2 in unw_backtrace (buffer=0x7ffeff65f928, size=64) at mi/backtrace.c:69
#16 0x00007ffff7bc19e0 in Trace::fill (this=0x7ffeff65f920, skip=3) at /home/jcoulon/git/heaptrack/src/track/trace.h:61
#17 0x00007ffff7bbfbd0 in heaptrack_malloc (ptr=0x7ffec406b160, size=8) at /home/jcoulon/git/heaptrack/src/track/libheaptrack.cpp:638
#18 0x00007ffff7bbd9dd in malloc (size=8) at /home/jcoulon/git/heaptrack/src/track/heaptrack_preload.cpp:176
#19 0x00007ffff6c8ee78 in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#20 0x00007fff3a40f08e in __gnu_cxx::new_allocator<unsigned long>::allocate (this=<optimized out>, __n=<optimized out>) at /usr/include/c++/4.9/ext/new_allocator.h:104
#21 0x0000xxxxxxxxxxxx in ?? ()

libunwind probably tries to read invalid memory in access_mem (frame #9), which raises signal 11 (see code=11 in frames #6 and #7); the JVM's signal handler then aborts the process. To fix this, you'll have to fix libunwind, which is going to be tough. First, you'll need to figure out what is to blame - is it a bug in libunwind? Or is the DWARF data corrupt and misleading it? Can the latter be handled somehow?

One way or another, this isn't a bug within heaptrack itself. It definitely makes the tool unusable for you, but it's nothing that can be worked around from within heaptrack - it has to be fixed upstream (either in libunwind or in the DWARF emitter).
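
To illustrate the "can the latter be handled somehow?" question: one generic way for an unwinder to survive a bogus address is to probe it before dereferencing it. The sketch below is not libunwind or heaptrack code. The helper names `is_readable` and `safe_read_word` are hypothetical, and it assumes a POSIX system where write() on a pipe fails with EFAULT for an unreadable buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of libunwind or heaptrack.
 * Returns 1 if the `len` bytes at `addr` can be read, 0 otherwise.
 * The kernel is asked to copy the bytes into a pipe: for an unreadable
 * address, write() fails (EFAULT) instead of raising SIGSEGV here.
 * `len` should stay small (well below the pipe capacity). */
static int is_readable(const void *addr, size_t len)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 0; /* cannot probe, so treat the address as unreadable */

    ssize_t n = write(fds[1], addr, len);

    close(fds[0]);
    close(fds[1]);
    return n == (ssize_t)len;
}

/* Hypothetical guarded read of one 64-bit word, similar in spirit to what
 * a memory accessor in an unwinder does. Returns 0 on success and -1 if
 * the address is not readable. Note the probe and the copy are two
 * separate steps, so another thread unmapping the page in between could
 * still cause a fault. */
static int safe_read_word(uintptr_t addr, uint64_t *val)
{
    assert(val != NULL);
    if (!is_readable((const void *)addr, sizeof *val))
        return -1;
    memcpy(val, (const void *)addr, sizeof *val);
    return 0;
}
```

With such a guard, a stack walk could stop at the first frame whose address is unreadable and return a truncated backtrace instead of crashing. Creating a pipe per read is far too slow for a malloc hook, so a real implementation would need to keep the pipe open or cache the results per page.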
Comment 6 jeremy.coulon.jrmc 2018-02-23 08:50:45 UTC
Well, I think using heaptrack on a pure Java application wouldn't be helpful. However, my application is a mix of Java and C++ with JNI. I have already successfully profiled some smaller scenarios with heaptrack and found some memory misuse in the C++ part of my application.

I will try to contact the libunwind team about this crash.

Thanks!

NB: I read somewhere that it is possible to print call stacks that mix Java and native code. That could be a good enhancement for heaptrack, even if it is not critical for me at the moment.