Bug 393387 - support AppImage profiling
Summary: support AppImage profiling
Status: RESOLVED FIXED
Alias: None
Product: Heaptrack
Classification: Applications
Component: general (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Milian Wolff
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-22 08:45 UTC by Milian Wolff
Modified: 2018-04-27 11:19 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Milian Wolff 2018-04-22 08:45:09 UTC
See also: https://bugs.kde.org/show_bug.cgi?id=379551

Download the KDevelop AppImage, extract it, edit AppRun and prepend "heaptrack" to the last line that launches KDevelop, then try to run it:

```
/tmp/KDevelop-5.2.1-x86_64$ ./AppRun
heaptrack output will be written to "/home/milian/heaptrack.kdevelop.16224.zst"
...
failed to parse line: - 18t 7fd97fe78381 25ff1
failed to parse line: t 7fd97f8fa088 27b2t 7fd97fd970c2 27cdf
...
Heaptrack finished! Now run the following to investigate the data:

  heaptrack --analyze "/home/milian/heaptrack.kdevelop.16224.zst"
```

And then:

```
$ heaptrack /home/milian/heaptrack.kdevelop.16224.zst
     0.000 warning: unknown[unknown:0]: Trace recursion detected - corrupt data file?
```
Comment 1 Milian Wolff 2018-04-23 08:21:26 UTC
it's still unclear to me how we can mess up the output of the data file. Adding a flush after writing an index helps, but I don't understand why that would be required - there seems to be something wrong with our locking...

Paired with that, we can easily run into situations where we stop heaptrack prematurely when we encounter a write error that should be handled, such as:

heaptrack debug(2) [8859:8859]@5830 dlopen_notify_callback: /tmp/KDevelop-5.2.1-x86_64/usr/lib/libxcb-dri2.so.0 7fd27a93d000
heaptrack debug(1) [8885:8885]@5830 child_fork()
heaptrack debug(1) [8859:8859]@5831 write error 4/Interrupted system call
#1  0x00007fd298297f1e sp=0x00007ffe33957bd0 _ZN12_GLOBAL__N_19HeapTrack10writeErrorEv + 0x4c
#2  0x00007fd298297cff sp=0x00007ffe33957c00 _ZN12_GLOBAL__N_19HeapTrack24dl_iterate_phdr_callbackEP12dl_phdr_infomPv + 0x13c
#3  0x00007fd29123fc81 sp=0x00007ffe33957c50 dl_iterate_phdr + 0x171
#4  0x00007fd298297ec1 sp=0x00007ffe33957d00 _ZN12_GLOBAL__N_19HeapTrack17updateModuleCacheEv + 0xa5
#5  0x00007fd298297ae0 sp=0x00007ffe33957d20 _ZN12_GLOBAL__N_19HeapTrack12handleMallocEPvmRK5Trace + 0x54
#6  0x00007fd298298805 sp=0x00007ffe33957d60 heaptrack_malloc + 0xef
#7  0x00007fd298295f9b sp=0x00007ffe33957fb0 malloc + 0x5f
#8  0x00007fd291ab6089 sp=0x00007ffe33957fe0 _Znwm + 0x19
#9  0x00007fd29205cd42 sp=0x00007ffe33957ff0 _ZN7QObjectC1EPS_ + 0x22
#10 0x00007fd292023374 sp=0x00007ffe33958010 _ZN8QLibraryC2ERK7QStringP7QObject + 0x14
#11 0x00007fd292ef53d0 sp=0x00007ffe33958030 _ZN13KPluginLoader4loadEv + 0x60
#12 0x00007fd292ef54aa sp=0x00007ffe33958090 _ZN13KPluginLoader8instanceEv + 0xa
#13 0x00007fd292ef54ec sp=0x00007ffe339580a0 _ZN13KPluginLoader7factoryEv + 0x1c
#14 0x00007fd297d4eece sp=0x00007ffe33958140 _ZN8KDevelop16PluginController18loadPluginInternalERK7QString + 0x58e
#15 0x00007fd297d50396 sp=0x00007ffe339582a0 _ZN8KDevelop16PluginController10initializeEv + 0x8d6
#16 0x00007fd297d5ab85 sp=0x00007ffe339583f0 _ZN8KDevelop11CorePrivate10initializeENS_4Core5SetupERK7QString + 0xb95
#17 0x00007fd297d5c03d sp=0x00007ffe33958490 _ZN8KDevelop4Core10initializeENS0_5SetupERK7QString + 0x5d
#18 0x000000000040b472 sp=0x00007ffe339584c0 main + 0x5182
#19 0x00007fd29112ff4a sp=0x00007ffe339587d0 __libc_start_main + 0xea
#20 0x000000000040caed sp=0x00007ffe33958890 _start + 0x29
heaptrack debug(1) [8859:8859]@5858 shutdown()
heaptrack debug(1) [8859:8859]@5858 destroying LockedData
heaptrack debug(1) [8859:8859]@5858 done destroying LockedData
heaptrack debug(1) [8859:8859]@5885 shutdown() done
Comment 2 Milian Wolff 2018-04-23 10:30:09 UTC
Git commit 77f1e068b292dedc3668df06f2eea705e1a28527 by Milian Wolff.
Committed on 23/04/2018 at 10:27.
Pushed by mwolff into branch 'master'.

Try to handle EINTR when writing data

Hopefully fixes issues such as this one:

heaptrack debug(2) [8859:8859]@5830 dlopen_notify_callback: /tmp/KDevelop-5.2.1-x86_64/usr/lib/libxcb-dri2.so.0 7fd27a93d000
heaptrack debug(1) [8885:8885]@5830 child_fork()
heaptrack debug(1) [8859:8859]@5831 write error 4/Interrupted system call
#1  0x00007fd298297f1e sp=0x00007ffe33957bd0 _ZN12_GLOBAL__N_19HeapTrack10writeErrorEv + 0x4c
#2  0x00007fd298297cff sp=0x00007ffe33957c00 _ZN12_GLOBAL__N_19HeapTrack24dl_iterate_phdr_callbackEP12dl_phdr_infomPv + 0x13c
#3  0x00007fd29123fc81 sp=0x00007ffe33957c50 dl_iterate_phdr + 0x171
#4  0x00007fd298297ec1 sp=0x00007ffe33957d00 _ZN12_GLOBAL__N_19HeapTrack17updateModuleCacheEv + 0xa5
#5  0x00007fd298297ae0 sp=0x00007ffe33957d20 _ZN12_GLOBAL__N_19HeapTrack12handleMallocEPvmRK5Trace + 0x54
#6  0x00007fd298298805 sp=0x00007ffe33957d60 heaptrack_malloc + 0xef
#7  0x00007fd298295f9b sp=0x00007ffe33957fb0 malloc + 0x5f
#8  0x00007fd291ab6089 sp=0x00007ffe33957fe0 _Znwm + 0x19
#9  0x00007fd29205cd42 sp=0x00007ffe33957ff0 _ZN7QObjectC1EPS_ + 0x22
#10 0x00007fd292023374 sp=0x00007ffe33958010 _ZN8QLibraryC2ERK7QStringP7QObject + 0x14
#11 0x00007fd292ef53d0 sp=0x00007ffe33958030 _ZN13KPluginLoader4loadEv + 0x60
#12 0x00007fd292ef54aa sp=0x00007ffe33958090 _ZN13KPluginLoader8instanceEv + 0xa
#13 0x00007fd292ef54ec sp=0x00007ffe339580a0 _ZN13KPluginLoader7factoryEv + 0x1c
#14 0x00007fd297d4eece sp=0x00007ffe33958140 _ZN8KDevelop16PluginController18loadPluginInternalERK7QString + 0x58e
#15 0x00007fd297d50396 sp=0x00007ffe339582a0 _ZN8KDevelop16PluginController10initializeEv + 0x8d6
#16 0x00007fd297d5ab85 sp=0x00007ffe339583f0 _ZN8KDevelop11CorePrivate10initializeENS_4Core5SetupERK7QString + 0xb95
#17 0x00007fd297d5c03d sp=0x00007ffe33958490 _ZN8KDevelop4Core10initializeENS0_5SetupERK7QString + 0x5d
#18 0x000000000040b472 sp=0x00007ffe339584c0 main + 0x5182
#19 0x00007fd29112ff4a sp=0x00007ffe339587d0 __libc_start_main + 0xea
#20 0x000000000040caed sp=0x00007ffe33958890 _start + 0x29
heaptrack debug(1) [8859:8859]@5858 shutdown()
heaptrack debug(1) [8859:8859]@5858 destroying LockedData
heaptrack debug(1) [8859:8859]@5858 done destroying LockedData
heaptrack debug(1) [8859:8859]@5885 shutdown() done

M  +39   -31   src/track/libheaptrack.cpp

https://commits.kde.org/heaptrack/77f1e068b292dedc3668df06f2eea705e1a28527
Comment 3 Milian Wolff 2018-04-24 18:39:20 UTC
The new test I wrote is ASAN, UBSAN and TSAN clean... I'm still completely clueless as to what is going on here!
Comment 4 Milian Wolff 2018-04-24 18:56:10 UTC
Neither strace, GDB or rr work. But `perf trace` works! And that's actually quite interesting:

perf trace --no-syscalls record -m 16M  -e signal:signal_deliver -e signal:signal_generate --call-graph dwarf ./AppRun -s test

a SIGCHLD is received while we are writing:

kdevelop  8054 [000] 244815.920388:  signal:signal_deliver: sig=17 errno=0 code=1 sa_handler=7fdf2fd03570 sa_flags=4000005
        ffffffff8508c817 get_signal ([kernel.kallsyms])
                   e747b __GI___libc_write (/usr/lib/libc-2.26.so)
                   791ec _IO_new_file_write (inlined)
                   784ce new_do_write (/usr/lib/libc-2.26.so)
                   7a3e8 _IO_new_do_write (inlined)
                   798e6 _IO_new_file_xsputn (inlined)
                   4cf7b _IO_vfprintf_internal (inlined)
                  104d15 ___fprintf_chk (inlined)
                    7fd2 fprintf (inlined)
                    7fd2 writeLine<long unsigned int, unsigned int, long unsigned int> (inlined)
                    7fd2 handleMalloc (inlined)
                    7fd2 heaptrack_malloc (/home/milian/projects/compiled/other/lib/heaptrack/libheaptrack_preload.so)
                    31d6 malloc (/home/milian/projects/compiled/other/lib/heaptrack/libheaptrack_preload.so)
                   8d088 operator new (/usr/lib/libstdc++.so.6.0.24)
                  220553 QAbstractItemModel::QAbstractItemModel (/tmp/KDevelop-5.2.1-x86_64/usr/lib/libQt5Core.so.5)
...

then shortly after we get the SIGPIPE when heaptrack_interpret quits after receiving the bogus data:

kdevelop  8054 [000] 244815.967719: signal:signal_generate: sig=13 errno=0 code=0 comm=kdevelop pid=8054 grp=0 res=1
        ffffffff8508ab72 __send_signal ([kernel.kallsyms])
        ffffffff8508ab72 __send_signal ([kernel.kallsyms])
                   e747b __GI___libc_write (/usr/lib/libc-2.26.so)
                   791ec _IO_new_file_write (inlined)
                   784ce new_do_write (/usr/lib/libc-2.26.so)
                   7a3e8 _IO_new_do_write (inlined)
                   798e6 _IO_new_file_xsputn (inlined)
                   4e57d _IO_vfprintf_internal (inlined)
                  104d15 ___fprintf_chk (inlined)
                    7f5a fprintf (inlined)
                    7f5a writeLine<long unsigned int, unsigned int> (inlined)
                    7f5a operator() (inlined)
                    7f5a index<(anonymous namespace)::HeapTrack::handleMalloc(void*, size_t, const Trace&)::<lambda(uintptr_t, uint32_t)> > (inlined)
                    7f5a handleMalloc (inlined)
                    7f5a heaptrack_malloc (/home/milian/projects/compiled/other/lib/heaptrack/libheaptrack_preload.so)
                    31d6 malloc (/home/milian/projects/compiled/other/lib/heaptrack/libheaptrack_preload.so)
                   8d088 operator new (/usr/lib/libstdc++.so.6.0.24)
                   f8a83 QHashData::rehash (/tmp/KDevelop-5.2.1-x86_64/usr/lib/libQt5Core.so.5)
...

could that be the reason here?
Comment 5 Milian Wolff 2018-04-27 11:19:04 UTC
Git commit 95a17c6a407a50b4da66d97521adf94c2a1ae219 by Milian Wolff.
Committed on 27/04/2018 at 11:18.
Pushed by mwolff into branch 'master'.

Do not use stdio for buffered I/O

Instead, implement a custom buffer scheme that uses snprintf
and write directly. This way, we have the write syscall under
total control and can properly handle EINTR. See also:

http://unix.derkeiler.com/Newsgroups/comp.unix.programmer/2006-05/msg00314.html

This seems to fix the issue where the data file is corrupted. It
seems to be due to a SIGCHLD signal being received while we were
inside an fprintf syscall:

perf trace --no-syscalls record -m 16M  \
	-e signal:signal_deliver \
	-e signal:signal_generate \
        -e sched:sched_process_exit \
	--call-graph dwarf \
	./AppRun -s test
...
  # child process of kdevelop exits:
  2443.428 sched:sched_process_exit:comm=docker pid=15924 prio=120
  # SIGCHILD is handled by a thread that's inside fprintf:
  2444.617 signal:signal_deliver:sig=17 errno=0 code=1 sa_handler=7fa121163570 sa_flags=4000005
                                       get_signal ([kernel.kallsyms])
                                       __GI___libc_write (/usr/lib/libc-2.26.so)
                                       _IO_new_file_write (inlined)
                                       new_do_write (/usr/lib/libc-2.26.so)
                                       _IO_new_do_write (inlined)
                                       _IO_new_file_xsputn (inlined)
                                       _IO_vfprintf_internal (inlined)
                                       ___fprintf_chk (inlined)
                                       fprintf (inlined)
                                       writeLine<long unsigned int, unsigned int, long unsigned int> (inlined)
                                       handleMalloc (inlined)
                                       ...

  # eventually heaptrack_interpret receives partially written data:

  failed to parse line: t 7fa11060b9- 30a09e0
  heaptrack stats:
        allocations:            534879
        leaked allocations:     150033
        temporary allocations:  78765

  # and then exists:
  3580.135 sched:sched_process_exit:comm=heaptrack_inter pid=15913 prio=120

  # leading to a SIGPIPE:
  3580.628 signal:signal_generate:sig=13 errno=0 code=0 comm=kdevelop pid=15915 grp=0 res=1
                                       __send_signal ([kernel.kallsyms])
                                       __send_signal ([kernel.kallsyms])
                                       __GI___libc_write (/usr/lib/libc-2.26.so)
                                       _IO_new_file_write (inlined)
                                       new_do_write (/usr/lib/libc-2.26.so)
                                       _IO_new_do_write (inlined)
                                       _IO_new_file_xsputn (inlined)
                                       _IO_vfprintf_internal (inlined)
                                       ___fprintf_chk (inlined)
                                       fprintf (inlined)
                                       writeLine<char const*, long unsigned int> (inlined)

It seems to me as if this is the explanation for this issue.
Ultimately it boils down to fprintf not being save for use when
you may receive sigchild leading to EINTR.

M  +9    -1    src/track/heaptrack_inject.cpp
M  +113  -86   src/track/libheaptrack.cpp
M  +1    -1    src/track/libheaptrack.h
M  +23   -20   tests/auto/tst_libheaptrack.cpp

https://commits.kde.org/heaptrack/95a17c6a407a50b4da66d97521adf94c2a1ae219