If a program has PID=1 and is using --log-file=foo.%p, its logging output gets written to "foo.1". If it forks, then the child's logging output should start going into a different file (eg. "foo.2") to the parent's logging output. Currently it doesn't. Furthermore, if we implemented this, if the child then does an 'exec', the child log file ("foo.2") will be overwritten by the newly exec'd process. This probably isn't a big deal, since the gap between fork and exec in a child is typically very small, so there's not much potential for lost information.
*** Bug 110616 has been marked as a duplicate of this bug. ***
This is very similar to bug 134316.
Chrome is now running into this.
Dan, can you explain the use case that causes Chrome to hit this? Thanks.
Sure. You know how Firefox for Linux explodes when the linux distro updates the files out from under a running copy of Firefox? Chrome avoids that by never opening or exec'ing a file after initial startup. Instead, chrome opens files once and keeps them open. For executables, that means it uses fork() and not fork()/exec(). We can work around this by going back to fork()/exec() mode, and that'll get us 99% of what we want, but it'd be just be nice if valgrind handled this situation better.
See also http://code.google.com/p/chromium/wiki/LinuxZygote
I wasn't able to work around all the fork() calls, so now I'm looking for another solution. I could use --log-fd=7 and then do a dup2 onto 7 after each fork. That's a lot of work, though. Is there a way to retrieve the current log file descriptor?
(In reply to comment #7) > > Is there a way to retrieve the current log file descriptor? From within the client program? No.
Created attachment 35027 [details] Demo of how to find the log file's fd I figured out how to find the log file's fd, but it doesn't help, as valgrind prevents modifying it with dup2(). Next idea, anyone?
Created attachment 35047 [details] Prototype patch Looks like just fixing the bug, at least for my case, is easier than working around it. Here's a trivial implementation of the fix. It needs lots of love still, but shows that it ought to be very doable.
Created attachment 35123 [details] A little bit improved version of Dan's patch.
Timur, there seem to be unrelated hunks in that patch?
Created attachment 35150 [details] Prototype patch 3 The previous one needed the "the_iicii" global variable to be declared earlier, which resulted in 2 big hunks. This was an overkill, I guess.
See also bug 200544 for the next problem with supporting fork() properly.
Created attachment 35510 [details] Prototype patch 4 This patch is an extension of the previous one - it closes the log gracefully if the client process does exec. However, this patch doesn't apply since r10465...
Created attachment 36126 [details] Prototype patch 5 Re-written the patch for r10771
Created attachment 36127 [details] Prototype patch 6 Fix for a small mistake in patch #5
We've been successfully using a similar patch for Chromium for more than a year. You can see the diff with the instructions given here: http://code.google.com/p/valgrind-variant/wiki/HowTo
I've encountered this problem (%p format specifier basically not working for forked children without exec) as well. Current Valgrind behaviour contradicts the manual: "%p is replaced with the current process ID. This is very useful for program that invoke multiple processes. WARNING: If you use --trace-children=yes and your program invokes multiple processes OR your program forks without calling exec afterwards, and you don't use this specifier (or the %q specifier below), the Valgrind output from all those processes will go into one file, possibly jumbled up, and possibly incomplete." So basically we need to re-evaluate --[log|xml]-file in the after-fork handler. I'll have a look at the patch provided and see if it fits in the SVN trunk.
Created attachment 103030 [details] proposed patch (v7) Proposed patch, loosely based on previous prototype v6 by Timur. Output sink initialization is moved to m_libcproc.c, along with preamble printing. This resulted in some code moving and refactoring. Tested on Linux and Solaris, with various --log-file, --log-fd, --xml-file, and --xml-fd options. Log/XML file is properly split for the forked child. Regression testing ok.
Created attachment 103060 [details] proposed patch (v8) fixed two minor problems
Fixed in SVN r16200.
Follow up commit in SVN r16210. Somehow I forgot to include the documentation changes previously...
*** Bug 253308 has been marked as a duplicate of this bug. ***