Bug 162848 - --log-file output isn't split when a program forks
Summary: --log-file output isn't split when a program forks
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.4 SVN
Platform: Unlisted Binaries Linux
: NOR normal
Target Milestone: ---
Assignee: Ivo Raisr
URL:
Keywords:
: 110616 253308 (view as bug list)
Depends on:
Blocks:
 
Reported: 2008-05-30 01:26 UTC by Nicholas Nethercote
Modified: 2024-01-12 21:00 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Demo of how to find the log file's fd (2.07 KB, text/x-csrc)
2009-07-04 04:26 UTC, Dan Kegel
Details
Prototype patch (2.38 KB, patch)
2009-07-04 20:49 UTC, Dan Kegel
Details
A little bit improved version of Dan's patch. (5.85 KB, text/plain)
2009-07-07 16:38 UTC, Timur Iskhodzhanov
Details
Prototype patch 3 (3.89 KB, patch)
2009-07-08 13:19 UTC, Timur Iskhodzhanov
Details
Prototype patch 4 (4.80 KB, patch)
2009-07-21 12:35 UTC, Timur Iskhodzhanov
Details
Prototype patch 5 (14.98 KB, text/plain)
2009-08-13 11:27 UTC, Timur Iskhodzhanov
Details
Prototype patch 6 (15.13 KB, patch)
2009-08-13 11:46 UTC, Timur Iskhodzhanov
Details
proposed patch (v7) (59.08 KB, patch)
2016-12-28 06:30 UTC, Ivo Raisr
Details
proposed patch (v8) (59.20 KB, patch)
2016-12-29 13:50 UTC, Ivo Raisr
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nicholas Nethercote 2008-05-30 01:26:39 UTC
If a program has PID=1 and is using --log-file=foo.%p, its logging output gets
written to "foo.1".  If it forks, then the child's logging output should start
going into a different file (eg. "foo.2") to the parent's logging output. 
Currently it doesn't.

Furthermore, if we implemented this, if the child then does an 'exec', the child
log file ("foo.2") will be overwritten by the newly exec'd process.  This
probably isn't a big deal, since the gap between fork and exec in a child is
typically very small, so there's not much potential for lost information.
Comment 1 Nicholas Nethercote 2009-07-01 09:31:06 UTC
*** Bug 110616 has been marked as a duplicate of this bug. ***
Comment 2 Nicholas Nethercote 2009-07-01 09:34:42 UTC
This is very similar to bug 134316.
Comment 3 Dan Kegel 2009-07-02 20:24:22 UTC
Chrome is now running into this.
Comment 4 Nicholas Nethercote 2009-07-02 22:10:58 UTC
Dan, can you explain the use case that causes Chrome to hit this?  Thanks.
Comment 5 Dan Kegel 2009-07-02 22:44:26 UTC
Sure.  You know how Firefox for Linux explodes when the linux distro
updates the files out from under a running copy of Firefox?
Chrome avoids that by never opening or exec'ing a file after initial
startup.  Instead, chrome opens files once and keeps them open.
For executables, that means it uses fork() and not fork()/exec().

We can work around this by going back to fork()/exec() mode,
and that'll get us 99% of what we want,
but it'd be just be nice if valgrind handled this situation better.
Comment 6 Dan Kegel 2009-07-02 22:45:32 UTC
See also http://code.google.com/p/chromium/wiki/LinuxZygote
Comment 7 Dan Kegel 2009-07-03 01:12:42 UTC
I wasn't able to work around all the fork() calls,
so now I'm looking for another solution.

I could use --log-fd=7 and then do a dup2 onto 7
after each fork.  That's a lot of work, though.
Is there a way to retrieve the current log file descriptor?
Comment 8 Nicholas Nethercote 2009-07-03 01:38:14 UTC
(In reply to comment #7)
>
> Is there a way to retrieve the current log file descriptor?

From within the client program?  No.
Comment 9 Dan Kegel 2009-07-04 04:26:46 UTC
Created attachment 35027 [details]
Demo of how to find the log file's fd

I figured out how to find the log file's fd, but it
doesn't help, as valgrind prevents modifying it
with dup2().

Next idea, anyone?
Comment 10 Dan Kegel 2009-07-04 20:49:50 UTC
Created attachment 35047 [details]
Prototype patch

Looks like just fixing the bug, at least for my case,
is easier than working around it.  Here's a trivial
implementation of the fix.  It needs lots of love
still, but shows that it ought to be very doable.
Comment 11 Timur Iskhodzhanov 2009-07-07 16:38:04 UTC
Created attachment 35123 [details]
A little bit improved version of Dan's patch.
Comment 12 Dan Kegel 2009-07-07 17:34:36 UTC
Timur, there seem to be unrelated hunks in that patch?
Comment 13 Timur Iskhodzhanov 2009-07-08 13:19:02 UTC
Created attachment 35150 [details]
Prototype patch 3

The previous one needed the "the_iicii" global variable to be declared earlier, which resulted in 2 big hunks. This was an overkill, I guess.
Comment 14 Dan Kegel 2009-07-17 21:00:10 UTC
See also bug 200544 for the next problem with supporting fork() properly.
Comment 15 Timur Iskhodzhanov 2009-07-21 12:35:27 UTC
Created attachment 35510 [details]
Prototype patch 4

This patch is an extension of the previous one - it closes the log gracefully if the client process does exec.

However, this patch doesn't apply since r10465...
Comment 16 Timur Iskhodzhanov 2009-08-13 11:27:47 UTC
Created attachment 36126 [details]
Prototype patch 5

Re-written the patch for r10771
Comment 17 Timur Iskhodzhanov 2009-08-13 11:46:40 UTC
Created attachment 36127 [details]
Prototype patch 6

Fix for a small mistake in patch #5
Comment 18 Timur Iskhodzhanov 2011-02-15 18:05:27 UTC
We've been successfully using a similar patch for Chromium for more than a year.

You can see the diff with the instructions given here:
http://code.google.com/p/valgrind-variant/wiki/HowTo
Comment 19 Ivo Raisr 2016-12-27 10:50:10 UTC
I've encountered this problem (%p format specifier basically not working for forked children without exec) as well. Current Valgrind behaviour contradicts the manual:

"%p is replaced with the current process ID. This is very useful for program that invoke multiple processes. WARNING: If you use --trace-children=yes and your program invokes multiple processes OR your program forks without calling exec afterwards, and you don't use this specifier (or the %q specifier below), the Valgrind output from all those processes will go into one file, possibly jumbled up, and possibly incomplete."

So basically we need to re-evaluate --[log|xml]-file in the after-fork handler.
I'll have a look at the patch provided and see if it fits in the SVN trunk.
Comment 20 Ivo Raisr 2016-12-28 06:30:42 UTC
Created attachment 103030 [details]
proposed patch (v7)

Proposed patch, loosely based on previous prototype v6 by Timur.

Output sink initialization is moved to m_libcproc.c, along with preamble printing. This resulted in some code moving and refactoring.

Tested on Linux and Solaris, with various --log-file, --log-fd, --xml-file,
and --xml-fd options. Log/XML file is properly split for the forked child.
Regression testing ok.
Comment 21 Ivo Raisr 2016-12-29 13:50:07 UTC
Created attachment 103060 [details]
proposed patch (v8)

fixed two minor problems
Comment 22 Ivo Raisr 2017-01-12 11:28:40 UTC
Fixed in SVN r16200.
Comment 23 Ivo Raisr 2017-01-25 07:23:10 UTC
Follow up commit in SVN r16210.
Somehow I forgot to include the documentation changes previously...
Comment 24 Mark Wielaard 2024-01-12 13:59:22 UTC
*** Bug 253308 has been marked as a duplicate of this bug. ***