Bug 373574 - valgrind: m_syswrap/syswrap-main.c:1938 (vgPlain_client_syscall): Assertion '0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))' failed.
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-12-12 13:31 UTC by kujawa
Modified: 2018-02-24 17:28 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
code that reproduces described behaviour (198 bytes, text/x-c++src)
2016-12-12 13:31 UTC, kujawa

Description kujawa 2016-12-12 13:31:33 UTC
Created attachment 102745
code that reproduces described behaviour

When running valgrind with --sim-hints=fuse-compatible, we hit the assertion quoted in the bug title.

We checked that in our case sci->flags & SfMayBlock is true.

Trace from our original execution (valgrind run with --trace-syscalls=yes):

SYSCALL[28094,1](12) sys_brk ( 0x0 ) --> [pre-success] Success(0x4000000)
--28094-- REDIR: 0x37458176d0 (ld-linux-x86-64.so.2:strlen) redirected to 0x380550c1 (vgPlain_amd64_linux_REDIR_FOR_strlen)
SYSCALL[28094,1](63) sys_newuname ( 0xffefffb10 )[sync] --> Success(0x0)
--28094-- REDIR: 0x37458174e0 (ld-linux-x86-64.so.2:index) redirected to 0x380550db (vgPlain_amd64_linux_REDIR_FOR_index)
SYSCALL[28094,1](89) sys_readlink ( 0x374581b667(/proc/self/exe), 0xffeffec10, 4096 ) --> [pre-success] Success(0x52)
valgrind: m_syswrap/syswrap-main.c:1938 (vgPlain_client_syscall): Assertion '0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))' failed.

The trace suggests that "sys_readlink" is the suspected blocking call. I attach simple code that reproduces the problem when run with this command:
valgrind --tool=memcheck -v --trace-syscalls=yes --sim-hints=fuse-compatible ./sfMayBlockExample.exe
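
The attached file is not reproduced here. As a rough sketch only, a program that issues the same syscall as the last line of the trace (readlink on /proc/self/exe) could look like the following. The function name read_self_exe is made up for illustration, and this is an assumption about what is sufficient to reach the PRE(sys_readlink) path, not the content of attachment 102745.

```c
#include <unistd.h>

/* Hypothetical helper: resolve /proc/self/exe into buf.
   Returns the link length, or -1 on error. readlink() does not
   NUL-terminate, so one byte is reserved for the terminator. */
ssize_t read_self_exe(char *buf, size_t size)
{
    if (size == 0)
        return -1;
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

Calling read_self_exe from main and running the resulting binary under the command above should exercise sys_readlink with the same path that appears in the trace.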
Comment 1 kujawa 2017-02-24 11:34:26 UTC
Hello valgrind people,

what is the status of this bug? Can we expect that someone will look at this (and perhaps fix it)?

Radek
Comment 2 kujawa 2018-02-19 13:29:30 UTC
Hi guys,
is there a chance that someone will look at this?

Radek
Comment 3 Philippe Waroquiers 2018-02-24 17:28:45 UTC
I took a quick look at this problem.
I have to admit that I do not really understand the logic used
to mark some syscalls as SfMayBlock,
to mark others as SfMayBlock only if --sim-hints=fuse-compatible is given,
or to leave them unmarked.
For example, sys_rename is FUSE_COMPATIBLE_MAY_BLOCK();
while sys_renameat is not marked.
I am guessing that, depending on the file system (in particular, network file systems),
all/most file system operations might potentially block (see e.g. bug 278057,
comment 2).

That being said, I think the problem you have is because PRE(sys_readlink)
does the complete work of the syscall itself, so that it can apply the
special logic needed to handle some /proc filenames.

When using fuse-compatible, the syscall is marked as SfMayBlock, but that
is not expected when the PRE(sys_*) is doing all the work: effectively,
if the PRE(sys_*) is doing all the work, how can the syscall block later on?
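
To make the failing condition concrete, here is a small standalone model of the check from the assertion text. The flag names are taken from the assertion and from this report; the bit values and the function name are assumptions chosen for illustration, not copied from the Valgrind headers.

```c
#include <stdbool.h>

/* Flag names as they appear in the assertion; the numeric values
   are illustrative assumptions only. */
enum {
    SfMayBlock      = 1 << 1,
    SfPollAfter     = 1 << 3,
    SfYieldAfter    = 1 << 4,
    SfNoWriteResult = 1 << 5
};

/* Hypothetical helper mirroring the asserted expression: when the PRE
   handler has already produced the result, no flag other than these
   three may be set. */
bool flags_ok_after_pre_success(unsigned flags)
{
    unsigned allowed = SfPollAfter | SfYieldAfter | SfNoWriteResult;
    return 0 == (flags & ~allowed);
}
```

In this model, a flags value that contains SfMayBlock makes the check false, which matches the observation in the description that sci->flags & SfMayBlock was true when the assertion fired.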

So, I think there are two ways to fix this:
  * if this syscall cannot block, then we can just remove the
    FUSE_COMPATIBLE_MAY_BLOCK.
  * if the readlink syscall can really block, then it is not ok to
    have the PRE(sys_readlink) doing the work. The PRE(sys_readlink)
    should change the first argument when needed, let the async syscall
    handling do the syscall, and then the POST(sys_readlink) should clean up
    the changed argument (assuming we need some dynamic memory for the
    changed argument).
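
The second option could be sketched roughly as below. This is a standalone model, not Valgrind code: the readlink_call struct, the function names, and the substitute parameter are all hypothetical, and the real PRE/POST wrappers would use Valgrind's own argument access and allocation routines.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-call bookkeeping; not a Valgrind data structure. */
typedef struct {
    const char *path;      /* path the real syscall should receive */
    char       *rewritten; /* heap copy owned by the wrapper, or NULL */
} readlink_call;

/* PRE step: only substitute the first argument when needed.
   The syscall itself is NOT performed here. */
void pre_readlink(readlink_call *c, const char *client_path,
                  const char *substitute)
{
    c->path = client_path;
    c->rewritten = NULL;
    if (strcmp(client_path, "/proc/self/exe") == 0) {
        size_t n = strlen(substitute) + 1;
        c->rewritten = malloc(n);
        if (c->rewritten != NULL) {
            memcpy(c->rewritten, substitute, n);
            c->path = c->rewritten;
        }
    }
}

/* POST step: release whatever the PRE step allocated. */
void post_readlink(readlink_call *c)
{
    free(c->rewritten);
    c->rewritten = NULL;
    c->path = NULL;
}
```

Between the two steps, the (possibly blocking) syscall would be issued with c->path as its first argument. Because PRE no longer sets a result, the syscall can stay marked as SfMayBlock without hitting the pre-success path.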

So, at this point, what is really needed is a better understanding of
the SfMayBlock logic ...