Bug 373574

Summary: valgrind: m_syswrap/syswrap-main.c:1938 (vgPlain_client_syscall): Assertion '0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))' failed.
Product: [Developer tools] valgrind
Reporter: kujawa
Component: general
Assignee: Julian Seward <jseward>
Status: REPORTED ---
Severity: normal
CC: kujawa, philippe.waroquiers
Priority: NOR
Version: unspecified
Target Milestone: ---
Platform: Other
OS: Linux
Latest Commit:
Version Fixed In:
Attachments: code that reproduces described behaviour

Description kujawa 2016-12-12 13:31:33 UTC
Created attachment 102745
code that reproduces described behaviour

When running valgrind with --sim-hints=fuse-compatible, we hit the assertion quoted in the summary.

We checked that in our case sci->flags & SfMayBlock is true.

Trace from our original execution (valgrind run with --trace-syscalls=yes):

SYSCALL[28094,1](12) sys_brk ( 0x0 ) --> [pre-success] Success(0x4000000)
--28094-- REDIR: 0x37458176d0 (ld-linux-x86-64.so.2:strlen) redirected to 0x380550c1 (vgPlain_amd64_linux_REDIR_FOR_strlen)
SYSCALL[28094,1](63) sys_newuname ( 0xffefffb10 )[sync] --> Success(0x0)
--28094-- REDIR: 0x37458174e0 (ld-linux-x86-64.so.2:index) redirected to 0x380550db (vgPlain_amd64_linux_REDIR_FOR_index)
SYSCALL[28094,1](89) sys_readlink ( 0x374581b667(/proc/self/exe), 0xffeffec10, 4096 ) --> [pre-success] Success(0x52)
valgrind: m_syswrap/syswrap-main.c:1938 (vgPlain_client_syscall): Assertion '0 == (sci->flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult))' failed.

The trace shows that sys_readlink is the call suspected of being treated as blocking. I attach simple code that reproduces the problem when run with this command:
valgrind --tool=memcheck -v --trace-syscalls=yes --sim-hints=fuse-compatible ./sfMayBlockExample.exe
Comment 1 kujawa 2017-02-24 11:34:26 UTC
Hello valgrind people,

What is the status of this bug? Can we expect that someone will look at this (or maybe also fix it)?

Radek
Comment 2 kujawa 2018-02-19 13:29:30 UTC
Hi guys,
is there a chance that someone will look at this?

Radek
Comment 3 Philippe Waroquiers 2018-02-24 17:28:45 UTC
I took a quick look at this problem.
I have to admit that I do not really understand the logic used
to decide whether a syscall is marked as SfMayBlock unconditionally,
marked as SfMayBlock only with --sim-hints=fuse-compatible,
or not marked at all.
For example, sys_rename uses FUSE_COMPATIBLE_MAY_BLOCK(),
while sys_renameat is not marked.
My guess is that, depending on the fs (in particular, network file systems),
all or most file system operations might potentially block (see e.g. bug 278057,
comment 2).

That being said, I think the problem you have is because PRE(sys_readlink)
performs the complete syscall itself, so that it can apply the
special logic needed to handle some /proc filenames.

When using fuse-compatible, the syscall is marked as SfMayBlock, but that
is not expected when the PRE(sys_*) does all the work: if the PRE(sys_*)
has already done all the work, how can the syscall block later on?
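
To make the mismatch concrete, here is a minimal sketch of the flag check in plain C. Only the flag names and the mask expression come from the assertion text; the numeric flag values and the helper functions pre_readlink_flags and flags_allowed_after_pre_completed are hypothetical and are not Valgrind's real definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; only the names are taken
   from the assertion message and the discussion in this bug. */
enum {
    SfMayBlock      = 1 << 1,
    SfPollAfter     = 1 << 2,
    SfYieldAfter    = 1 << 3,
    SfNoWriteResult = 1 << 4
};

/* Same expression as the failing assertion: once a PRE handler has
   produced the syscall result itself, only these three flags may be set. */
bool flags_allowed_after_pre_completed(unsigned flags)
{
    return 0 == (flags & ~(SfPollAfter | SfYieldAfter | SfNoWriteResult));
}

/* Hypothetical model of the flags left behind by PRE(sys_readlink):
   with the fuse-compatible hint it sets SfMayBlock, and (not modelled
   here) it then performs the whole syscall itself. */
unsigned pre_readlink_flags(bool fuse_compatible)
{
    unsigned flags = 0;
    if (fuse_compatible)
        flags |= SfMayBlock;
    return flags;
}

/* Usage: the check passes without the hint and fails with it. */
void demo_flag_check(void)
{
    assert(flags_allowed_after_pre_completed(pre_readlink_flags(false)));
    assert(!flags_allowed_after_pre_completed(pre_readlink_flags(true)));
}
```

In this model, SfMayBlock falls outside the allowed mask, so any PRE handler that both sets it and completes the syscall trips the check.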

So, I think there are 2 ways to fix this:
  * if this syscall cannot block, then we can just remove the
    FUSE_COMPATIBLE_MAY_BLOCK.
  * if the readlink syscall can really block, then it is not ok to
    have the PRE(sys_readlink) doing the work. The PRE(sys_readlink)
    should change the first argument when needed, let the async syscall
    handling do the syscall, and then the POST(sys_readlink) should clean up
    the changed argument (assuming we need some dynamic memory for the
    changed argument).
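
The second option could look roughly like the sketch below, written as standalone C rather than as real Valgrind wrapper code. The ReadlinkCall struct, the functions pre_readlink and post_readlink, and the replacement path passed in by the caller are all hypothetical; the sketch only shows the division of labour (PRE rewrites the argument, POST frees it), not how Valgrind actually stores per-syscall state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-syscall state; not Valgrind's real data structures. */
typedef struct {
    const char *original;  /* path argument as supplied by the client */
    const char *path;      /* path argument the real syscall will see */
    char       *rewritten; /* heap copy owned by the wrapper, or NULL */
} ReadlinkCall;

/* PRE step: only adjust the first argument when needed. The syscall
   itself is left to the normal (possibly blocking) syscall machinery. */
void pre_readlink(ReadlinkCall *call, const char *replacement)
{
    call->path = call->original;
    call->rewritten = NULL;
    if (strcmp(call->original, "/proc/self/exe") == 0) {
        size_t n = strlen(replacement) + 1;
        call->rewritten = malloc(n);
        if (call->rewritten != NULL) {
            memcpy(call->rewritten, replacement, n);
            call->path = call->rewritten;
        }
    }
}

/* POST step: restore the client's argument and release the memory
   allocated by the PRE step. */
void post_readlink(ReadlinkCall *call)
{
    call->path = call->original;
    free(call->rewritten);
    call->rewritten = NULL;
}
```

With this split, the PRE step never sets the syscall status itself, so marking the call as SfMayBlock would no longer conflict with the assertion.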

So, at this point, what is really needed is a better understanding of
the SfMayBlock logic ...