Created attachment 152586 [details] backtrace of kio-fuse master @v5.0.1-16-gfbd09a3 While giving https://invent.kde.org/system/kio-fuse/-/merge_requests/36 some test usage I've found that sometimes, video playback stops. mpv can still seek into previously cached positions, but does not seem to receive any more data from kio-fuse. Note that new clients can be started and will happily get content (until they too presumably freeze up). This also happens in master (without the above branch applied). I have saved a coredump of kio-fuse for possible further analysis while it was frozen on playback for mpv and the backtrace is attached to the report.
This looks to me like KIO FUSE hasn't replied to a FUSE request. If you run KIO FUSE in debugging mode can you post the output? We should see the request that KIO FUSE isn't replying to.
Created attachment 152744 [details] output of "kio-fuse -d" (@v5.0.1-16-gfbd09a3)
There's definitely a bug in SFTP here that needs to be fixed: > kf.kio.core: finished() called twice! Please fix the kio_sftp KIO slave We can see there's a stall in the call to read() as there is no reply to request 1902. However, I'm not (yet) convinced that the bug in SFTP noted above is causing this. @Harald, could you take a look at this on the SFTP side? @Jan, could you do the following? 1) Provide the version of kio-extras used for this. 2) Debug output for the same, but with my feature branch you were testing.
Oh and in addition, are you able to replicate this with another protocol, such as file?
This is all on an up-to-date arch: kio: 5.99.0-1 kio-extras: 22.08.2-1 kio-fuse: 5.0.1-1 There is an easy way to reproduce this locally: $ dd if=/dev/null of=/tmp/input bs=1 count=0 seek=9G $ sudo systemctl start sshd $ ./kio-fuse/utils/mount-url.sh sftp://localhost $ pv /run/user/1000/kio-fuse-*/sftp/localhost/tmp/input > /dev/null Your filejob feature branch makes this bug much less likely to happen, so a loop is needed instead: $ while true; do pv /run/user/1000/kio-fuse-*/file/tmp/input > /dev/null; done I use pv for the easy progress meter, but dd and cp works too. On master this usually triggers in about 30s-2m. For your feature branch this can take a *lot* longer (sometimes a couple minutes, sometimes an hour or longer). Debug logs are obviously pretty large because of this, so I think it's better to see if you can reproduce yourself first. I cannot reproduce this with file:// or smb:// (on localhost) so far.
That's actually much easier to reproduce than I expected, I can trivially repeat it here. To debug it, I added some code to log all KIO slave commands. That showed that CMD_CLOSE was actually sent twice to the slave, which turned out to be a bug in KIO: https://invent.kde.org/frameworks/kio/-/merge_requests/1013 That's unrelated though and (in this case at least) harmless. I found the root cause a bit later. The sftp ioslave calls setTimeoutSpecialCommand, which means that SFTPWorker::special is called in regular intervals from the dispatching loop. After the sftp slave was ported from SlaveBase to WorkerBase, it now calls finished(), which resets the slave and returns it to the pool. For FileJob this is fatal, as the state is lost and all the application gets is a successful (!) code from the finished() signal! FWICT the right way here is to change WorkerSlaveBaseBridge::special from using finalize(base->special(data)); to maybeError(base->special(data)); and revert 4747517737dcfbfa1964ef818908a69067b7936b.
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/kio/-/merge_requests/1015
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/kio/-/merge_requests/1026
Git commit 99228414ad1050d22e40c9913e7280287b60f22c by Fabian Vogt. Committed on 01/12/2022 at 12:12. Pushed by fvogt into branch 'master'. Ignore finished() from the timeoutSpecialCommand If a timeoutSpecialCommand is set, SlaveBase triggers a call to the Slave's special() method after the given timeout expires. This call is triggered as if it was coming from the application, so it must be taken care of that it does not actually send any result, which the application might not expect. In particular in the case of FileJob, calling finished() stopped the Job immediately, the result() signal is emitted with no error set. Ignore calls to finished() and error() from special() in that case. M +11 -2 src/core/slavebase.cpp https://invent.kde.org/frameworks/kio/commit/99228414ad1050d22e40c9913e7280287b60f22c