Bug 451059 - Okular freezes for up to 10 seconds when loading a document or switching windows
Summary: Okular freezes for up to 10 seconds when loading a document or switching windows
Status: REPORTED
Alias: None
Product: okular
Classification: Applications
Component: general (show other bugs)
Version: 21.12.2
Platform: Debian unstable Linux
: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-03-02 23:17 UTC by Amaury Pouly
Modified: 2022-06-02 21:16 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
D-BUS log of a such an instance (153.47 KB, application/vnd.tcpdump.pcap)
2022-03-02 23:17 UTC, Amaury Pouly
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Amaury Pouly 2022-03-02 23:17:04 UTC
Created attachment 147249 [details]
D-BUS log of a such an instance

I have a large number of okular instances running (typically 20 or more) and very often either loading a new document or just switching between windows (or switching between virtual desktops) will make okular completely freeze for a long time. Quite regularly, the freeze is short enough to be an annoyance (a couple of seconds), but every now and then, it can freeze for 10 seconds and more.

I tried to debug the code myself but I am out of my depth. I got to suspect, however, that it is related to okular asking other okular instances questions over D-BUS, specifically the canOpenDocs call. I attach a D-BUS trace showing an instance when opening a document will trigger several very slow canOpenDocs calls (5 seconds).

Unfortunately, I haven't found yet a 100% recipe to reproduce it but clearly:
1) it only happens with many okular instances running
2) switching virtual desktops seems to make it worse

Finally, the bug is not new, I has been happening for some time but progressively got worse as my number of open documents has increased recently.

I happy to provide more logs/recompile stuff if needed, I just don't really know where to start at the moment.

STEPS TO REPRODUCE
1. open lots of okular documents (I use PDFs), the size of the PDFs does not seem to matter
2. switch between windows/desktops
3. open a new document, hope to reproduce the bug, otherwise close it
4. return to step 2

OBSERVED RESULT
opening/switching to a windows will take anywhere between 2 to 10 seconds

EXPECTED RESULT
should be almost instantenous


SOFTWARE/OS VERSIONS
Linux:  5.16.0-3-amd64
KDE Plasma Version: 5.24.2
KDE Frameworks Version: 5.90.0
Qt Version: 5.15.2

ADDITIONAL INFORMATION
Comment 1 Albert Astals Cid 2022-03-02 23:27:04 UTC
This is not major by any definition of major.
Comment 2 Albert Astals Cid 2022-03-02 23:27:49 UTC
I understand you don't have tabs enabled, right?
Comment 3 Amaury Pouly 2022-03-02 23:37:14 UTC
(In reply to Albert Astals Cid from comment #2)
> I understand you don't have tabs enabled, right?

Yes that's correct.
Comment 4 Albert Astals Cid 2022-03-02 23:55:56 UTC
I just tried with 32 okular instances open and the canOpenDocs elapsed time for all the instances when opening a new one never took more than 75ms (tried 20 times or so), which makes sense, canOpenDocs should be very fast since it do much.

When you say "open a new document", how exactly are you opening it?

Any peculiarity of your system that you think may cause this? Remote NFS file system? weird libc? weird dbus rules?
Comment 5 Amaury Pouly 2022-03-03 00:04:59 UTC
(In reply to Albert Astals Cid from comment #4)
> I just tried with 32 okular instances open and the canOpenDocs elapsed time
> for all the instances when opening a new one never took more than 75ms
> (tried 20 times or so), which makes sense, canOpenDocs should be very fast
> since it do much.
> 
> When you say "open a new document", how exactly are you opening it?
> 
> Any peculiarity of your system that you think may cause this? Remote NFS
> file system? weird libc? weird dbus rules?

I usually either open it using Dolphin or using Zotero, but it can also happen just when switching between virtual desktops and not opening anything (less common though). Everything is local (not NFS), nothing particularly custom, no special dbus rules.

When it happens, switching between virtual desktops is a often a trigger. For example:
1) open file A.pdf => fast
2) switch to other virtual desktop
3) open file A.pdf => slow
4) close it, open A.pdf => fast
5) switch to other virtual desktop
6) open A.pdf => slow
Comment 6 Albert Astals Cid 2022-03-04 23:31:12 UTC
But if it happens just by switching virtual desktops, it has nothing to do with dbus?

dbus is not involved in that scenario that i can see
Comment 7 Amaury Pouly 2022-03-08 01:54:19 UTC
(In reply to Albert Astals Cid from comment #6)
> But if it happens just by switching virtual desktops, it has nothing to do
> with dbus?
> 
> dbus is not involved in that scenario that i can see

That's a good point.
Do you have any advice on how I could track down this problem?
Comment 8 Albert Astals Cid 2022-03-08 23:03:45 UTC
Personally i don't :/
Comment 9 Amaury Pouly 2022-04-11 16:40:55 UTC
Hi, I have made progress on this (I think)!
By running okular in a debugger and stopping the program when it freezes, I have noticed that almost every time the backtrace is as follows (slightly abbreviated):

#0  0x00007ffff6231be5 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7fffffffd0f0, rem=0x7fffffffd0f0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:43
#1  0x00007ffff6236ca3 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x00007ffff686ca5d in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#3  0x00007ffff674b01f in QLockFile::tryLock(int) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#4  0x00007ffff79adc59 in KConfigIniBackend::lock (this=0x5555556e12b0) at ./src/core/kconfigini.cpp:632
#5  0x00007ffff7998796 in KConfigPrivate::lockLocal (this=0x5555556e2a90) at ./src/core/kconfig.cpp:123
#6  KConfig::sync (this=0x5555556e2b50) at ./src/core/kconfig.cpp:422
#7  0x00007ffff7c97c0e in KMainWindow::saveAutoSaveSettings (this=<optimized out>) at ./src/kmainwindow.cpp:882
#8  0x00007ffff684c123 in ?? () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#9  0x00007ffff684ffca in QTimer::timeout(QTimer::QPrivateSignal) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5

Recall that I have many (20+) instances of okular running so my theory is now that there might be a lot of contention on the config file lock. This can be easily checked by running (in my case):
    inotifywait -m -r /home/pamaury/.config/
and observe that okular is *very very frequently* taking a lock okularrc.lock to write okularrc.
Actually, this is slightly scary: just switching between virtual desktop will trigger a flurry of creation/deletion of this file.
I cannot explain yet *how* this can possibly lock for 10 seconds however.
Comment 10 Amaury Pouly 2022-04-11 21:48:52 UTC
(In reply to Amaury Pouly from comment #9)
> I cannot explain yet *how* this can possibly lock for 10 seconds however.
I think this behaviour comes from QLockFile: https://github.com/RSATom/Qt/blob/master/qtbase/src/corelib/io/qlockfile.cpp
In QLockFile::tryLock(), the sleep time is doubled on every failure, up to 5 seconds. So if all okular instances rush to lock the file, it is actually possible that this happens.