Bug 395925

Summary: gwenview main menu broken
Product: [Applications] gwenview Reporter: Duncan <1i5t5.duncan>
Component: generalAssignee: Gwenview Bugs <gwenview-bugs-null>
Status: RESOLVED FIXED    
Severity: normal CC: kde, kossebau, nate, null
Priority: NOR    
Version: Git (add output of "git log -1 --oneline" to description)   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In: 18.04.3
Sentry Crash Report:
Attachments: Hack-around partial-revert patch
Updated hack-around partial-revert patch

Description Duncan 2018-06-27 12:04:27 UTC
git master of pretty much anything kde, including gwenview, frameworks, plasma, etc, built from the gentoo/kde overlay, which has the live-git kde ebuild versions.  For gwenview, currently at ef1244b1c .

Gwenview's main menu has been broken for some months now, leading to a delay of some time while launching gwenview, apparently waiting for some timeout.

I don't use the menu a lot so it took me awhile to figure out what the problem was, but a few days ago I straced a gwenview's startup and it stopped for a long time at some menu related stracing output.  After it did finally show the gwenview window, I tried the menu (now launched from a button in the titlebar), and got no response -- no menu showed, tho that wasn't surprising after I had seen where the strace output stopped as gwenview started.  I've noticed no other apps with menu issues, only gwenview.

With further experimenting, I discovered that if I took the menu button off the titlebar in kde system settings, windecos, buttons tab, then started gwenview, which of course wouldn't have a menu button in the titlebar at that point and would start normally, /then/ added the titlebar menu button back in kde system settings, /then/ gwenview would have a menu.  Further, quitting gwenview and restarting it once it had the menu, it would start fast and have the menu each time... until I quit and restarted kde/plasma, after which gwenview would take a long time to startup again, and the menu button wouldn't work.  However, just toggling the windeco menu button off, applying, and back on, applying, without starting gwenview while the windeco menu button was turned off, would not be enough to get gwenview showing its menu again -- I had to actually start gwenview with the menu button disabled, then add it again, for gwenview to actually have a menu and startup fast again for the rest of that plasma session.

So tonight I started trying to bisect when the problem appeared.  I've not finished the bisect yet, but as of the d0d97b8e1 merge of the 18.04 branch on April 8, the problem was already happening, while if I go back to the 2522e7e7e merge of 17.12 back on Feb 27, the menu was still working fine.

Since I can revert back to Feb 27's state and get a working menu, the problem has been demonstrated to be in gwenview's code since then, so I thought I'd file this bug before finishing the bisect, on the off change a gwenview dev both had a good idea what might have caused it and had time to work on it before I finished the bisect.  I'll continue the bisect and file an update as I have time.  (I should be sleeping as I have work in a few hours.)
Comment 1 Duncan 2018-06-27 12:43:31 UTC
e41d0a9b5, merge 17.12 on March 2, is good.
785281b76, March 19 and master side of merge c6e514eef branch 18.04 on March 20, is bad.

So the culprit is in the series of master-side commits between e41d0a9b5 on March 2 and 785281b76 on March 19.
Comment 2 Duncan 2018-06-27 13:26:38 UTC
And the culprit is...

9631043c1
March 2, 2018
Expose slideshow to MPRIS controllers

8bd2f625c, the previous commit, is fine.
Comment 3 Duncan 2018-06-27 13:39:05 UTC
FWIW... I'm not a dev, so while I can bisect and read the source of a culprit commit, it doesn't necessarily tell me what might be wrong like it might to a dev.

But it may be worth noting that I have (gentoo/kde's) semantic-desktop USE flag turned off.  That means no baloo or kfilemetadata here.  Related?

And based on the qdbus in the commit code, suspecting the version of it I'm running might make a difference.  It's qdbus-5.11.0_rc2 (qt 5.11-rc2 being the latest available in the gentoo tree, satisfying the > 5.10 dep of parts of plasma now, as there's no qt-5.10 in the gentoo tree).
Comment 4 Nate Graham 2018-06-27 17:25:41 UTC
How odd, thanks for the bisecting!

CC'ing Friedrich, who implemented this feature.
Comment 5 Friedrich W. H. Kossebau 2018-06-27 19:57:32 UTC
No instant clue. On first look seems somehow the D-Bus usage of the MPRIS service (together with the D-Bus connection to the ScreenSaver service) gets in conflict with the D-Bus connection used for the app main menu integration.

Adding Kai: you might have the needed experience and thus idea here.

Gwenview does any D-Bus registration at this point in the startup:

#0  Gwenview::Mpris2Service::Mpris2Service (this=0xa673d0, slideShow=0x796310, contextManager=0x791940, toggleSlideShowAction=0xae8bb0, fullScreenAction=0xb04470, 
    previousAction=0xbc7fc0, nextAction=0xc274f0, parent=0x844af0) at /home/koder/Kode/kdegit/kf5/kde/kdegraphics/gwenview/lib/mpris2/mpris2service.cpp:44
#1  0x0000000000468feb in Gwenview::MainWindow::MainWindow (this=0x844af0, __in_chrg=<optimized out>, __vtt_parm=<optimized out>)
    at /home/koder/Kode/kdegit/kf5/kde/kdegraphics/gwenview/app/mainwindow.cpp:856
#2  0x0000000000468324 in StartHelper::createMainWindow (this=0x7fffffffca70) at /home/koder/Kode/kdegit/kf5/kde/kdegraphics/gwenview/app/main.cpp:94
#3  0x0000000000467da6 in main (argc=1, argv=0x7fffffffccd8) at /home/koder/Kode/kdegit/kf5/kde/kdegraphics/gwenview/app/main.cpp:159
Comment 6 null 2018-06-28 17:55:01 UTC
@Nate, @Friedrich: Are you able to reproduce, and if so, how?

> gwenview would take a long time to startup again
@Duncan: Could you quantify how long regular and slow startup is taking for you each, i.e. barely noticable, seconds, or minutes?
Comment 7 Nate Graham 2018-06-28 17:55:51 UTC
I can't reproduce, FWIW.
Comment 8 Duncan 2018-06-29 02:18:56 UTC
(In reply to Henrik Fehlauer from comment #6)
> > gwenview would take a long time to startup again
> @Duncan: Could you quantify how long regular and slow startup is taking for
> you each, i.e. barely noticable, seconds, or minutes?

Note that I'm on (btrfs on) ssd, so even cold-cache startups /should/ be fast.

Regular startup: Under 1 second.  I suspect the kwin slide-in effect is actually delaying the window appearance.  

(I have gwenview's clear-thumbnails on exit option set, and actually have the thumbnails dir pointed to tmpfs, so from cold-cache it does take a slight bit of time to generate the thumbnails on the start-page's recent-folders tab, but the dir icons themselves show up effectively instantly and the thumbnails on them populate in under 10 seconds, I'd say.)

Slow startup: I've not timed it and perception is notoriously unreliable, but the window doesn't show up for long enough I can launch say kpat and start playing a game, before it shows up.

Stracing a slow-start it's quite clear there's some sort of timeout (later diagnosed as the menu load timeout) it waits for, as there's the usual rush of multiple pages of output as it reads all the libs/icons/fonts/config/etc, followed by a pause of I'd say at least 30 seconds, perhaps a minute or two (it was long enough I could select some strace output so I could scroll back to it after the window /did/ show to see where things stopped and notice it was menu related, the clue I needed to test and see that the menu button in the titlebar wasn't responsive), while nothing at all is printed, before it resumes printing more pages of output as the window appears and it regenerates thumbnails for the start page.
Comment 9 Duncan 2018-06-29 02:40:49 UTC
Created attachment 113637 [details]
Hack-around partial-revert patch

Once I bisected to a single commit I had a patch I could try to revert.  But based on the error when I tried there were a couple files that have further changed in that area since then and the revert wouldn't cleanly apply.

So I semi-blind (as I'm not a dev) deleted the chunks that added (and the revert would have deleted) the new files, along with the two conflicting chunks (lib/slideshow.(cpp|h)), and tried that, not even knowing whether gwenview at head would build then, let alone run, but it did.

And it indeed did eliminate the problem -- I have my gwenview menus back, on (yesterday's) current gwenview, with the partial revert, of course.

Here's that semi-blind hack-around partial revert patch.  While I don't expect it to help much in getting a proper fix, it does confirm that commit as the culprit, and lets you see what I did to get a menu on otherwise-current gwenview back.

I can test debugging or alternative patches too, if needed.  Gwenview rebuilds fast, especially with a hot ccache. =:^)
Comment 10 Duncan 2018-07-08 09:46:21 UTC
Created attachment 113829 [details]
Updated hack-around partial-revert patch

Only the mainwindow.cpp mainwindow::mainwindow chunk of the original commit/patch needs reverted
Comment 11 Duncan 2018-07-08 09:55:12 UTC
(In reply to Duncan from comment #8)
> (In reply to Henrik Fehlauer from comment #6)
> > @Duncan: Could you quantify how long regular and slow startup is taking for
> > you each, i.e. barely noticable, seconds, or minutes?
> 
> Regular startup: Under 1 second.

> Slow startup: I've not timed it 

I've (rough-)timed it now.  25 seconds, +/-2, to window appearance.

As I said I'm on ssd and used to near-instant, so it's definitely long enough to get bored and fire up a kpat or ksuduko to play while I wait, tho of course I don't /finish/ a game before it appears.
Comment 12 Duncan 2018-07-08 10:34:42 UTC
(In reply to Duncan from comment #3)
> And based on the qdbus in the commit code, suspecting the version of it I'm
> running might make a difference.  It's qdbus-5.11.0_rc2 (qt 5.11-rc2 being
> the latest available in the gentoo tree, satisfying the > 5.10 dep of parts
> of plasma now, as there's no qt-5.10 in the gentoo tree).

Updated to qt-5.11.1 now.  Didn't change things.

Further notes:

Based on a the menu timeout stalling the creation of the window and a hunch I tried simply patch-moving the mpris setup call to /after/ the createGUI() and loadConfig() calls instead of deleting it as in the current hack-around patch... and gwenview loaded fine, its menu worked, etc, with no ill effects that I could see (tho the media keys didn't work in the slide-show, but then they never have for me so that's not a change).

I see both the kipi and osx ifdefs are after createGUI and loadConfig too.  If the mpris ifdef could likewise be moved lower in the function, and still work to setup mpris, that does seem to solve the menu problem I'm seeing too.


mpris?  Perhaps this requires a component that I don't have installed, that's not tested for and optionally required to enable this functionality, simply assumed to be installed?  Dumb non-dev speculation, but maybe it's actually an mpris dbus call that's not getting a response, because whatever it's trying to call isn't installed, and that's blocking the menu setup dbus call so it doesn't get a response either?

If so and you guys have that mystery mpris component installed, that could explain why you're not duplicating.
Comment 13 null 2018-07-08 11:46:15 UTC
Thanks for the continued investigation, Duncan, that's really helpful. Awesome solution, I'll create a Diff on Phabricator tonight.

Could you try one more thing, so we know a bit better whether the problem is a real fix or more of a workaround for another problem? As you have plenty of time on a slow/broken startup, please attach GDB to the process, interrupt execution and get a backtrace:
$ gdb gwenview |& tee backtrace.log
$ r
- press Ctrl-C while Gwenview is stuck (but after loading debug symbols finished)
$ thread apply all bt
- attach the backtrace to the bug

----

As for your patches, both are basically reverting the whole feature:
- First patch: You don't set HAVE_QTDBUS in CMakeLists.txt anymore, so the rest of the MPRIS integration won't even get compiled.
- Second patch: You skip creating Mpris2Service, which would otherwise be the main entry point for everything else of the MPRIS integration.

----

Regarding MPRIS: That's a DBus service provided by Gwenview (org.mpris.MediaPlayer2.Gwenview), which other parts of the infrastructure can call into because it uses a standardized interface (org.mpris.MediaPlayer2). IOW it isn't a separate dependency to install. For example Plasma's mediaplayer applet can list all services providing that interface and call the forward/play/pause methods. Same thing for KDE Connect.

I think even with the patch we should investigate why your media keys are not working:
- Can you see Gwenview's DBus service in "qdbusviewer"?
- Can you change to the next picture with "dbus-send --session --dest=org.mpris.MediaPlayer2.Gwenview --type=method_call /org/mpris/MediaPlayer2 org.mpris.MediaPlayer2.Player.Next"?
- Can you see other services in "qdbusviewer" implementing the interface, e.g. when running VLC?
- Can you control VLC with Plasma's mediaplayer applet?
- What does your global shortcut configuration in Systemsettings (Media Controller component) look like?
Comment 14 null 2018-07-08 22:33:14 UTC
Diff: https://phabricator.kde.org/D13995
Comment 15 null 2018-07-09 19:51:02 UTC
Git commit 07a2e7f9eccd92c19da4ef269f453d61a44b6846 by Henrik Fehlauer.
Committed on 09/07/2018 at 19:50.
Pushed by rkflx into branch 'Applications/18.04'.

Fix external application menu occasionally slowing down startup

Summary:
If moving the menubar entries to a button in the window decoration is
enabled, the bug reporter experiences delays of up to 25 seconds in the
startup of Gwenview.

Bisecting points to 9631043c1, which added MPRIS support through
providing a D-Bus interface.

While the exact nature of the problem remains unclear due to not being
reproducible outside of the bug reporter's environment, moving the
initialization of the MPRIS support after `createGUI` is reported to
correct the issue. This may be due to avoiding conflicts or races with
the D-Bus connection, which is used for the MPRIS service, the
ScreenSaver service and the application menu integration.

Thanks to Duncan for reporting the bug and coming up with a fix.
FIXED-IN: 18.04.3

Test Plan:
`dbus-send --session --dest=org.mpris.MediaPlayer2.Gwenview --type=method_call /org/mpris/MediaPlayer2 org.mpris.MediaPlayer2.Player.Next` still moves to the next image in Gwenview.
The external application menu is still working fine.

Reviewers: #gwenview, kossebau

Subscribers: broulik

Differential Revision: https://phabricator.kde.org/D13995

M  +6    -6    app/mainwindow.cpp

https://commits.kde.org/gwenview/07a2e7f9eccd92c19da4ef269f453d61a44b6846