Bug 452627

Summary: The notification system can become corrupted - notifications fail to update
Product: [Frameworks and Libraries] frameworks-knotifications Reporter: Michael Hamilton <michael>
Component: general Assignee: kdelibs bugs <kdelibs-bugs-null>
Status: RESOLVED FIXED    
Severity: normal CC: kde, kdelibs-bugs-null, nate
Priority: NOR    
Version First Reported In: 5.93.0   
Target Milestone: ---   
Platform: openSUSE   
OS: Linux   
See Also: https://bugs.kde.org/show_bug.cgi?id=452732
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: Notifications test bash script
Test python script, creates a notification and updates it twice.

Description Michael Hamilton 2022-04-14 21:59:54 UTC
Created attachment 148165 [details]
Notifications test bash script

SUMMARY
The notification system can become corrupted.  

I'm not exactly sure how.  Possibly by raising too many notifications at the same time.  Possibly by updating a notification too soon.  Possibly by using IDs that weren't generated by the notification system.  

Once corrupted, updating existing notifications does not work.  The initial notification displays and returns an ID, for example 61, but updating that ID results in the message "org.kde.plasma.notifications: Trying to replace notification with id 61 which doesn't exist, creating a new one. This is an application bug!".  A new notification appears above the original one and returns the same ID, and this new notification can then be successfully updated in place.  So I see two notifications on the screen, and only the second one updates.

Once corrupted, the only way I know to reset notifications is to log out and log in.  After that, updates work as expected: the initial notification returns an ID, and updates to that ID change the initial notification.  

STEPS TO REPRODUCE
1. Unsure exactly how to force the system into an error state.
2. i)  Raise a notification and record its ID
    ii) Run attached shell script with starting ID 0, and two or more repeats, for example:
        % bash test_notifications.sh 0 3
3. Check whether you see only one notification and whether it is updated the correct number of times; if not, the fault has
    occurred
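
The create-then-update loop in step 2 can also be sketched in Python. This is a minimal sketch, not the attached script: the function names `session_notify` and `create_and_update`, the application name "notify-test", and the summary/body strings are assumptions made for illustration. It assumes the dbus-python package and the standard org.freedesktop.Notifications Notify method, whose second argument is the ID to replace (0 means "create a new notification").

```python
import time

BUS_NAME = "org.freedesktop.Notifications"
OBJECT_PATH = "/org/freedesktop/Notifications"


def session_notify():
    """Return a callable wrapping the Notify D-Bus method (needs dbus-python)."""
    import dbus  # imported lazily so the loop below can be exercised without D-Bus

    proxy = dbus.SessionBus().get_object(BUS_NAME, OBJECT_PATH)
    iface = dbus.Interface(proxy, BUS_NAME)

    def notify(replaces_id, summary, body):
        # Notify arguments: app_name, replaces_id, app_icon, summary, body,
        # actions, hints, expire_timeout (-1 leaves the timeout to the server).
        return int(iface.Notify(
            "notify-test",                        # hypothetical application name
            dbus.UInt32(replaces_id),
            "",
            summary,
            body,
            dbus.Array([], signature="s"),
            dbus.Dictionary({}, signature="sv"),
            -1,
        ))

    return notify


def create_and_update(notify, start_id=0, repeats=3, delay=0.0):
    """Send one notification, then replace it `repeats` times.

    Returns the list of IDs handed back by each call, in order.
    """
    ids = [notify(start_id, "Test notification", "initial")]
    for n in range(1, repeats + 1):
        time.sleep(delay)
        # Each update passes the previously returned ID as replaces_id.
        ids.append(notify(ids[-1], "Test notification", f"update {n}"))
    return ids
```

On a live session this could be run as `create_and_update(session_notify(), 0, 3, delay=1.0)`. The returned IDs alone may not reveal the fault, since in the corrupted state the duplicate notification is reported to come back with the same ID, so the check in step 3 remains a visual one.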

OBSERVED RESULT
When the notification system is corrupted, you will see two notifications: the initial one, and one for the first update.  The one for the first update will actually update for the rest of the iterations.

EXPECTED RESULT
I would always expect to see one notification and it would update for the set number of iterations.


SOFTWARE/OS VERSIONS
Linux/KDE Plasma: openSUSE Tumbleweed 20220411
KDE Plasma Version: 5.24.4
KDE Frameworks Version: 5.93.0
Qt Version: 5.15.2

ADDITIONAL INFORMATION
Can the notification subsystem be reset/restarted without logging out?  That would make testing much easier.
Comment 1 Michael Hamilton 2022-04-14 22:08:24 UTC
Created attachment 148166 [details]
Test python script, creates a notification and updates it twice.

This Python script tests the same error, but via the Python dbus library.
Comment 2 Kai Uwe Broulik 2022-04-18 18:23:08 UTC
Thanks for your effort! You may restart plasmashell (Type "plasmashell --replace" into krunner) to reload the notification system.

I wonder if there could be a race between the notification being added to the model and us returning the ID. But it's single-threaded, so the signal about the notification is emitted, the notification added, and only then the function returns the ID.
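
The ordering argument can be illustrated with a toy in-memory model. This is only a sketch of the reasoning, not Plasma's code: the class `ToyNotificationServer` and its attributes are hypothetical, and the handling of an unknown ID simply mirrors what the report describes (a warning, a new notification, the same ID returned).

```python
class ToyNotificationServer:
    """Hypothetical single-threaded stand-in for a notification server."""

    def __init__(self):
        self._next_id = 1
        self.model = {}      # id -> summary of each visible notification
        self.warnings = []

    def notify(self, summary, replaces_id=0):
        if replaces_id and replaces_id in self.model:
            # Normal update: the existing entry is changed in place.
            self.model[replaces_id] = summary
            return replaces_id
        if replaces_id:
            # Unknown ID: warn and create a new entry under the requested ID,
            # matching the behaviour described in the report.
            self.warnings.append(
                f"Trying to replace notification with id {replaces_id} "
                "which doesn't exist, creating a new one."
            )
            new_id = replaces_id
        else:
            new_id = self._next_id
        self._next_id = max(self._next_id, new_id + 1)
        # The entry is added to the model before the ID is returned, so a
        # caller that only learns the ID from the return value cannot ask
        # for a replacement before the entry exists.
        self.model[new_id] = summary
        return new_id
```

In this model a replace that follows the return of the ID always finds the entry, which is why a simple race between adding to the model and returning the ID looks unlikely in single-threaded code.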

Does the notification system completely break once in corrupt state? i.e. will replacing any notification fail after that or just once or just specific ones?
Comment 3 Michael Hamilton 2022-04-18 23:08:57 UTC
(In reply to Kai Uwe Broulik from comment #2)
> Thanks for your effort! You may restart plasmashell (Type "plasmashell
> --replace" into krunner) to reload the notification system.

Thanks, that makes dealing with this much easier.

> I wonder if there could be a race between the notification being added to
> the model and us returning the ID. But it's single-threaded, so the signal
> about the notification is emitted, the notification added, and only then the
> function returns the ID.

Some kind of race might play a role.  My desktop raises quite a few notifications.  I use the notification system to keep me alerted to background activity and email filtering.

I've written an application to filter the journal and raise notifications (https://github.com/digitaltrails/jouno).  That can raise bursts of notifications, but it does not update them.  

I've written another one to watch for CPU and memory hogs and raise notifications when a process exceeds specified limits (https://github.com/digitaltrails/procno).  I recently added the ability to periodically update notifications with ongoing consumption.  Testing this new feature led me to raise this bug.  I also added notification actions; if they are enabled at the same time as notification updates, similar problems occur.  It could be the same issue, a misassociation of IDs, causing actions not to fire.  

Both of the above applications are self-contained scripts with fairly basic PyQt and D-Bus dependencies, so you're welcome to use them for testing should they be of any help.

> 
> Does the notification system completely break once in corrupt state? i.e.
> will replacing any notification fail after that or just once or just
> specific ones?

Once it's gone off the rails, it seems to stay broken, and the same pattern recurs: two notifications instead of one, with the second one updating.  Less commonly, I think the second notification may not appear at all.
Comment 4 Michael Hamilton 2023-02-26 06:40:49 UTC
This could be closed as fixed.  The notification system has been working well since the later part of 2022; I presume improvements unrelated to this bug report have resolved the issue.