Bug 456511

Summary: VLC and Firefox freeze / stop updating their window contents after being used for a while (BadDamage)
Product: [Plasma] kwin
Reporter: Fushan Wen <qydwhotmail>
Component: compositing
Assignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED FIXED
Severity: grave
CC: adeptsmail, alexandernst, arraybolt3, bizyaev, dave, DianaNites, edzis, felixonmars, hyunkang2019, jessica, karemjaleel34, kde, l12436.tw, MarcMiltenberger, miranda, mundanedefoliation, nate, saucesfm, sephiroth_pk, stransky, svoboa, totpet94, twilbyte, tynach2, uwu, watisthispoo, wolftune, xaver.hugl
Priority: NOR
Keywords: regression
Version: master
Flags: uwu: Wayland+, qydwhotmail: X11+
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
See Also: https://bugs.kde.org/show_bug.cgi?id=429211
https://bugs.kde.org/show_bug.cgi?id=449301
https://bugs.kde.org/show_bug.cgi?id=366651
Latest Commit:
Version Fixed In: 5.26
Sentry Crash Report:
Attachments: regression.mp4
Same bug when it happens to a Plasma panel

Description Fushan Wen 2022-07-09 13:12:26 UTC
Created attachment 150500 [details]
regression.mp4

SUMMARY
Firefox (v103b5) and VLC (v3.0.17.4) both stop updating the window contents after being used for a while. Same happens on the panel.


STEPS TO REPRODUCE
1. Normally browse the web and do other things
2. Firefox freezes, and the window content is only updated after changing the window size

OBSERVED RESULT
Freeze

EXPECTED RESULT
No freeze

SOFTWARE/OS VERSIONS
Operating System: openSUSE Tumbleweed 20220705
KDE Plasma Version: 5.25.80
KDE Frameworks Version: 5.96.0
Qt Version: 5.15.5
Kernel Version: 5.18.6-1-default (64-bit)
Graphics Platform: X11
Processors: 8 × AMD Ryzen 7 4700U with Radeon Graphics
Memory: 15.0 GiB of RAM
Graphics Processor: AMD RENOIR
Manufacturer: HP
Product Name: HP ZHAN 66 Pro A 14 G3

ADDITIONAL INFORMATION
This appears to be a regression introduced within the past month.
Comment 1 Kareem 2022-07-12 17:12:15 UTC
Same bug here; it happens with all apps (X11, AMD).
It started on 5.25.
Comment 2 Yuriy Vidineev 2022-07-28 21:43:01 UTC
I'm in the same boat. I'm almost certain it started on 5.25

KDE Neon 5.25 User Edition
Linux 5.15.0-41-generic
I've tried on Intel and NVidia graphics - same effect.
It happens fairly often (a couple of times a day) in Firefox and Thunderbird. Google Chrome also behaves weirdly. Restarting the app helps until the next freeze. A window resize also refreshes the content.
Comment 3 Nate Graham 2022-07-29 02:50:48 UTC
I've been able to reproduce what I *think* is the same issue when I maximize VLC or Firefox on X11. Only those two apps, and only on X11. A window resize refreshes the content, as others have mentioned. Can others reproduce this?
Comment 4 Colin Griffith 2022-08-02 00:59:31 UTC
(In reply to Nate Graham from comment #3)
> I've been able to reproduce what I *think* is the same issue when I maximize
> VLC or Firefox on X11. Only those two apps, and only on X11. A window resize
> refreshes the content, as others have mentioned. Can others reproduce this?

As I mentioned in bug 431446, this also freezes Plasma for me, and all apps. Strange that for you it only happens in those two programs, and only when they are fullscreen. Maybe it's two similar but distinct bugs?
Comment 5 Jessica M 2022-08-03 21:15:17 UTC
This bug is happening to me on an Nvidia machine and an Intel machine. It seems to happen while I'm playing random video games and frequently Alt-Tabbing to Firefox or JetBrains Rider.
Comment 6 David Edmundson 2022-08-17 14:15:47 UTC
This seems reproducible when toggling compositing. Contents are garbage when we suspend compositing, and still broken when we re-enable.
Comment 7 abigail 2022-08-19 14:37:06 UTC
This also happens to me randomly in Steam.
Running Arch Linux (kernel 5.19.1-arch2-1) and on Plasma 5.25.4.
Comment 8 Bailey 2022-08-29 01:55:19 UTC
Same bug here: KDE Plasma 5.25.4, Arch Linux (5.19.4-arch1-1), X11, Intel
Comment 9 Celeste Liu 2022-08-29 06:30:13 UTC
Same bug here, 
Plasma 5.25.4, Wayland, 
GPU: AMD RENOIR (LLVM 14.0.6, DRM 3.47, 5.19.4-zen1-1-zen) in AMD 5800H, Mesa 22.1.7.0, 
Firefox nightly 106.0a1.20220827.19, 
Arch Linux
Comment 10 Dave Lane 2022-09-06 09:16:02 UTC
I seem to be experiencing the same problem. I have Firefox running with several windows and lots of tabs (hundreds), using the vertical "tree tabs" extension. After a while, sometimes many hours, a FF window starts flickering between two static frames, especially on playing videos or video-conference video (e.g. BigBlueButton), and all the FF windows stop updating unless I move or resize the window, after which it redraws just once before freezing again. I'm running KDE Neon 20.04 on an AMD CPU desktop machine (with X11, AMD graphics, and the 5.19.5 kernel). This problem has also happened with other kernels. I haven't tried a non-Plasma desktop for a while, so I can't remember it happening on one, but I might be wrong.
Comment 11 Dave Lane 2022-09-06 09:19:12 UTC
Oops - and I should say I'm running Plasma 5.25.4 and KDE Frameworks 5.97.0
Comment 12 Nate Graham 2022-09-09 17:37:53 UTC
*** Bug 431446 has been marked as a duplicate of this bug. ***
Comment 13 Ilya Bizyaev 2022-09-10 17:01:40 UTC
Created attachment 151963 [details]
Same bug when it happens to a Plasma panel
Comment 14 Ilya Bizyaev 2022-09-10 17:02:39 UTC
Same here.

Operating System: openSUSE Leap 15.4
KDE Plasma Version: 5.25.5
KDE Frameworks Version: 5.97.0
Qt Version: 5.15.5
Kernel Version: 5.14.21-150400.24.18-default (64-bit)
Graphics Platform: X11
Comment 15 Aaron Wolf 2022-09-11 21:07:36 UTC
This looks like it may be a duplicate of https://bugs.kde.org/show_bug.cgi?id=449301
Comment 16 Fushan Wen 2022-09-12 11:50:55 UTC
See also https://gitlab.freedesktop.org/xorg/lib/libxcb/-/issues/51
Comment 17 Aaron Rainbolt 2022-09-14 17:18:05 UTC
This *might* be Qt related - I'm experiencing a similar problem on Lubuntu 22.04.1 with LXQt and Openbox on one system over here. Yes, I know that's not KDE, but KDE and LXQt both use Qt, so perhaps the problem is there? Might be totally unrelated, just throwing that out there in case it turns out to be helpful.
Comment 18 Vlad Zahorodnii 2022-09-15 07:37:52 UTC
(In reply to Aaron Rainbolt from comment #17)
> This *might* be Qt related - I'm experiencing a similar problem on Lubuntu
> 22.04.1 with LXQt and Openbox on one system over here. Yes, I know that's
> not KDE, but KDE and LXQt both use Qt, so perhaps the problem is there?
> Might be totally unrelated, just throwing that out there in case it turns
> out to be helpful.

VLC uses Qt, but FF does not. What compositing manager do you use?
Comment 19 Aaron Rainbolt 2022-09-15 07:48:38 UTC
(In reply to Vlad Zahorodnii from comment #18)
> (In reply to Aaron Rainbolt from comment #17)
> > This *might* be Qt related - I'm experiencing a similar problem on Lubuntu
> > 22.04.1 with LXQt and Openbox on one system over here. Yes, I know that's
> > not KDE, but KDE and LXQt both use Qt, so perhaps the problem is there?
> > Might be totally unrelated, just throwing that out there in case it turns
> > out to be helpful.
> 
> VLC uses Qt, but FF does not. What compositing manager do you use?

I don't think Lubuntu uses any compositor at all by default. The WM is Openbox, Compton *can* be enabled but usually isn't. VLC isn't used much on this system and I've not experienced any problems with it, but Firefox is used quite frequently and this behavior happens somewhat frequently with it.
Comment 20 Nate Graham 2022-09-15 16:50:35 UTC
*** Bug 459138 has been marked as a duplicate of this bug. ***
Comment 21 Felix Yan 2022-09-26 09:05:49 UTC
I get this on both x11 (with either modesetting ddx or intel ddx) and wayland (with MOZ_ENABLE_WAYLAND=1), and switching windows often updates its content.

There are subtle differences though, which may indicate that multiple issues are involved here:

- On X11, sometimes when tab content is not being updated, a big rectangle area (centered, probably 1920x1080 out of 2560x1440) is affected and frozen but the space around it is still being updated correctly (like scrolling the webpage). Switching windows only updates the whole tab once.
- On Wayland, the tab content is often correctly updated all the time, but switching tabs doesn't work and the old tab is still operational. Switching windows will make the tab switching actually happen, for once.

Plasma 5.25.5
KDE Frameworks 5.98.0
Qt 5.15.6+kde+r177
Linux 5.19.10
Comment 22 Fushan Wen 2022-09-26 09:17:38 UTC
If the bug can also be seen on Wayland, it may be a new bug in Firefox. CC stransky
Comment 23 Vlad Zahorodnii 2022-09-26 09:20:00 UTC
(In reply to Felix Yan from comment #21)
> I get this on both x11 (with either modesetting ddx or intel ddx) and
> wayland (with MOZ_ENABLE_WAYLAND=1), and switching windows often updates its
> content.

I personally have never seen freezing on Wayland. On X11, can you check whether kwin prints BadIdChoice errors in its logs (journalctl --user-unit plasma-kwin_x11) when Firefox starts freezing?
Comment 24 Martin Stransky 2022-09-26 09:27:40 UTC
It may be caused on the Firefox side by a depleted file descriptor pool - Firefox fails to allocate new buffers to draw into, so a transparent window is provided. Check the journal to see if there are any error messages.
Comment 25 Felix Yan 2022-09-26 09:30:35 UTC
(In reply to Vlad Zahorodnii from comment #23)
> On X11, can you check
> whether kwin prints BadIdChoice errors in its logs (journalctl --user-unit
> plasma-kwin_x11) when firefox starts freezing

I have searched the journal since April and got no BadIdChoice results. The log is flooded with BadWindow/BadDamage/BadDrawable though.
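
The journal check discussed in comments 23 and 25 could be automated roughly as follows. This is only a sketch: it assumes the kwin log has already been saved as plain text (for example from the journalctl command quoted above), the helper name count_x_errors is made up for illustration, and the sample lines are invented rather than real kwin output. The spelling of an error name in real logs may also differ from the one used in this thread.

```python
import re
from collections import Counter

# X11 error names start with "Bad" followed by a capitalized word,
# e.g. BadWindow, BadDrawable, BadDamage.
X_ERROR_NAME = re.compile(r"\bBad[A-Z][A-Za-z]*\b")


def count_x_errors(log_text):
    """Return a Counter mapping each X error name to its number of occurrences."""
    return Counter(X_ERROR_NAME.findall(log_text))


# Invented sample lines; real kwin_x11 journal output is formatted differently.
sample_log = "\n".join([
    "example: X error BadWindow",
    "example: X error BadDamage",
    "example: X error BadDamage",
    "example: X error BadDrawable",
])

counts = count_x_errors(sample_log)
# A missing name simply counts as zero.
print(counts["BadIdChoice"], counts["BadDamage"])  # prints: 0 2
```

Counting every error name, rather than searching for a single one, makes it easier to see which errors dominate the log when the freeze occurs.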
Comment 26 Felix Yan 2022-09-26 09:37:18 UTC
(In reply to Martin Stransky from comment #24)
> It may be caused on Firefox side by depleted file descriptor poll - Firefox
> fails to allocate new buffers to draw into so transparent window is
> provided. Check journal if you see any error messages.

I have seen nothing strange from Firefox :(

Since reproducing this takes over 24 hours, I'll check again the next time it occurs.
Comment 27 Riccardo Robecchi 2022-09-26 09:50:23 UTC
I would tend to rule out a bug in Firefox, since it also happens with Slack, which is built on Electron and therefore Chromium.
Comment 28 Felix Yan 2022-09-26 10:00:03 UTC
Indeed, the Xorg-specific behavior (a rectangular region not being updated) can be reproduced with multiple applications here too, including Telegram Desktop (Qt6), VSCode (Electron), and virt-manager (GTK3).
Comment 29 Bug Janitor Service 2022-10-05 07:21:04 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/kwin/-/merge_requests/3022
Comment 30 Vlad Zahorodnii 2022-10-05 12:04:57 UTC
Git commit b3214db0b7aff89a82dbc4769f22c548f55863cf by Vlad Zahorodnii.
Committed on 05/10/2022 at 07:10.
Pushed by vladz into branch 'master'.

x11: Make damage region fetching code more robust to errors

With DamageReportNonEmpty damage report level, the x server will
send kwin a DamageNotify when the damage region changes from empty to
not empty.

The damage region will be made empty when SurfaceItemX11 calls
xcb_damage_subtract().

It appears like xcb_generate_id() can return us an already associated
XID, which eventually results in xcb_damage_subtract() failing and
breaking state tracking in SurfaceItemX11. KWin will no longer receive
DamageNotify events and some windows will freeze.

In order to make getting BadIdChoice less catastrophic, this change
makes the SurfaceItemX11 reset m_isDamaged after successfully fetching
the damage region. If xcb_generate_id() returns us a bad id, kwin will
try to fetch the damage again in the next frame.

M  +1    -1    src/surfaceitem_x11.cpp

https://invent.kde.org/plasma/kwin/commit/b3214db0b7aff89a82dbc4769f22c548f55863cf
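
The state-tracking problem described in the commit message above can be illustrated with a toy model. This is not KWin code: FakeXServer, SurfaceItem, simulate and the reset_after_success switch are hypothetical names, and the X server is reduced to a single "damage region is empty" flag. The sketch only assumes what the commit message states: a DamageNotify is sent on the empty to non-empty transition, and a failed subtract request leaves the region non-empty.

```python
class FakeXServer:
    """Toy model of the DamageReportNonEmpty report level (not a real X server)."""

    def __init__(self):
        self.region_empty = True
        self.listener = None

    def damage(self):
        # A notification is only sent on the empty -> non-empty transition.
        if self.region_empty:
            self.region_empty = False
            if self.listener:
                self.listener()

    def subtract(self, request_ok):
        # A failed request (e.g. one using a bad XID) leaves the region untouched.
        if request_ok:
            self.region_empty = True
        return request_ok


class SurfaceItem:
    """Hypothetical stand-in for the compositor-side damage tracking."""

    def __init__(self, server, reset_after_success):
        self.server = server
        self.reset_after_success = reset_after_success
        self.is_damaged = False
        self.repaints = 0
        server.listener = self.on_damage_notify

    def on_damage_notify(self):
        self.is_damaged = True

    def fetch_damage(self, request_ok=True):
        if not self.is_damaged:
            return
        if not self.reset_after_success:
            # Old behaviour in this model: clear the flag before the result is known.
            self.is_damaged = False
        if self.server.subtract(request_ok):
            # New behaviour in this model: clear the flag only after success.
            self.is_damaged = False
            self.repaints += 1


def simulate(reset_after_success):
    server = FakeXServer()
    item = SurfaceItem(server, reset_after_success)
    server.damage()
    item.fetch_damage(request_ok=False)  # first frame: the request fails
    for _ in range(3):
        server.damage()      # the client keeps drawing
        item.fetch_damage()  # later frames: requests would succeed
    return item.repaints


print(simulate(reset_after_success=False))  # prints: 0
print(simulate(reset_after_success=True))   # prints: 3
```

In the first run the flag is cleared while the server-side region is still non-empty, so no further notification ever arrives and the window stays frozen. In the second run the flag survives the failed request and the fetch is retried on the next frame.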
Comment 31 Nate Graham 2022-10-05 18:35:53 UTC
Git commit 062afee75bd61e225d72b89c2b5e413e61cd23d8 by Nate Graham, on behalf of Vlad Zahorodnii.
Committed on 05/10/2022 at 18:35.
Pushed by ngraham into branch 'Plasma/5.26'.

x11: Make damage region fetching code more robust to errors

With DamageReportNonEmpty damage report level, the x server will
send kwin a DamageNotify when the damage region changes from empty to
not empty.

The damage region will be made empty when SurfaceItemX11 calls
xcb_damage_subtract().

It appears like xcb_generate_id() can return us an already associated
XID, which eventually results in xcb_damage_subtract() failing and
breaking state tracking in SurfaceItemX11. KWin will no longer receive
DamageNotify events and some windows will freeze.

In order to make getting BadIdChoice less catastrophic, this change
makes the SurfaceItemX11 reset m_isDamaged after successfully fetching
the damage region. If xcb_generate_id() returns us a bad id, kwin will
try to fetch the damage again in the next frame.


(cherry picked from commit b3214db0b7aff89a82dbc4769f22c548f55863cf)

M  +1    -1    src/surfaceitem_x11.cpp

https://invent.kde.org/plasma/kwin/commit/062afee75bd61e225d72b89c2b5e413e61cd23d8
Comment 32 Eduardo 2022-10-06 20:58:22 UTC
Thanks for the patch. I applied it to version 5.25.5 of kwin; let's see whether it solves the issue.
Comment 33 Gauthier 2022-10-08 09:59:43 UTC
*** Bug 343661 has been marked as a duplicate of this bug. ***
Comment 34 Jessica M 2022-10-15 19:12:25 UTC
Updated to 5.26, this bug is still happening to me
Comment 35 Fushan Wen 2022-10-15 23:44:45 UTC
(In reply to Jessica M from comment #34)
> Updated to 5.26, this bug is still happening to me

I have been running 5.26 for a few days and haven't seen the bug.

Are you using NVIDIA?
Comment 36 Jessica M 2022-10-16 01:47:22 UTC
(In reply to Fushan Wen from comment #35)
> (In reply to Jessica M from comment #34)
> > Updated to 5.26, this bug is still happening to me
> 
> I have been running 5.26 for a few days and haven't seen the bug.
> 
> Are you using NVIDIA?

Yes
Comment 37 Hyuk 2022-10-16 07:21:23 UTC
same here
Comment 38 Dave Lane 2022-10-16 07:25:51 UTC
Since installing 5.26 a few days ago, I have not seen the issue. I've got an AMD graphics card (Radeon RX 550) and am running kernel 5.19.14-051914-generic.
Comment 39 Dave Lane 2022-10-16 07:27:03 UTC
(In reply to Dave Lane from comment #38)
> AMD graphics card (Radeon RX 550) and am running kernel
Oops - should be RX 5500
Comment 40 Fushan Wen 2022-10-16 08:13:53 UTC
*** Bug 449301 has been marked as a duplicate of this bug. ***
Comment 41 Fushan Wen 2022-10-16 08:17:03 UTC
(In reply to Jessica M from comment #36)
> Yes

If so it's likely a different bug because the fix is for BadDamage only. Please subscribe to https://bugs.kde.org/show_bug.cgi?id=429211 or file a new bug.
Comment 42 nttkde 2022-11-27 22:22:44 UTC
*** Bug 456608 has been marked as a duplicate of this bug. ***