Summary: | flickering | ||
---|---|---|---|
Product: | [Frameworks and Libraries] libplasma | Reporter: | pf |
Component: | libplasma | Assignee: | Plasma Bugs List <plasma-bugs> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | CC: | me, notmart |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Attachments: |
annotated journal
Second occurrence of screens flashing within a few hours. |
Description
pf
2022-09-06 02:44:55 UTC
Flickering got so bad this morning that I had to reboot. Once it starts, the issue gets progressively worse.
LibreOffice Calc: one document was flickering; another would not accept ANY input.
Firefox: popup tooltips (from the left menu bar on a gitlab page) flickering wildly.
Task bar with 2 (~3" wide) app icons: left icon not moving, right icon bouncing at high speed between its normal position and overlapping the left icon by ~1/3 -- as though the left icon's "reported width" kept changing while its visible width remained unchanged.
Individual taskbar icons flickering between colors (light blue and lighter blue).
There is so much flickering of different types that it's impossible to identify all the flickering modes. Flickering happens randomly at random locations on both screens. It is getting progressively worse, now affecting all apps; in the beginning, it was mainly Firefox, and usually after streaming video for a while.
Some responses in bug reports try to blame video hardware; but I'm seeing this on a Dell M6800 laptop and on a new Dell XPS 8950 (128GB RAM) with an AMD Radeon RX 6600 XT video card. Running X11, not Wayland.
This seems to have the earmarks of code segment(s) missing lock(s) to prevent being interrupted, with values getting changed by a parallel process...
Another variant just now: moving the mouse over the systray, then sliding across the various systray icons, each icon brings up a tooltip. The tooltips vary in size: one icon has a large tooltip (say ~2"x3"); the next icon the mouse moves over has a small one. However, the small tooltip is initially displayed at the size of the large one, with the font scaled up to fit (magnified); then it is scaled down to its normal size. Between the moment the small tooltip is first displayed at the size of the large one and the moment it reaches its normal small size, it gets caught in a flicker between what appears to be the large icon's tooltip and the small icon's tooltip.
This time the only way I could stop the flicker was to move the mouse to the secondary screen. Each time I try to get a video capture, moving the mouse to another container (see previous comments) usually stops it. Capturing with my phone may be the only way...
Operating System: Mageia 9
KDE Plasma Version: 5.25.4
KDE Frameworks Version: 5.97.0
Qt Version: 5.15.6
Kernel Version: 5.19.8-server-1.mga9 (64-bit)
Graphics Platform: X11
Processors: 20 × 12th Gen Intel® Core™ i7-12700K
Memory: 125.5 GiB of RAM
Graphics Processor: AMD Radeon RX 6600 XT
Manufacturer: Dell Inc.
Product Name: XPS 8950
BIOS updated from 1.3.0 to 1.6.0 yesterday.
I've been working with shotcut (video editor); the flickering started affecting various regions of that window. After window shading and restoring the window, I noticed something new/strange: keystrokes were totally ignored (space for start/stop playing, etc.). Then the entire window started to window-shade on/off on its own. Each on/off appeared synchronized with clock updates -- each second (sometimes 2 seconds), the window would be window-shaded, then restored, etc. Very slow flickering.
Referring back to my "container" comments, it appears various containers (areas within a window, and the entire window) flicker independently. There is no consistency in the speed of flicker -- milliseconds for some flickering; now, on/off cycles in seconds for the entire window. Other windows/applications are not affected.
It's been suggested that this could be a screen issue -- I consider this premise impossible because physical monitors are only aware of video streams as pixels; they have no concept of display areas varying in size from a few pixels flickering to entire application windows flickering.
When flickering starts, an application can also appear visually frozen. Moving the window as little as one pixel refreshes the window into the view it should have; but it instantly appears frozen again.
Sigh... so many symptoms... though they appear "container" related. Like updating a window area, or an entire window, and anything in between... Is there some common code that handles painting all areas regardless of size?
This is beginning to feel like a memory leak or heap overflow. I'm mainly a user who writes Python code; not a system developer.
Created attachment 152675 [details]
annotated journal
Was waiting for a convenient time to reboot; but system had other plans...
Attached journal starts way before the failure in case there is something therein that provides a clue. I've added whitespace around some entries plus some comments; but nothing was removed.
After the screens went dark and started flashing narrow horizontal video strips, I tried restarting KDE with Ctrl+Alt+Bkspc a few times; while that did provide a login screen, login never completed -- it only displayed the Breeze splash screen, IIRC...
Created attachment 152676 [details]
Second occurrence of screens flashing within a few hours.
Just occurred again. This time the journal is from the earlier reboot. I was able to restore my KDE/Plasma session without rebooting: killing kded5 didn't appear to help, but killing startplasma-x11 stopped the flashing and gave me an sddm login. This journal looks just like the previous one:
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:1 pasid:32769, for process Xorg pid 22526 thread Xorg:cs0 pid 22807)
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800116201000 from client 0x1b (UTCL2)
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00141051
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x1
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:1 pasid:32769, for process Xorg pid 22526 thread Xorg:cs0 pid 22807)
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800116200000 from client 0x1b (UTCL2)
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00141051
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: RW: 0x1
In case it matters, the "page starting at address" alternates between 0x0000800116201000 and 0x0000800116200000
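For anyone who wants to check the address pattern across a whole journal rather than by eye, here is a minimal Python sketch. It tallies the "page starting at address" values and splits the GCVM_L2_PROTECTION_FAULT_STATUS word into the fields the kernel prints. The function names are hypothetical, and the bit positions in decode_status are an assumption: they were chosen because they reproduce the decoded values logged above for 0x00141051, not taken from driver documentation.

```python
import re
from collections import Counter

# Patterns for the two kinds of journal lines quoted above.
ADDR_RE = re.compile(r"amdgpu: in page starting at address (0x[0-9a-fA-F]+)")
STATUS_RE = re.compile(r"GCVM_L2_PROTECTION_FAULT_STATUS:(0x[0-9a-fA-F]+)")


def tally_fault_addresses(lines):
    """Count how often each faulting page address appears."""
    counts = Counter()
    for line in lines:
        match = ADDR_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


def decode_status(status):
    """Split a fault status word into named fields.

    ASSUMPTION: these bit positions are inferred from the log itself
    (they reproduce the values printed for 0x00141051); treat them as
    illustrative, not authoritative.
    """
    return {
        "MORE_FAULTS": status & 0x1,
        "WALKER_ERROR": (status >> 1) & 0x7,
        "PERMISSION_FAULTS": (status >> 4) & 0xF,
        "MAPPING_ERROR": (status >> 8) & 0x1,
        "CID": (status >> 9) & 0x1FF,
        "RW": (status >> 18) & 0x1,
    }


# Lines copied from the journal excerpt above.
SAMPLE = [
    "Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800116201000 from client 0x1b (UTCL2)",
    "Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00141051",
    "Oct 09 21:30:03 pf.pfortin.com kernel: amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800116200000 from client 0x1b (UTCL2)",
]

if __name__ == "__main__":
    for address, count in sorted(tally_fault_addresses(SAMPLE).items()):
        print(f"{address:#018x}: {count} fault(s)")
    for line in SAMPLE:
        match = STATUS_RE.search(line)
        if match:
            print(decode_status(int(match.group(1), 16)))
```

To run it against a real journal, replace SAMPLE with the lines of a saved journal file. If only two addresses ever show up, that would at least confirm the alternation is not an artifact of the short excerpt.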
I captured a video of the problem with my phone a few weeks ago; this is what I saw twice so far today: https://drive.google.com/file/d/1xgzy1zE-TlHeC5pAouvkfKkUIzDakPVf/view?usp=sharing
The screens are secondary (left) and primary (right).
After killing startplasma-x11 (comment 6), and __avoiding streaming video__, the system was quite stable until the evening of 10/14/22 (about 4 days); then I decided to install a 4-port USB PCIe card which has issues. While examining dmesg, I noticed these amdgpu messages; posting here in case there's a clue within:
[ 2.176993] AMD-Vi: AMD IOMMUv2 functionality not available on this system - This is not a bug.
[ 2.848136] [drm] amdgpu kernel modesetting enabled.
[ 2.848185] amdgpu: CRAT table not found
[ 2.848186] amdgpu: Virtual CRAT table created for CPU
[ 2.848190] amdgpu: Topology: Add CPU node
[ 2.848289] Console: switching to colour dummy device 80x25
[ 2.848311] amdgpu 0000:03:00.0: vgaarb: deactivate vga console
[ 2.848352] amdgpu 0000:03:00.0: enabling device (0006 -> 0007)
[ 2.850196] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from VFCT
[ 2.850196] amdgpu: ATOM BIOS: BR77997.001
[ 2.850203] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[ 2.850336] amdgpu 0000:03:00.0: BAR 2: releasing [mem 0x4010000000-0x40101fffff 64bit pref]
[ 2.850348] amdgpu 0000:03:00.0: BAR 0: releasing [mem 0x4000000000-0x400fffffff 64bit pref]
[ 2.850380] amdgpu 0000:03:00.0: BAR 0: assigned [mem 0x4200000000-0x43ffffffff 64bit pref]
[ 2.850386] amdgpu 0000:03:00.0: BAR 2: assigned [mem 0x4100000000-0x41001fffff 64bit pref]
[ 2.850425] amdgpu 0000:03:00.0: amdgpu: VRAM: 8176M 0x0000008000000000 - 0x00000081FEFFFFFF (8176M used)
[ 2.850427] amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[ 2.850428] amdgpu 0000:03:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
[ 2.850673] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[ 2.850677] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[ 4.064604] amdgpu 0000:03:00.0: amdgpu: STB initialized to 2048 entries
[ 4.065094] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
[ 4.259768] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 4.281896] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 4.281920] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2900 (59.41.0)
[ 4.281925] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[ 4.281962] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
[ 4.330961] amdgpu 0000:03:00.0: amdgpu: SMU is initialized successfully!
[ 4.463147] amdgpu: sdma_bitmap: ffff
[ 4.463185] amdgpu: SRAT table not found
[ 4.463186] amdgpu: Virtual CRAT table created for GPU
[ 4.463332] amdgpu: Topology: Add dGPU node [0x73ff:0x1002]
[ 4.463333] kfd kfd: amdgpu: added device 1002:73ff
[ 4.463349] amdgpu 0000:03:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 8, active_cu_number 32
[ 4.463380] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 4.463380] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 4.463381] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 4.463381] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 4.463382] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 4.463382] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 4.463382] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 4.463383] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 4.463383] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 4.463384] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 4.463384] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 4.463385] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[ 4.463385] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[ 4.463385] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[ 4.463386] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[ 4.463386] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
[ 4.464248] amdgpu 0000:03:00.0: amdgpu: Using BACO for runtime pm
[ 4.464401] [drm] Initialized amdgpu 3.48.0 20150101 for 0000:03:00.0 on minor 0
[ 4.468844] fbcon: amdgpudrmfb (fb0) is primary device
[ 4.468872] [drm] DSC precompute is not needed.
[ 4.647304] Console: switching to colour frame buffer device 240x67
[ 4.664260] amdgpu 0000:03:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 9.616281] snd_hda_intel 0000:03:00.1: bound 0000:03:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
Not sure what resolved this; but the system has been stable for nearly 3 weeks.
Should I reopen this bug or create a new one? There is still some flickering; but it's mild. It most often occurs if I move the mouse into the systray, then slide left towards the panel's desktop selector. As the mouse moves across the application icons, flickering happens briefly, and it clears up when the mouse is moved up and away from the panel. |