Bug 489952 - Inconsistent frame timing of cursor on desktop and some apps
Status: REPORTED
Alias: None
Product: kwin
Classification: Plasma
Component: performance
Version: 6.1.2
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-07-08 23:57 UTC by pallaswept
Modified: 2024-08-21 16:08 UTC
CC List: 6 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Mouse stutter on desktop (3.25 MB, image/png)
2024-07-08 23:57 UTC, pallaswept
kwin log 60Hz display (330.11 KB, text/csv)
2024-08-05 19:49 UTC, pallaswept
kwin log 120Hz display (368.47 KB, text/csv)
2024-08-05 19:49 UTC, pallaswept
desktop 60hz (40.01 KB, text/csv)
2024-08-05 23:17 UTC, pallaswept
desktop 120Hz (97 bytes, text/csv)
2024-08-05 23:19 UTC, pallaswept
firefox 60Hz (136.04 KB, text/csv)
2024-08-05 23:19 UTC, pallaswept
firefox 120Hz (304.12 KB, text/csv)
2024-08-05 23:20 UTC, pallaswept

Description pallaswept 2024-07-08 23:57:43 UTC
Created attachment 171487
Mouse stutter on desktop

SUMMARY
To see this, just move your mouse in a straight line across your display.

What you should see, courtesy of pixel response times leaving behind ghosts of the previous frames, is a line of cursors, evenly spaced, like so:
x    x    x    x    x    x    x    x    x    x    x    x    x    x    x    x    

What you will see instead is a broken line of cursors, like so:
x    x    x    x          x    x    x    x         x    x    x    x          x 

If this isn't hanging around on screen long enough, just move the mouse in circles. Instead of seeing a nice evenly spaced 'polygon' of pointers, you'll see them clumped together, or you'll notice the 'breaks' in the circle of cursors. For me, the pattern I can see on screen is 3 pointers, then a gap, like `x x x  x x x  x x x  `. If the cursor movement is drawn consistently, we should only ever be able to see one of these 'breaks': after the oldest ghost (in other words, at the end of the trail).

This does not occur when the cursor is above an application, or when the cursor has been magnified by the 'shake cursor' effect. In those cases it is drawn with perfectly consistent timing. It does occur over the desktop, even if the same app is visible on that display (so it seems the entire display is not affected, just the mouse cursor). It also occurs over panels on the desktop. It occurs on both VRR and fixed-refresh monitors, with or without VRR enabled.
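
As an illustration of the two patterns above, here is a minimal Python sketch. It is a toy model, not anything taken from KWin: it assumes the cursor moves at constant speed and leaves one ghost per presented frame, and the names `ghost_trail`, `step` and `drop_every` are made up for this example. Dropping every fourth frame gives the "3 pointers, then a gap" pattern.

```python
def ghost_trail(frames: int, step: int = 2, drop_every: int = 0) -> str:
    """Render cursor ghosts as 'x' characters on one text line.

    frames     -- number of refresh cycles to simulate
    step       -- columns the cursor moves per refresh (constant speed)
    drop_every -- if > 0, every Nth frame is not presented (hypothetical)
    """
    line = [" "] * (frames * step)
    for i in range(frames):
        dropped = drop_every > 0 and (i + 1) % drop_every == 0
        if not dropped:
            line[i * step] = "x"
    return "".join(line).rstrip()


print(ghost_trail(12))                # evenly spaced ghosts
print(ghost_trail(12, drop_every=4))  # three ghosts, then a gap
```

With consistent timing the only gap is the one after the oldest ghost; any gap in the middle of the trail means a frame where the cursor was not redrawn.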


STEPS TO REPRODUCE
1. Move mouse

OBSERVED RESULT
Stutter

EXPECTED RESULT
Smooth

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Tumbleweed
KDE Plasma Version: 6.1.2
KDE Frameworks Version: 6.3.0
Qt Version: 6.7.2

ADDITIONAL INFORMATION
The attached image shows a 120Hz monitor with the mouse being moved quickly at a consistent speed across a black desktop.
Comment 1 Akseli Lahtinen 2024-07-12 12:40:30 UTC
Can you share more information about your system? System settings -> About this system -> click on "copy details" and paste it here. Thanks!

Unfortunately I am unable to reproduce this, or I just don't notice it. 

Operating System: Fedora Linux 40
KDE Plasma Version: 6.1.80
KDE Frameworks Version: 6.5.0
Qt Version: 6.7.2
Kernel Version: 6.9.7-200.fc40.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 12 × AMD Ryzen 5 3600 6-Core Processor
Memory: 15.5 GiB of RAM
Graphics Processor: AMD Radeon RX 6600
Comment 2 pallaswept 2024-07-13 20:34:03 UTC
(In reply to Akseli Lahtinen from comment #1)
> Can you share more information of your system? System settings -> About this
> system -> click on "copy details" and paste it here. Thanks!
> 
> Unfortunately I am unable to reproduce this, or I just don't notice it. 

Hi Akseli, thanks for looking at this! 

It's not the sort of thing everyone would notice. Very subjective, and it's quite hard to see, too; I mostly felt it. A good monitor with a fast response time, strobing, or the like might hide it visually (but not tangibly, if one is sensitive to latency variation). It was only when moving across my black desktop at 120Hz that I realised what I was feeling. Over an app's UI or content, it's hard to see.

Perhaps this is why I hadn't noticed, but I am seeing this in other apps. As mentioned, I didn't see it over a firefox window, but when I tested other apps just now, I could see it there, too. I guess that means this bug may need to be reassigned? My apologies. 

For example, if the "About this System" app is open above an otherwise empty desktop, the stutter occurs over it. But if I maximize Firefox behind it, and then maximize "About this System" above Firefox, the stutter does not occur. So it doesn't even seem to be app-specific, but something more than that. Seems like maybe KWin?

Here's that info:

Operating System: openSUSE Tumbleweed 20240711
KDE Plasma Version: 6.1.2
KDE Frameworks Version: 6.3.0
Qt Version: 6.7.2
Kernel Version: 6.9.7-1-default (64-bit)
Graphics Platform: Wayland
Processors: 24 × AMD Ryzen 9 5900X 12-Core Processor
Memory: 31.3 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 3090/PCIe/SSE2
Manufacturer: ASUS
Comment 3 pallaswept 2024-07-27 07:57:29 UTC
(In reply to pallaswept from comment #2)
> It's not the sort of thing everyone would notice. Very subjective

I just noticed this very poorly worded description. My apologies for any confusion I may have caused. To clarify what I meant: the ability to notice it is subjective. Some people can, some people can't, some monitors may show it more or less, and so on. But the behaviour itself is objectively, measurably broken. In layman's terms, it's 'dropping frames'.

Thanks for correctly re-assigning this for me Nate!
Comment 4 Zamundaaa 2024-08-02 17:18:07 UTC
What NVidia driver version are you on? Do you still see this if you set the kernel boot argument
> nvidia.NVreg_EnableGpuFirmware=0
?
Comment 5 pallaswept 2024-08-02 17:23:29 UTC
(In reply to Zamundaaa from comment #4)
> What NVidia driver version are you on? 

550.100

> Do you still see this if you set the kernel boot argument
> nvidia.NVreg_EnableGpuFirmware=0

I do.

Thanks again for checking this one out.
Comment 6 pallaswept 2024-08-02 20:02:47 UTC
Hi Zamundaaa,

I hope you might be so kind as to lend me some advice. I hope this is not too far off-topic, although I would use this to troubleshoot this issue.

I haven't done anything like this on Linux yet. On Windows, I might look into this with a tool called GPUView. It is maintained by Microsoft now, but this page from the author explains it better than I can: https://graphics.stanford.edu/~mdfisher/GPUView.html

Is there a tool (or suite of tools) like that for KDE, or for Linux in general? What I really want to ask is: is there one that you personally recommend?
Comment 7 David Edmundson 2024-08-02 21:15:09 UTC
gpuvis: https://github.com/mikesart/gpuvis

We have some docs at: https://invent.kde.org/plasma/kwin/-/wikis/Using-FTrace-Markers
but it's still quite overwhelming.
Comment 8 pallaswept 2024-08-03 16:40:36 UTC
(In reply to David Edmundson from comment #7)
> gpuvis: https://github.com/mikesart/gpuvis

Cheers! I had seen this one, but wasn't sure it applied to either KDE or the nvidia card. 

Do you know any magic for tracing nvidia events? I gather it's not possible, but thought I should ask.

> but it's still quite overwhelming.

You're not wrong about that :) I'll try to get something useful soon. 

At a first glance - keeping in mind I'm a noob at this on linux - I think I may see a problem. 

From the plot I can clearly see which monitor my mouse was on: the drm_vblank_event_deliveredN events appear in the corresponding monitor's row.

Going by the name, I can only assume these events should be spaced according to the monitor's refresh interval: 16.67ms at 60Hz and 8.3ms at 120Hz for my two monitors. But what I'm seeing is that every one of these events is spaced more closely than the monitor's maximum refresh rate would allow: 14.5ms and 6.4ms. Is it just me or is that weird?
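
For reference, the arithmetic behind those numbers can be sketched in a few lines of Python. The function names are made up for illustration; the only inputs are the refresh rates and event spacings quoted above. Spacings of 14.5ms and 6.4ms would correspond to roughly 69Hz and 156Hz, which is faster than either monitor refreshes.

```python
def refresh_interval_ms(rate_hz: float) -> float:
    """Expected time between vblanks at a given refresh rate."""
    return 1000.0 / rate_hz


def implied_rate_hz(interval_ms: float) -> float:
    """Refresh rate that a given event spacing would correspond to."""
    return 1000.0 / interval_ms


# (nominal refresh rate in Hz, observed event spacing in ms) from the trace
observations = [(60, 14.5), (120, 6.4)]

for rate_hz, observed_ms in observations:
    expected_ms = refresh_interval_ms(rate_hz)
    implied_hz = implied_rate_hz(observed_ms)
    print(f"{rate_hz} Hz: expected {expected_ms:.2f} ms between events, "
          f"observed {observed_ms} ms (about {implied_hz:.0f} Hz)")
```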
Comment 9 pallaswept 2024-08-03 17:05:52 UTC
In addition to the observation above, I just captured a trace which had the mouse moving in circles over the desktop, and then over firefox (about:blank).

When the mouse was over firefox, I see the drm_vblank_event_delivered1 events in the row for nvidia-modeset. Those are all correctly timed, at 8.3ms apart.

Then, I minimise that app, and while the mouse is over the desktop, the vblank events appear in the DP-1 row. These are all incorrectly timed, at ~6.4ms.

Weird?
Comment 10 Zamundaaa 2024-08-05 19:15:37 UTC
Yeah, that's very weird.
We have an environment variable for debugging performance issues in general. I'm not sure if it'll help here, but it's worth a shot: https://invent.kde.org/plasma/kwin/-/wikis/Environment-Variables#kwin_log_performance_data

Just set it for KWin, and it should put some csv files with performance logging in your home directory, which we can analyze
Comment 11 pallaswept 2024-08-05 19:49:00 UTC
Created attachment 172318
kwin log 60Hz display

(In reply to Zamundaaa from comment #10)
> Just set it for KWin, and it should put some csv files with performance
> logging in your home directory, which we can analyze

Attached. I collected just a few minutes of usage, first with the mouse over the desktop or over apps which don't 'fix' this, and at the end, I spent a good chunk of time on the 120Hz monitor with the mouse over Firefox as a comparison. Let me know if I could run more specific tests. I thought maybe I could collect traces simultaneously to illustrate the difference.

I honestly can't tell one from the other in these logs. KWin runs like a clock (literally never drops so much as a nanosecond, wow, nice work).
Comment 12 pallaswept 2024-08-05 19:49:30 UTC
Created attachment 172319
kwin log 120Hz display
Comment 13 Zamundaaa 2024-08-05 21:12:23 UTC
Okay, there's 58 frames dropped because of late atomic commits in the log of the 60Hz monitor (vs. only 10 on the 120Hz one), which would also affect the cursor. I think the fix for bug 490358 should help here too.
Comment 14 pallaswept 2024-08-05 23:16:34 UTC
(In reply to Zamundaaa from comment #13)
> Okay, there's 58 frames dropped because of late atomic commits in the log of
> the 60Hz monitor (vs. only 10 on the 120Hz one), which would also affect the
> cursor. I think the fix for bug 490358 should help here too.

Hmm, I'm not so sure that's the same issue I'm seeing - In this timeframe on the 120Hz monitor I would have had in the region of hundreds of frames where it should have drawn the cursor but didn't. If I draw a circle on my desktop I see several, maybe 10 breaks in that single circle of cursors, and in these logs I draw many dozens of circles. 

That bug is also reported as being worse under CPU load, whereas mine is not. There's no effect by any CPU load I've been able to generate with stress-ng or everyday apps, and I have no gradient of good-to-bad effect with mine, there's no better/worse, it's either doing it, or it isn't. The only thing that seems to have any effect is what the cursor is being drawn above.
I also wonder about this on my 5900X+3090 system: the rig is getting old, but it's still pretty fast. I haven't read into your patch, but if you had to slow down KWin for this system, I'd say there might be something else wrong.

I attempted to better capture the two behaviours in my logs. Attached are four files - one for each monitor, and one for each of two scenarios. The first scenario, I reboot, open the launcher, draw 20 circles on the desktop on the 60Hz hdmi monitor, then 20 circles on the 120Hz DP monitor. The second scenario is the same, but I open firefox before drawing the circles, and maximise it to the display I'm drawing on, so I'm drawing circles over firefox rather than over the desktop.

The immediately most interesting thing about this is that yes, that's a 0 byte file, for the desktop (no firefox) scenario, on the 120Hz monitor (my secondary). I'm not sure if that's a sign that this logging can't capture this fault, or if it's a hint as to why this fault is seen?

Hope this is helpful.
Comment 15 pallaswept 2024-08-05 23:17:21 UTC
Created attachment 172329
desktop 60hz
Comment 16 pallaswept 2024-08-05 23:19:12 UTC
Created attachment 172330
desktop 120Hz
Comment 17 pallaswept 2024-08-05 23:19:49 UTC
Created attachment 172331
firefox 60Hz
Comment 18 pallaswept 2024-08-05 23:20:10 UTC
Created attachment 172332
firefox 120Hz
Comment 19 pallaswept 2024-08-05 23:34:00 UTC
(In reply to pallaswept from comment #14)
> only thing that seems to have any effect is what the cursor is being drawn above.

Sorry, I forgot about the 'shake cursor' effect. If I draw these test circles too fast, that kicks in, and that also makes the frame timing smooth.

I did do that briefly by accident in the first test (desktop 60Hz) and thought that I should mention it.
Comment 20 pallaswept 2024-08-07 01:35:21 UTC
(In reply to pallaswept from comment #14)
> The immediately most interesting thing about this is that yes, that's a 0
> byte file, for the desktop (no firefox) scenario, on the 120Hz monitor (my
> secondary). 

I tried to record this with spectacle today. It recorded no frames until I accidentally got too close to the screen edge, recorded a few frames where the edge highlight animated, and no others afterward. The resulting mp4 file is effectively broken as a result, you can't seek, etc. Let me know if you'd like me to attach it, perhaps a closer analysis might pull something useful from it, but visually, it's just a black rectangle with a mostly-stationary cursor on it.

These results seem related, so I thought I should mention it.
Comment 21 Zamundaaa 2024-08-07 02:45:35 UTC
(In reply to pallaswept from comment #14)
> Hmm, I'm not so sure that's the same issue I'm seeing - In this timeframe on
> the 120Hz monitor I would have had in the region of hundreds of frames where
> it should have drawn the cursor but didn't. If I draw a circle on my desktop
> I see several, maybe 10 breaks in that single circle of cursors, and in
> these logs I draw many dozens of circles. 
It's expected that the tool won't capture all dropped cursor updates: it only records data for frames where something other than the cursor has changed. So if you move the cursor in a circle and the window below it doesn't update, it won't log anything during that time.
That should get fixed with some near-ish future changes in KWin, but for now you could get a more accurate reading by leaving glxgears running on the screens while doing the testing.
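
To make the logging rule concrete, here is a toy model in Python. It is only a sketch of the rule as described in this comment, not KWin's actual logging code; `Frame` and `logged_frames` are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    cursor_moved: bool
    content_changed: bool  # something other than the cursor was repainted


def logged_frames(frames):
    """Keep only the frames that would be recorded under the stated rule."""
    return [f for f in frames if f.content_changed]


# 100 refresh cycles of cursor movement over a static desktop...
static_desktop = [Frame(cursor_moved=True, content_changed=False)] * 100
# ...versus the same movement while a window repaints on every cycle.
with_animation = [Frame(cursor_moved=True, content_changed=True)] * 100

print(len(logged_frames(static_desktop)))  # cursor-only frames are not logged
print(len(logged_frames(with_animation)))  # every frame is logged
```

Under this model a log taken over a static desktop can be empty even if many cursor updates were dropped, which is why a constantly repainting client such as glxgears gives a more complete reading.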

> Sorry, I forgot about the 'shake cursor' effect. If I draw these test circles too fast, that kicks in, and that also makes the frame timing smooth.
That's an important piece of information... the effect forces a software cursor. If you set the environment variable KWIN_FORCE_SW_CURSOR=1 for KWin, does the issue go away permanently?
Comment 22 pallaswept 2024-08-07 03:05:25 UTC
(In reply to Zamundaaa from comment #21)
> for now you could get a more accurate reading by leaving glxgears running on the screens while doing the testing.

That appears to stop this glitch, too. For testing consistency I also tried this with firefox, just resized it to smallest, and left it in the corner. Like this, the whole desktop around it, is fine.

> If you set the environment variable KWIN_FORCE_SW_CURSOR=1 for KWin, does the issue go away permanently?

Yes. 
Ooh, smoooooth :)
Comment 23 pallaswept 2024-08-07 03:26:10 UTC
(In reply to pallaswept from comment #22)
> That appears to stop this glitch, too. For testing consistency I also tried
> this with firefox, just resized it to smallest, and left it in the corner.
> Like this, the whole desktop around it, is fine.

This conflicted with my original report, and it bothered me, so I tested more.

Background: part of my Firefox theme is a CSS transition with a 12 second duration. It sits there doing nothing for 10s, then animates for 2s. This 'timer' begins when my mouse leaves the Firefox window.

When I position the Firefox window on the desktop, resize it to its smallest size, put it in the corner, and move my mouse back and forth over the desktop around it, the cursor movement is smooth, as per my reply from today.

But only for those 12 seconds. Once the CSS transition is complete (10s in which nothing visibly changes although it is "animating", then 2s of visible change), the mouse begins to stutter again, as per my original report.

So, that's the reason for the conflicting reports. It depends not only on the app being on the same display, but also it actually drawing stuff (even if that 'stuff' is nothing, apparently).

Hopefully this distinction is useful.