Bug 396171 - Brush slowdowns on a wide range of mainstream to lowend laptop CPUs, comprehensive test report attached.
Status: CONFIRMED
Alias: None
Product: krita
Classification: Applications
Component: Brush engines
Version: 4.1.0
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Krita Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-07-04 17:38 UTC by Tyson Tan
Modified: 2018-07-10 16:50 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Krita CPU Tests Spreadsheet (20.09 KB, application/vnd.oasis.opendocument.spreadsheet)
2018-07-10 04:17 UTC, Tyson Tan

Description Tyson Tan 2018-07-04 17:38:16 UTC
Recently I have been testing mid-range to low-end laptops and all-in-ones for use with Krita in budget-sensitive scenarios. These computers range roughly from $229 to $529. From these tests, I discovered that Krita’s brush system is very selective when it comes to mobile and SoC CPUs. Some particular hardware/software combinations showed huge CPU-related slowdowns, and in some cases a CPU that is rather strong on paper still performed poorly.



### TEST ################

Fresh Windows 10 with latest drivers and patches installed (well, mostly).
Manjaro KDE 17.1.11 Live USB (Linux 4.14)
Krita 4.1.0 / A4 300 dpi 8 bit / Basic-5_size, Basic-6_detail, Air_brush_soft, Distort_move
Stabilizer OFF, Brush Smoothing: None / Basic
When slowdown happened, test again with GPU acceleration OFF.
Wacom Intuos S (Generation 3, CTL-4100)




### RESULT ################

### Key factors that triggered slowdowns
1) Basic-6_detail brush preset (slow drawing, or fast drawing with dropped tablet signals)
2) Windows 10 + CPUs before Skylake (6th gen)
3) CPUs with poor single-thread performance

### Good CPUs (Windows and Linux): 
Celeron 3865U (4GB) / N4000 (4GB) / Core m3-6Y30 (4GB) / Core i3-3240 (4GB) / Core i5-7200U (8GB) / Core i5-8250U (8GB)

### Good CPUs (Linux only): 
Core i3-4010U (4GB) / Core i7-3520M (16GB) 

### BAD:
Intel Celeron N3150M (16GB) / N3450 (4GB)
AMD A4-7220 (4GB) / A6-9210 (4GB) / A9-9420 (4GB) / A10-9600P (4GB) / A12-9700P (8GB)



### ANALYSIS ################

### The Role of GPU
Turning GPU acceleration ON or OFF had no effect on any of the slowdowns that happened in this test.

###  Basic-6_detail
This particular preset can slow down low-end CPUs badly. I never changed its size during the tests. David Revoy suspects the cause is its 0.02 spacing and precision of 5, plus the Fan corner parameter being resource-heavy. It needs at least a Core m3-6Y30 to perform nicely.
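
As a rough illustration of why a 0.02 spacing could be expensive, the sketch below counts the dabs stamped along one stroke. It assumes the distance between dabs equals spacing times brush diameter; the stroke length, the brush diameter and the 0.10 comparison spacing are made-up values, not the preset's real settings.

```python
import math


def dabs_per_stroke(stroke_length_px, brush_diameter_px, spacing):
    """Estimate how many dabs a straight stroke stamps.

    Assumption (not taken from Krita's source): the distance between
    consecutive dabs is spacing * brush diameter.
    """
    step_px = spacing * brush_diameter_px
    return math.floor(stroke_length_px / step_px) + 1


# Hypothetical values: a 2000 px stroke drawn with a 10 px brush.
dense = dabs_per_stroke(2000, 10, 0.02)   # spacing suspected for Basic-6_detail
sparse = dabs_per_stroke(2000, 10, 0.10)  # arbitrary wider spacing for comparison

print(f"dense: {dense} dabs, sparse: {sparse} dabs, ratio: {dense / sparse:.1f}")
```

Under this assumption the dab count scales inversely with spacing, so the per-dab cost is multiplied many times over on a single long stroke.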

### Core i7-3520M (Mobile) and  Core i3-3240 (Desktop)
They both belong to the same 3rd generation of Intel’s Core i family. Both have 2 cores and 4 threads, and they support the same instructions. On spec, the i7-3520M is better in almost every way: it has a 1.3 times larger L3 cache and a GPU 4 times the scale. There is only one exception: the i3-3240 has a base clock of 3.40 GHz, while the i7-3520M has 2.90 GHz. However, the latter can burst to 3.60 GHz. On benchmark, the Core i3-3240 has marginally better single-thread performance (1810 vs 1782). Under Windows 10, the weaker Core i3-3240 handled Krita very well, while the supposedly stronger Core i7-3520M with 4 times as much RAM lagged badly. Under Linux, they both performed very well.

In this case, I suspect the way Windows 10 handles multi-threaded CPU workloads. I also suspect effects from the patches for Intel’s Hyper-threading bug, Meltdown and Spectre. The Core i7-3520M resides in a ThinkPad X230T, which has the Hyper-threading / Meltdown / Spectre patches in its BIOS. The Core i3-3240 resides in a budget all-in-one PC that never received any related BIOS update.

### Core i3-4010U
This CPU performed poorly under Windows, but rather well under Linux. Its Windows performance was slightly better than the i7-3520M’s. It has no right to be, though, since it has a much lower clock speed (1.70 GHz) and no burst capability. On benchmark, these CPUs’ single-thread ratings are: (i3-4010U:932 / i7-3520M:1782)

In this case, I suspect that Windows 10’s CPU driver is optimized differently for 3rd gen and 4th gen Intel Core CPUs.

### Celeron 3865U and N4000
These two CPUs both have 2 cores and 2 threads, with a base clock of 1.10 GHz. They have no hyper-threading or burst capabilities. They performed well under Windows in general. I didn’t test them under Linux but I expect them to do well there too.

When handling the brush preset “Basic-6_detail”, they could draw the line in a timely fashion. However, if the strokes were long and quick, big chunks of tablet signals were dropped, resulting in lines similar to what used to happen on Mac OS X in the past.

### Intel Celeron N3150M and N3450
These two CPUs both have 4 cores and 4 threads. The N3150M has a base clock of 1.60 GHz and can burst to 2.08 GHz. The N3450 has a base clock of 1.10 GHz and can burst to 2.20 GHz. They both suffered huge slowdowns under Windows and Linux. They have no right to, though, because compared to the aforementioned Celeron 3865U and N4000, they have twice as many cores and roughly twice the peak clock speed.

In this case, I suspect Krita is sensitive to single-thread performance. Although the N3150M and N3450 have twice as many cores and roughly twice the peak clock speed of the 3865U and N4000, the 3865U is a tick-tock newer, while the N4000 is a tick-tock-tick newer. On benchmark, these CPUs’ single-thread ratings were: (N3150M:471 / N3450:727 / 3865U:1030 / N4000:1123). Maybe the newer 3865U and N4000 also match Krita’s optimization better than the older CPUs.



### AMD A4-7220 / A6-9210 / A9-9420 / A10-9600P / A12-9700P
The whole family of AMD’s APUs performed rather poorly with brush preset “Basic-6_detail”, regardless of Windows or Linux. Otherwise they were still usable, performing slightly slower than the 3865U. The frequent slowdowns made me hate drawing with them though.

In this case, I suspect Krita is sensitive to single-thread performance. On benchmark, these CPUs’ single-thread ratings were (A4:832 / A6:1193 / A9:1403 / A10:1300 / A12:1315). On paper, they had no right to perform this badly. But when you take Krita’s optimization into consideration, plus David Revoy’s suspicion about the Vc library on this matter, it’s possible that it has a 25% negative impact on AMD CPUs’ performance in general.
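
To make the mismatch between ratings and results easier to see, here is a minimal sketch that lists the CPUs that were slow under Windows 10 even though their single-thread rating is at least as high as the weakest CPU that ran well. The ratings and outcomes are the ones quoted in this report; the "ok"/"slow" labels and the function name are only illustrative.

```python
# Single-thread ratings and Windows 10 outcomes as quoted in this report.
# "ok"   = listed under "Good CPUs (Windows and Linux)"
# "slow" = listed under "Good CPUs (Linux only)" or "BAD"
RESULTS = [
    ("Celeron 3865U", 1030, "ok"),
    ("Celeron N4000", 1123, "ok"),
    ("Core i3-3240", 1810, "ok"),
    ("Core i7-3520M", 1782, "slow"),
    ("Core i3-4010U", 932, "slow"),
    ("Celeron N3150M", 471, "slow"),
    ("Celeron N3450", 727, "slow"),
    ("AMD A4-7220", 832, "slow"),
    ("AMD A6-9210", 1193, "slow"),
    ("AMD A9-9420", 1403, "slow"),
    ("AMD A10-9600P", 1300, "slow"),
    ("AMD A12-9700P", 1315, "slow"),
]


def rating_outliers(results):
    """Return slow CPUs rated at least as high as the weakest 'ok' CPU."""
    floor = min(rating for _, rating, outcome in results if outcome == "ok")
    return sorted(
        (name, rating)
        for name, rating, outcome in results
        if outcome == "slow" and rating >= floor
    )


if __name__ == "__main__":
    for name, rating in rating_outliers(RESULTS):
        print(f"{name}: rating {rating}, still slow under Windows 10")
```

If single-thread rating alone explained the results, this list would be empty. CPUs without a quoted rating (Core m3-6Y30, i5-7200U, i5-8250U) are left out.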



### CONCLUSION ################
It should be safe to suggest a few bottom lines for CPUs for Krita:
SoC: Celeron N4000 or better.
Normal: Core m3-6Y30 or better.
Desktop: Celeron G1620 or better.

Ideally: Intel Core i5-8250U / AMD Ryzen 5 2500U. 
You can find laptops with one of these two CPUs installed in the price range of $399 to $529. They are 2 times faster than the ones before them, and are usually paired with 8GB of RAM. They are much better choices for Krita’s tasks.

I hope we can also improve Krita's performance against the slowdowns I reported here. I have encountered many artists who must use Krita on older, weaker, less-than-ideal laptops.
Comment 1 Tyson Tan 2018-07-07 09:54:42 UTC
I tested Krita 4.1.0 on two of the latest MacBooks today, but neither worked well despite their relatively high hardware specs. I don't know whether we plan to optimize Krita on macOS in the future, but since the performance was alarmingly bad, I felt the need to report the details as well.

### TEST ################
MacBook (Retina, 12-inch, 2017), Intel Core m3-7Y32 (1.2 GHz), 8GB 1866 MHz RAM.
MacBook Pro (13-inch, 2017, 2 Thunderbolt 3 ports), Intel Core i5-7360U (2.3 GHz), 8GB 2133 MHz RAM.

Both MacBooks were running macOS High Sierra 10.13.2.
Krita 4.1.0.dmg
Wacom Intuos S Gen 3 CTL-4100 with driver 6.3.30-2 installed.

### SYMPTOMS ################
Brush preset "Basic-6_detail" lags behind; the drawing speed is even worse than on the Celeron N3450 and the APUs under Windows 10. Long and quick strokes crawl far behind for seconds before they finish. It's simply unusable.

Brush preset "Basic-5_size" and other less resource-intensive presets are more responsive, but tablet signals are clearly being dropped heavily, resulting in undesired straight lines when drawing curves. Krita 4.1.0's release notes said this problem had been fixed, but my tests suggest otherwise.

Reminder: Core m3-6Y30 under Windows 10 can handle these without breaking a sweat.

### MORE  ################
I've also tested a Pentium N4200 (8GB) under Windows 10. It was as slow as its sibling, the Celeron N3450. The only difference is its burst frequency of 2.5 GHz, which appeared to be irrelevant during the test. The result made me suspect optimization issues even more.
Comment 2 Halla Rempt 2018-07-09 11:50:36 UTC
Hm... I'm not sure what to do. This information probably should be in some spreadsheet kind of format, to make it easier to understand correlations. And there's another thing that might make a difference: memory speed. 

It might even be a good idea to make a standalone version of the brush benchmark and use that so we can eliminate the canvas factor altogether.
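
A minimal sketch of what such a standalone, canvas-free measurement could look like is shown below. It is not Krita's brush benchmark: the square dab, the flat 8-bit buffer and all function names are assumptions made only to show the timing approach.

```python
import time


def stamp_dab(canvas, width, cx, cy, radius):
    """Fill a square dab into a flat 8-bit buffer (stand-in for real dab rendering)."""
    height = len(canvas) // width
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        row = y * width
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            canvas[row + x] = 255


def benchmark_stroke(width=512, height=512, radius=4, dab_count=2000):
    """Time one horizontal stroke of dab_count dabs; returns (seconds, canvas)."""
    canvas = bytearray(width * height)
    start = time.perf_counter()
    for i in range(dab_count):
        # Spread the dabs evenly from the left edge to the right edge.
        cx = (i * (width - 1)) // max(1, dab_count - 1)
        stamp_dab(canvas, width, cx, height // 2, radius)
    return time.perf_counter() - start, canvas


if __name__ == "__main__":
    dabs = 2000
    seconds, _ = benchmark_stroke(dab_count=dabs)
    print(f"{dabs} dabs in {seconds:.3f} s")
```

Because nothing is displayed, a number from a loop like this would reflect only CPU and memory behavior, which is the point of taking the canvas out of the picture.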
Comment 3 Tyson Tan 2018-07-09 17:01:08 UTC
The reason I didn't use a spreadsheet was that it would have revealed nothing useful. I can provide you with one tomorrow, but here is an abridged version of the bizarre results I have found:

1A) Hardware specs and benchmark results do not match Krita's actual performance.

1B) You'd be surprised how many machines that could not handle Krita actually had 2016/2017 mainstream grade CPUs with DDR4-2133 8GB dual channel RAM installed. 

1C) But there was a Core i3-3240 from 2012 with a single DDR3-1600 4GB RAM that handled Krita super fast. And let's not forget the super weak 2 core 2 thread Celeron 3865U with a single DDR3-1866 4GB RAM. Krita liked it too!

2A) Same generation, better CPU, faster single-thread and multi-thread benchmark, faster and larger RAM, could not handle Krita. (Core i7-3520M DDR3-1866 8GBx2)

2B) Same generation, worse CPU, slower single-thread and multi-thread benchmark, slower and smaller RAM, handled Krita very well. (Core i3-3240 DDR3-1600 4GBx1)

3) Same CPU (Core i7-3520M). Could not handle Krita under Windows. But it handled Krita very well under Linux.

4) Desktop CPUs tend to work better than laptop CPUs, even when the laptop has a better CPU, RAM and actual benchmarks.

5) None of AMD's laptop APUs could handle Krita, not even a performance laptop featuring an A12.

6) None of the latest MacBook/Pros could handle Krita under macOS. But under Windows 10, a lower CPU from the same CPU family handled Krita very well.

7) Most machines were plugged in when being tested.

8) When slowdowns happened, tablet signal drops were clearly observed under every OS.
Comment 4 Tyson Tan 2018-07-10 04:17:26 UTC
Created attachment 113860 [details]
Krita CPU Tests Spreadsheet
Comment 5 Halla Rempt 2018-07-10 13:14:25 UTC
I'll set the status to confirmed, though I guess we should make a phabricator task with this data and then somehow figure out what is up...
Comment 6 Tyson Tan 2018-07-10 16:50:18 UTC
(In reply to Boudewijn Rempt from comment #5)
> I'll set the status to confirmed, though I guess we should make a
> phabricator task with this data and then somehow figure out what is up...

OK! If you need other information, I will try to provide it as well. Although I don't own all the machines I've tested, I can always get my hands on the following "bad" ones: Core i7-3520M, i3-4010U, Celeron N3150.