Bug 478250 - Sorting is very slow
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: scripting
Version: 23.04.3
Platform: Arch Linux Linux
Importance: NOR minor
Target Milestone: ---
Assignee: KWrite Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-08 09:20 UTC by Dan
Modified: 2024-08-24 17:14 UTC

See Also:
Latest Commit:
Version Fixed In: 24.02.0
Sentry Crash Report:


Attachments
Testfile (1.24 MB, text/plain)
2024-08-20 12:31 UTC, Waqar Ahmed
The sample file sorting very slow ... (466.53 KB, application/x-bzip)
2024-08-21 09:08 UTC, Dan

Description Dan 2023-12-08 09:20:48 UTC
SUMMARY
***
The sorting plugins:
  Tools->Scripts->Editing->Sort Selected Text ...
                                      ...->Remove Duplicates and Sort ...
are *very* slow.

What `sort` does in seconds, kate/kwrite needs hour(s) to achieve.
***


STEPS TO REPRODUCE
1. Create a very large text file (50 MB+)
2. Sort it in Kate (and remove duplicates)
3. Compare times with `sort -u` on the same file

OBSERVED RESULT

Kate does not respond for a very long time

EXPECTED RESULT

*Much* faster result

SOFTWARE/OS VERSIONS
KDE Plasma Version: 5.27.7
KDE Frameworks Version: 5.108.0
Qt Version: 5.15.10 (built against 5.15.10)
Comment 1 Waqar Ahmed 2023-12-14 08:18:56 UTC
Hi, would it be possible for you to try the latest Kate? You seem to be on Arch, so the latest version should be available
Comment 2 Dan 2023-12-15 08:39:21 UTC
Hi, thanks for the reaction. I will try in a few days (I need to update without interrupting my work ...)
Dan.
Comment 3 Bug Janitor Service 2023-12-20 07:55:37 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/ktexteditor/-/merge_requests/649
Comment 4 Christoph Cullmann 2023-12-20 17:19:37 UTC
Git commit e63590e005d8c7761e71f5b392dd7bbc121131ef by Christoph Cullmann, on behalf of Waqar Ahmed.
Committed on 20/12/2023 at 14:40.
Pushed by cullmann into branch 'master'.

Optimize textInsert/setText

TLDR; accessibility stuff is super slow, try to do it less.

When inserting text our document emits two signals, a public one and an
internal one. Accessibility listens to the internal signal i.e.,
textInsertedRange and then notifies any listening screen readers that
new text is available.

The way our code was working led to about 3-4 textInsertedRange per
line which slowed things down to a crawl. I imagine it would also be a
horrible experience for a screen reader which doesn't need to know about
how the text editor is behaving internally.

With this change, all these signal emissions are removed and we emit
one fat signal that reports what text was inserted.

For me this brings down timings from 10 seconds to about 1.5 seconds.

M  +31   -13   src/document/katedocument.cpp
M  +3    -3    src/document/katedocument.h

https://invent.kde.org/frameworks/ktexteditor/-/commit/e63590e005d8c7761e71f5b392dd7bbc121131ef
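
The idea described in this commit, replacing many per-line notifications with a single one covering the whole inserted range, can be sketched in plain C++. This is only an illustration under assumptions: `ToyDocument`, `insertLinesNotifyEach` and `insertLinesNotifyOnce` are hypothetical names, not KTextEditor API, and the listener stands in for whatever consumes the textInsertedRange signal.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy document; none of these names come from KTextEditor.
class ToyDocument {
public:
    // A listener receives the first inserted line index and the number of lines.
    using Listener = std::function<void(std::size_t firstLine, std::size_t lineCount)>;

    void addListener(Listener listener) { listeners_.push_back(std::move(listener)); }

    // Naive variant: every inserted line triggers its own notification.
    void insertLinesNotifyEach(const std::vector<std::string>& lines) {
        for (const std::string& line : lines) {
            const std::size_t at = lines_.size();
            lines_.push_back(line);
            notify(at, 1);
        }
    }

    // Batched variant: insert everything first, then notify once for the whole range.
    void insertLinesNotifyOnce(const std::vector<std::string>& lines) {
        if (lines.empty()) {
            return;
        }
        const std::size_t first = lines_.size();
        lines_.insert(lines_.end(), lines.begin(), lines.end());
        notify(first, lines.size());
    }

    std::size_t lineCount() const { return lines_.size(); }

private:
    void notify(std::size_t firstLine, std::size_t count) {
        assert(count > 0 && "a notification must cover at least one line");
        for (const Listener& listener : listeners_) {
            listener(firstLine, count);
        }
    }

    std::vector<std::string> lines_;
    std::vector<Listener> listeners_;
};
```

With a slow listener (for example one that forwards every call to a screen reader), the naive variant costs one listener call per line, while the batched variant costs one call per insert operation regardless of how many lines it contains.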
Comment 5 Dan 2024-01-02 09:00:42 UTC
(In reply to Waqar Ahmed from comment #1)
> Hi, would it be possible for you to try the latest Kate? You seem to be on
> Arch, so the latest version should be available

Hi,
I have tried it with the latest version available in Arch:
* kate 23.08.4
* KDE Frameworks 5.113.0
* Qt 5.15.11

*Still too slow*:
* sort ~ 2sec
* kate >> minute
Comment 6 Kåre Särs 2024-01-02 09:20:16 UTC
Hi, 

This has been improved in master and will be released with 24.02.0 in February.

If you need the improvement _now_, you have to compile it yourself or use some bleeding-edge development distribution that packages beta versions of KDE software.

Instructions for building it yourself:
https://kate-editor.org/build-it/

Br,
  Kåre
Comment 7 Dan 2024-01-02 09:29:16 UTC
(In reply to Kåre Särs from comment #6)
> Hi, 
> 
> This has been improved in master and will be released with 24.02.0 in
> February.
> 
> If you need the improvement _now_, you have to compile it yourself or use some
> bleeding-edge development distribution that packages beta versions of KDE
> software.
> 
> Instructions for building it yourself:
> https://kate-editor.org/build-it/
> 
> Br,
>   Kåre

Hi, no need to hurry. Thank you for the info.
Comment 8 Waqar Ahmed 2024-01-02 09:43:26 UTC
Most of the slowdown is because of accessibility. You can try to turn it off in your system. As per the docs https://doc.qt.io/qt-6/qaccessible.html:

> In the Unix/X11 AT-SPI implementation, applications become accessible when two conditions are met:
>    org.a11y.Status.IsEnabled DBus property is true
>    org.a11y.Status.ScreenReaderEnabled DBus property is true
> An alternative to setting the DBus AT-SPI properties is to set the QT_LINUX_ACCESSIBILITY_ALWAYS_ON environment variable.

In master, we allow disabling it in settings and have tried to make accessibility faster in general.
Comment 9 Dan 2024-08-19 08:59:32 UTC
Hi,
I have tried it in:

* kate 24.05.1
* KDE Frameworks 6.3.0
* Qt 6.7.2

Still too slow (takes >> a minute).

Could you please elaborate on the "accessibility" a bit more? How do I disable it for Kate?
Thank you very much,
D.
Comment 10 Waqar Ahmed 2024-08-19 14:32:25 UTC
It's the last option in "Editing -> General -> Enable accessibility notifications". 

Also can you please share:
- number of lines in the target file
- the exact action/command that you execute for sorting. And are you using the command line (F7)?
Comment 11 Dan 2024-08-20 10:19:27 UTC
Hi,
I have disabled the "accessibility notifications" (thanks for pointing it out), but it is still too slow (I did not measure it, but it is about the same as before ...).

The file has 201350 lines, none of them longer than 100 characters.
I use the menu: Tools->Scripts->Edit->Sort selected text alphabetically (I hope the translation is correct; I have it localized). The whole file is selected.
I am not using the command line.

Thank you.
D.
Comment 12 Waqar Ahmed 2024-08-20 12:31:37 UTC
Created attachment 172773
Testfile

I tried "Sort Selected Text Alphabetically" with the attached file. It finishes in a couple of seconds.

But if I try "Remove Duplicates and Sort Selected Text Alphabetically" it takes too long. Perhaps you are using this action?
Comment 13 Dan 2024-08-21 09:08:54 UTC
Created attachment 172802
The sample file sorting very slow ...
Comment 14 Dan 2024-08-21 09:10:03 UTC
OK, the file you sent is fast. Other files are fast, too. But I have created a file that sorts slowly on my system. See the attached file (sample.bzip).
D.

P.S. I really am using "Sort Selected Text".
Comment 15 Christoph Cullmann 2024-08-24 17:14:11 UTC
Git commit 6f539d95acc3849b5f72683cb79bbc4e5f27cdb1 by Christoph Cullmann, on behalf of Waqar Ahmed.
Committed on 24/08/2024 at 17:08.
Pushed by cullmann into branch 'master'.

Move the sortuniq, uniq implementation to C++

It is not possible to implement this efficiently in Javascript atm. The
Set (and Map) classes provided by Qt are super slow, they are not really
even hashmaps to begin with and thus they slow things down too much when
N is slightly big.

M  +1    -39   src/script/data/commands/utils.js
M  +17   -0    src/script/katescriptaction.cpp
M  +63   -0    src/utils/katecmds.cpp
M  +30   -0    src/utils/katecmds.h
M  +1    -0    src/utils/kateglobal.cpp
M  +1    -1    src/utils/kateglobal.h

https://invent.kde.org/frameworks/ktexteditor/-/commit/6f539d95acc3849b5f72683cb79bbc4e5f27cdb1
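
For illustration, here is a rough sketch of how the two actions can be expressed natively with a real hash set, using only the C++ standard library. The names `uniqLines` and `sortUniqLines` are hypothetical, and this is not the code added to katecmds.cpp. Comparison here is byte-wise `std::string` ordering, which is an assumption and may differ from what the editor does.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: removes duplicate lines, keeping the first occurrence
// of each line in its original position. Expected O(n) hash-set operations.
std::vector<std::string> uniqLines(const std::vector<std::string>& lines) {
    std::unordered_set<std::string> seen;
    seen.reserve(lines.size());
    std::vector<std::string> result;
    result.reserve(lines.size());
    for (const std::string& line : lines) {
        // insert() reports through .second whether the line was new.
        if (seen.insert(line).second) {
            result.push_back(line);
        }
    }
    return result;
}

// Hypothetical helper: sorts the lines, then drops adjacent duplicates.
// O(n log n) comparisons, no hashing needed.
std::vector<std::string> sortUniqLines(std::vector<std::string> lines) {
    std::sort(lines.begin(), lines.end());
    lines.erase(std::unique(lines.begin(), lines.end()), lines.end());
    // Sanity check: no two neighbouring lines are equal any more.
    assert(std::adjacent_find(lines.begin(), lines.end()) == lines.end());
    return lines;
}
```

Sorting first makes equal lines adjacent, so the combined action needs no set at all. The order-preserving variant is the one that depends on fast hash lookups.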
Comment 16 Christoph Cullmann 2024-08-24 17:14:19 UTC
Git commit af729fad54f15d61ee281fea4c74575a40cebd29 by Christoph Cullmann, on behalf of Waqar Ahmed.
Committed on 24/08/2024 at 17:08.
Pushed by cullmann into branch 'master'.

Move sort implementation to C++

If the strings are a bit long the JS implementation has garbage performance.

M  +1    -10   src/script/data/commands/utils.js
M  +13   -3    src/utils/katecmds.cpp
M  +5    -0    src/utils/katecmds.h
M  +1    -0    src/vimode/emulatedcommandbar/commandmode.cpp

https://invent.kde.org/frameworks/ktexteditor/-/commit/af729fad54f15d61ee281fea4c74575a40cebd29
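
As a minimal sketch of what a native sort of the selected text involves (split into lines, sort, join again), assuming plain standard C++ rather than the Qt types the real code presumably uses. `sortText` is a hypothetical name and the ordering is byte-wise, not locale-aware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: sorts the lines of `text` and joins them with '\n'.
// A trailing newline in the input is not preserved.
std::string sortText(const std::string& text) {
    // Split the text into lines.
    std::vector<std::string> lines;
    std::istringstream in(text);
    for (std::string line; std::getline(in, line);) {
        lines.push_back(line);
    }

    // Byte-wise comparison of std::string values.
    std::sort(lines.begin(), lines.end());

    // Join the sorted lines back into one string.
    std::string result;
    result.reserve(text.size());
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (i != 0) {
            result += '\n';
        }
        result += lines[i];
    }
    // Sanity check: joining never produces more characters than the input had.
    assert(result.size() <= text.size());
    return result;
}
```

Building the result once and handing it back as a single string also fits the earlier change in this report, where one large insert replaced many small ones.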