Bug 448353 - Version 1.9.4 is X times slower than version 1.9.2, especially with larger files
Summary: the Version 1.9.4 is X-times slower than the Version 1.9.2. especially with l...
Status: RESOLVED FIXED
Alias: None
Product: kdiff3
Classification: Applications
Component: application (other bugs)
Version First Reported In: 1.9.4
Platform: openSUSE Linux
Importance: NOR major
Target Milestone: ---
Assignee: michael
URL:
Keywords:
Duplicates: 450411
Depends on:
Blocks:
 
Reported: 2022-01-13 10:06 UTC by GagoSoft
Modified: 2022-02-23 22:37 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed/Implemented In: 1.9.5
Sentry Crash Report:


Attachments
2000 line files with LF and CRLF endings. kdiff3 is fast on LF and slow on CRLF. (21.45 KB, application/x-7z-compressed)
2022-02-07 06:17 UTC, nyanpasu64
Perf trace of kdiff3 slowly loading the CRLF files (593 MB decompressed) (3.16 MB, application/x-7z-compressed)
2022-02-07 06:24 UTC, nyanpasu64

Description GagoSoft 2022-01-13 10:06:47 UTC
SUMMARY
Version 1.9.4 is X times slower than version 1.9.2, especially with larger files.
It's impossible to use.

STEPS TO REPRODUCE
1. Open two local files of about 2k LOC (120 kB) with only a few changes (19 in my example).
2. Version 1.9.2 takes 0.88 seconds to open both, while version 1.9.4 takes 50 seconds.

OBSERVED RESULT
"top" reports about 50 sec of 100% CPU-load
 PID   USER      PR  NI     VIRT     RES          SHR S  %CPU  %MEM   TIME+ COMMAND
...
 7237 gago      20   0 1233288 127272 104468 R 100,3 0,784   0:53.98 kdiff3_1.9.4                                                     
...

EXPECTED RESULT
open these files in about 1 second

SOFTWARE/OS VERSIONS
Windows: 
macOS: 
Linux:    5.15.12-1-default #1 SMP Wed Dec 29 14:50:16 UTC 2021 (375fcb8) x86_64 x86_64 x86_64 GNU/Linux
OpenSuse Tumbleweed (updated on 12.01.2022)
Linux/KDE Plasma: 
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: Version 5.89.0
Qt Version: Version 5.15.2 (built against 5.15.2)

ADDITIONAL INFORMATION
Using the old 1.9.2 binary from another openSUSE machine restores the former speed, so the problem is not caused by the installed libraries or other system components.
cd /usr/bin/
su
mv kdiff3 kdiff3_1.9.4
mv <otherComputer>/kdiff3 kdiff3_1.9.2
ln -s kdiff3_1.9.2 kdiff3
Comment 1 michael 2022-01-14 18:30:10 UTC
Odd, not sure why that would be. The only code change that took place activates for Windows line endings, not Linux-style endings.
Comment 2 nyanpasu64 2022-02-07 06:17:40 UTC
Created attachment 146377 [details]
2000 line files with LF and CRLF endings. kdiff3 is fast on LF and slow on CRLF.

I ran into this slowness when I cloned a repo with CRLF endings (not my repo, not my decision), then ran `git difftool` which triggered kdiff3. It turns out that kdiff3 is far slower for CRLF line endings than LF. See the attached file.

I ran a perf analysis, which showed an unusually high amount of time spent in SourceData::FileData::preprocess(). Hotspot showed that most time was spent on lines 668, 664, and 680 (https://invent.kde.org/sdk/kdiff3/-/blob/1.9.4/src/SourceData.cpp#L664-680). I think line 680 is incorrect, since Hotspot says that 95.9% of cycles were spent calling QTextStream::pos().

My guess is that you call QTextStream::pos() O(file length) times, and each one takes O(file length) time to complete.
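
To put a rough number on that guess, here is a minimal cost model. It is not kdiff3 or Qt code: the function names are hypothetical, and the premise that one position query costs as much as the current byte offset is only the assumption stated above, not a measured property of QTextStream::pos().

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical cost model, not kdiff3 code. Assumption: a position query
// made at byte offset `off` costs `off` units of work.
constexpr std::uint64_t costWithPositionQueryPerLine(std::size_t lines,
                                                     std::size_t bytesPerLine)
{
    std::uint64_t total = 0;
    for (std::size_t i = 1; i <= lines; ++i)
    {
        // One query at the end of line i, where the offset is i * bytesPerLine.
        total += static_cast<std::uint64_t>(i) * bytesPerLine;
    }
    return total;
}

// Baseline: every byte is read exactly once, with no position queries.
constexpr std::uint64_t costWithSinglePass(std::size_t lines,
                                           std::size_t bytesPerLine)
{
    return static_cast<std::uint64_t>(lines) * bytesPerLine;
}
```

For 2000 lines of 60 bytes each (roughly the 120 kB files from the report), the model gives 120,060,000 units with a query per line against 120,000 for a single pass, about a thousand times more work. Under this assumption only CRLF files pay the cost, because LF lines never reach the position query.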
Comment 3 nyanpasu64 2022-02-07 06:24:11 UTC
Created attachment 146378 [details]
Perf trace of kdiff3 slowly loading the CRLF files (593 MB decompressed)

Additional observation: kdiff3 spent a long time on `org.kde.kdiff3: "Loading A: /home/nyanpasu64/tmp/kdiff3 slow/crlf/a"` (and likewise for B and C), and was very fast to display a window once C finished loading.

I'm attaching a perf.data if you're interested. The file is absurdly big because I ran perf with `--call-graph=dwarf`. I *think* it will show the correct function names when decompressed, since it was still readable after I reinstalled the system kdiff3.
Comment 4 michael 2022-02-23 20:37:24 UTC
*** Bug 450411 has been marked as a duplicate of this bug. ***
Comment 5 michael 2022-02-23 21:13:19 UTC
Just rewrote the offending section of code. The LF lines would not have triggered QTextStream::pos(), so it does seem to be the culprit. It turns out there were also odd EOL corner cases that didn't get detected correctly.

The new code should fix both issues.

            case '\r':
                if((FileOffset)lastOffset < mDataSize)
                {
                    prevChar = curChar;
                    curChar = ts.read(1).unicode()[0];

                    if(curChar == '\n')
                    {
                        vOrigDataLineEndStyle.push_back(eLineEndStyleDos);
                        break;
                    }
                    //work around for lack of seek API in QTextStream
                    skipNextRead = true;
                }

                //old mac style ending.
                vOrigDataLineEndStyle.push_back(eLineEndStyleUndefined);
                break;

The speed now should only depend on the time for QTextStream::read.
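
The same read-ahead idea can be sketched outside Qt. The version below is only an illustration under stated assumptions: it walks a std::string instead of a QTextStream, and the `Eol` enum and `detectLineEndings` function are hypothetical names, not kdiff3 identifiers (the snippet above records a lone '\r' as eLineEndStyleUndefined, which is called `OldMac` here). After a '\r' it reads one character ahead, and if that character is not '\n' it sets a flag so the character is processed on the next iteration instead of seeking backwards.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names for illustration; not the kdiff3 enum.
enum class Eol { Unix, Dos, OldMac };

// Classify every line ending in `text` in a single forward pass.
std::vector<Eol> detectLineEndings(const std::string& text)
{
    std::vector<Eol> styles;
    std::size_t next = 0;      // index of the next unread character
    char cur = '\0';
    bool skipNextRead = false; // true when `cur` holds a character not yet processed

    while (skipNextRead || next < text.size())
    {
        if (skipNextRead)
            skipNextRead = false; // reuse the character read as lookahead
        else
            cur = text[next++];

        if (cur == '\n')
        {
            styles.push_back(Eol::Unix);
        }
        else if (cur == '\r')
        {
            if (next < text.size())
            {
                cur = text[next++]; // read one character ahead
                if (cur == '\n')
                {
                    styles.push_back(Eol::Dos);
                    continue;
                }
                // Not part of this line ending: process it on the next iteration.
                skipNextRead = true;
            }
            styles.push_back(Eol::OldMac); // lone '\r'
        }
    }

    assert(next == text.size()); // every character was read exactly once
    return styles;
}
```

Because each character is read once and nothing queries the stream position, the work grows linearly with the input length. A sequence such as "\r\r\n" is classified as one old Mac ending followed by one DOS ending.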
Comment 6 michael 2022-02-23 22:26:48 UTC
Git commit 4f14cfb9efd58e1ebe22e1d4e126b779018a21c0 by Michael Reeves.
Committed on 23/02/2022 at 21:32.
Pushed by mreeves into branch 'master'.

Fix EOL detection issues

Use QStream::read to read next character for EOL detection
Avoid QStream::pos due to severe speed issues
Related: bug 450225
FIXED-IN:1.9.5

M  +14   -12   src/SourceData.cpp
M  +87   -1    src/autotests/datareadtest.cpp

https://invent.kde.org/sdk/kdiff3/commit/4f14cfb9efd58e1ebe22e1d4e126b779018a21c0
Comment 7 michael 2022-02-23 22:27:56 UTC
Git commit ac247986d4d24bb28cfc112d58e3d0a808057b1a by Michael Reeves.
Committed on 23/02/2022 at 21:35.
Pushed by mreeves into tag '1.9.5'.

Fix EOL detection issues

Use QStream::read to read next character for EOL detection
Avoid QStream::pos due to severe speed issues
Related: bug 450225
FIXED-IN:1.9.5

M  +14   -11   src/SourceData.cpp

https://invent.kde.org/sdk/kdiff3/commit/ac247986d4d24bb28cfc112d58e3d0a808057b1a
Comment 8 michael 2022-02-23 22:37:05 UTC
Git commit 96cc89bec01f14bf5fc980e6ae250ffebbd7164f by Michael Reeves.
Committed on 23/02/2022 at 22:33.
Pushed by mreeves into branch '1.9'.

Fix EOL detection issues

Use QStream::read to read next character for EOL detection
Avoid QStream::pos due to severe speed issues
Related: bug 450225
FIXED-IN:1.9.5

M  +14   -11   src/SourceData.cpp

https://invent.kde.org/sdk/kdiff3/commit/96cc89bec01f14bf5fc980e6ae250ffebbd7164f