| Summary: | the Version 1.9.4 is X-times slower than the Version 1.9.2. especially with lager files | ||
|---|---|---|---|
| Product: | [Applications] kdiff3 | Reporter: | GagoSoft <gagosoft> |
| Component: | application | Assignee: | michael <reeves.87> |
| Status: | RESOLVED FIXED | ||
| Severity: | major | CC: | nyanpasu64, sutupud |
| Priority: | NOR | ||
| Version First Reported In: | 1.9.4 | ||
| Target Milestone: | --- | ||
| Platform: | openSUSE | ||
| OS: | Linux | ||
| Latest Commit: | https://invent.kde.org/sdk/kdiff3/commit/96cc89bec01f14bf5fc980e6ae250ffebbd7164f | Version Fixed/Implemented In: | 1.9.5 |
| Sentry Crash Report: | |||
| Attachments: |
2000 line files with LF and CRLF endings. kdiff3 is fast on LF and slow on CRLF.
Perf trace of kdiff3 slowly loading the CRLF files (593 MB decompressed) |
||
|
Description
GagoSoft
2022-01-13 10:06:47 UTC
Odd not sure why that would be? The code only code change that took place activates for windows line endings not linux style endings. Created attachment 146377 [details] 2000 line files with LF and CRLF endings. kdiff3 is fast on LF and slow on CRLF. I ran into this slowness when I cloned a repo with CRLF endings (not my repo, not my decision), then ran `git difftool` which triggered kdiff3. It turns out that kdiff3 is far slower for CRLF line endings than LF. See the attached file. I ran a perf analysis, showing an unusually high amount of time spent in SourceData::FileData::preprocess(). Hotspot showed that most time was spent on lines lines 668, 664, and 680 (https://invent.kde.org/sdk/kdiff3/-/blob/1.9.4/src/SourceData.cpp#L664-680). I think line 680 is incorrect, since Hotspot says that 95.9% of cycles were spent calling QTextStream::pos(). My guess is that you call QTextStream::pos() O(file length) times, and each one takes O(file length) time to complete. Created attachment 146378 [details]
Perf trace of kdiff3 slowly loading the CRLF files (593 MB decompressed)
Additional observation: kdiff3 spent a long time on `org.kde.kdiff3: "Loading A: /home/nyanpasu64/tmp/kdiff3 slow/crlf/a" and B and C, and was very fast to display a window once C finished loading.
I'm attaching a perf.data if you're interested. The file is absurdly big because I ran perf with `--call-graph=dwarf`. I *think* it will show the correct function names when decompressed, since it's still readable when I reinstalled system kdiff3.
*** Bug 450411 has been marked as a duplicate of this bug. *** Just rewrote the offending section of code. The LF lines would not have triggered the QTextStream::pos() so it does seem to be the culprit. Turns there were odd EOL corner cases that the didn't get detected right
The new code should fix both issues.
case '\r':
if((FileOffset)lastOffset < mDataSize)
{
prevChar = curChar;
curChar = ts.read(1).unicode()[0];
if(curChar == '\n')
{
vOrigDataLineEndStyle.push_back(eLineEndStyleDos);
break;
}
//work around for lack of seek API in QTextStream
skipNextRead = true;
}
//old mac style ending.
vOrigDataLineEndStyle.push_back(eLineEndStyleUndefined);
break;
The speed now should only depend on the time for QTextStream::read.
Git commit 4f14cfb9efd58e1ebe22e1d4e126b779018a21c0 by Michael Reeves. Committed on 23/02/2022 at 21:32. Pushed by mreeves into branch 'master'. Fix EOL detection issues Use QStream::read to read next character for EOL detection Avoid QStream::pos due to severe speed issues Related: bug 450225 FIXED-IN:1.9.5 M +14 -12 src/SourceData.cpp M +87 -1 src/autotests/datareadtest.cpp https://invent.kde.org/sdk/kdiff3/commit/4f14cfb9efd58e1ebe22e1d4e126b779018a21c0 Git commit ac247986d4d24bb28cfc112d58e3d0a808057b1a by Michael Reeves. Committed on 23/02/2022 at 21:35. Pushed by mreeves into tag '1.9.5'. Fix EOL detection issues Use QStream::read to read next character for EOL detection Avoid QStream::pos due to severe speed issues Related: bug 450225 FIXED-IN:1.9.5 M +14 -11 src/SourceData.cpp https://invent.kde.org/sdk/kdiff3/commit/ac247986d4d24bb28cfc112d58e3d0a808057b1a Git commit 96cc89bec01f14bf5fc980e6ae250ffebbd7164f by Michael Reeves. Committed on 23/02/2022 at 22:33. Pushed by mreeves into branch '1.9'. Fix EOL detection issues Use QStream::read to read next character for EOL detection Avoid QStream::pos due to severe speed issues Related: bug 450225 FIXED-IN:1.9.5 M +14 -11 src/SourceData.cpp https://invent.kde.org/sdk/kdiff3/commit/96cc89bec01f14bf5fc980e6ae250ffebbd7164f |