Bug 462220 - Kate's search & replace interprets captured text in regex mode
Summary: Kate's search & replace interprets captured text in regex mode
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: search (show other bugs)
Version: 21.12.3
Platform: Ubuntu Linux
: NOR normal
Target Milestone: ---
Assignee: KWrite Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-24 22:08 UTC by Olivier Delande
Modified: 2022-11-28 10:08 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In: 22.12
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Olivier Delande 2022-11-24 22:08:12 UTC
SUMMARY

When using the Search and Replace tab in regex mode, Kate interprets regex escape sequences not only in the searched pattern and replacement string, but also in the captured text. The simpler "replace" interface (accessible through Edit/Replace...) does not seem to be affected.

STEPS TO REPRODUCE
1. Create a document with the following contents "\t" (without the double quotes).
2. Open the Search and Replace tab.
3. Enter "^.*$" (without the double quotes) in the "Find" field.
4. Enter "\0" (without the double quotes) in the "Replace" field.
5. Check "Enable regular expressions".
6. Click "Search", then "Replace Checked".

OBSERVED RESULT

The contents of the document are replaced with a tab character.

EXPECTED RESULT

The contents of the document should not change.

SOFTWARE/OS VERSIONS
Ubuntu 22.04.1 LTS
KDE Frameworks: 5.92.0
Qt: 5.15.3 (built against 5.15.3)

ADDITIONAL INFORMATION
Kate 23.03.70 (installed via Flatpak) is also affected, but the "Version" dropdown in this bug report form does not list that version.
Comment 1 Olivier Delande 2022-11-25 10:45:55 UTC
I am not familiar with Kate's codebase, but my impression is that the bug is in MatchModel::generateReplaceString, in addons/search/MatchModel.cpp.

The method computes the replacement text from the replacement pattern by applying various substitutions (for \0, \{0}, \L\0, \t, \n, etc.) sequentially rather than in parallel, with the unfortunate consequence that substituted text is subjected to the subsequent substitutions. The author seems to have anticipated a related issue, by temporarily replacing all occurrences of "\\" with "¤Search&Replace¤", probably to handle cases like "\\0".

It seems to me that a correct implementation would be to build the replacement text incrementally, by iterating through the replacement pattern in one pass, interpreting each special sequence as we go.
Comment 2 Kåre Särs 2022-11-25 12:52:29 UTC
Thanks for the report!

It took me a little while to understand the problem, but I can confirm this issue

In char string terms: In stead of replacing "\\t" with "\\t", as it should, it replaces it with '\t'
Comment 3 Kåre Särs 2022-11-27 10:54:32 UTC
Git commit 8bd136ad5c1c05a92ce91286765b20538a9abad9 by Kåre Särs.
Committed on 27/11/2022 at 10:53.
Pushed by sars into branch 'master'.

S&R: Do not do "\t" -> '\t' and similar for regex captures

M  +10   -6    addons/search/MatchModel.cpp

https://invent.kde.org/utilities/kate/commit/8bd136ad5c1c05a92ce91286765b20538a9abad9
Comment 4 Kåre Särs 2022-11-27 10:56:19 UTC
Git commit 554df48b3d35ce0c9e552016627106f860dafc5f by Kåre Särs.
Committed on 27/11/2022 at 10:55.
Pushed by sars into branch 'release/22.12'.

S&R: Do not do "\t" -> '\t' and similar for regex captures
FIXED-IN: 22.12

M  +10   -6    addons/search/MatchModel.cpp

https://invent.kde.org/utilities/kate/commit/554df48b3d35ce0c9e552016627106f860dafc5f
Comment 5 Olivier Delande 2022-11-27 20:33:03 UTC
Thanks, the patch looks like it will fix the issue. The one issue that remains with this implementation is that the string "¤Search&Replace¤" is reserved: all occurrences of this string (after replacement) are replaced with "\".
Comment 6 Kåre Särs 2022-11-28 10:08:15 UTC
Yes, "¤Search&Replace¤" is a reserved string, but it has been like that for over 9 years and I have not seen a bug report about it yet ;)