| Summary: | not greedy regular expression | ||
|---|---|---|---|
| Product: | [Applications] kate | Reporter: | Un1c0 <un1c0> |
| Component: | kwrite | Assignee: | KWrite Developers <kwrite-bugs-null> |
| Status: | RESOLVED NOT A BUG | ||
| Severity: | wishlist | CC: | edward.81, jgardynik, kare.sars, kdedevel, kubry, lasse.liehu, stefanprobst, webmaster |
| Priority: | NOR | ||
| Version First Reported In: | unspecified | ||
| Target Milestone: | --- | ||
| Platform: | unspecified | ||
| OS: | Linux | ||
| Latest Commit: | Version Fixed/Implemented In: | ||
| Sentry Crash Report: | |||
|
Description
Un1c0
2007-06-01 10:23:24 UTC
The way QRegExp is dealing with non-greedy quantifiers is very unusual and bad in my eyes. People are used to making that decision per quantifier, not "globally" for the whole pattern. No idea why it was implemented that way. So I think it's rather unlikely that we will add a checkbox for that, since we would further push this bad concept in that case.
Sebastian

Otoh, if you have rather simple regexps like a.*b it still would work as expected, right?

I don't see how complexity of the pattern affects greediness. With the text "aaaaabaaaaab" your pattern "a.*b" should match "[aaaaabaaaaab]" with greedy on and "[aaaaab]aaaaab" with greedy off. Is that what you meant?

yes

Yes, I would like the ability to have non-greedy matches in find/replace too. Right now, you can't even decide non-greedy per quantifier.
value='(.*?)'
This causes the find button to gray out. Removing the ? restores the button. So it looks like it doesn't understand the concept of a non-greedy search at all.

(In reply to comment #6)
> So it looks like it doesn't understand the concept of a non-greedy search at all.
True. Getting this feature in can come from either:
- A feature extension upstream at QRegExp
- A switch of regex library at Kate level

So it looks like the options are:
- Implement setMinimal() as a checkbox
- Bug QT enough to support the standards for regexp with regards to .*?
- Switch to a more standard regexp library, which would introduce code bloat by adding another library.
Personally, I'd implement option setMinimal() and bug the QT devs to implement non-greedy searches in their patterns. That way kate has *something* that can do non-greedy, with the hope of eventually being able to do it by quantifier.

Just a note, it does look like it's been a feature request in QT for a *long* time.
http://www.qtsoftware.com/developer/task-tracker/index_html?id=116127&method=entry
I guess the QT developers don't consider supporting non-standard modifiers very important, even though practically every other library/program does.

We might be able to revisit this after the port to Qt5: http://qt-project.org/doc/qt-5.0/qtcore/qregularexpression.html

Information that is less important:
> Just a note, it does look like it's been a feature request in QT for a *long* time.
> http://www.qtsoftware.com/developer/task-tracker/index_html?id=116127&method=entry
> I guess the QT developers don't consider supporting non-standard modifiers very important, even though practically every other library/program does.
That feature request is now found in:
https://bugreports.qt-project.org/browse/QTBUG-130
It's briefly mentioned in:
http://qt-project.org/wiki/Regexp_engine_in_Qt5
A use case:
John is working at a company where the names of some files have changed: the suffix "-public" has been added to a lot of files. For example, instead of "default.css", the file is now "default-public.css", and instead of "custom.jpg", the file is now "custom-public.jpg". John has to change a lot of web pages, and is using Kate to make the changes in them.
John uses Kate to search for
folder_files/(.*)\.
and to replace it with
folder_files/\1-public\.
That works with lines like
<link href="folder_files/default.css" rel="stylesheet" type="text/css">
but it doesn't work with lines like
<link href="folder_files/default.css" rel="stylesheet" type="text/css"><link href="folder_files/custom_styles.css" rel="stylesheet" type="text/css">
because the greedy regular expression doesn't stop at the first dot, but at the last dot it can. That is to say, the text that Kate tries to replace in the last case is
folder_files/default.css" rel="stylesheet" type="text/css"><link href="folder_files/custom_styles.
instead of
folder_files/default.
Then John tries to use a non-greedy regular expression,
folder_files/(.*?)\.
but John sees that then the "find" button of Kate is disabled.
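The replacement John is after can be sketched outside Kate. This is only an illustration under stated assumptions: it uses `std::regex` (default ECMAScript grammar) as a stand-in engine rather than anything Kate or Qt uses, the helper name `add_public_suffix` is hypothetical, and the back-reference is written `$1` (the `std::regex_replace` syntax) instead of the `\1` shown above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites "folder_files/<name>." to
// "folder_files/<name>-public." for every match in `line`.
// `lazy` selects the non-greedy pattern (.*?) over the greedy one (.*).
std::string add_public_suffix(const std::string& line, bool lazy) {
    const std::regex re(lazy ? R"(folder_files/(.*?)\.)"
                             : R"(folder_files/(.*)\.)");
    // "$1" refers to the first capture group in std::regex_replace.
    return std::regex_replace(line, re, "folder_files/$1-public.");
}
```

With the two-link line from the use case, the non-greedy variant should rename both files, while the greedy variant only touches the last one, because its single match runs up to the final dot on the line.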
Information that is less important:
Another use case of a "non-greedy", "regular expression to stop at first match", can be seen in:
https://stackoverflow.com/questions/2503413/regular-expression-to-stop-at-first-match
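The "aaaaabaaaaab" case discussed earlier in this report shows the same "stop at first match" behaviour in its smallest form. A minimal sketch, assuming `std::regex` with its default ECMAScript grammar as a stand-in engine (it is not QRegExp, and not what Kate uses); `first_match` is a hypothetical helper introduced only for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first match of `pattern` in `text`,
// or an empty string when nothing matches.
std::string first_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}
```

Calling `first_match("aaaaabaaaaab", "a.*b")` should give the whole text, while `first_match("aaaaabaaaaab", "a.*?b")` should stop at the first `b` and give `aaaaab`. The choice is made by the `?` on that one quantifier, which is the per-quantifier control asked for in this report.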
> We might be able to revisit this after the port to Qt5: http://qt-project.org/doc/qt-5.0/qtcore/qregularexpression.html
Good comment. QRegularExpression seems to improve the situation so that this Kate bug can be solved (instead of seeing disabled buttons, as [said in comment 6](https://bugs.kde.org/show_bug.cgi?id=146239#c6)).
I've made a little C++11 Qt5 sample:

    #include <QTextStream>
    #include <QRegularExpression>

    static QTextStream cin(stdin, QIODevice::ReadOnly);
    static QTextStream cout(stdout, QIODevice::WriteOnly);
    static QTextStream cerr(stderr, QIODevice::WriteOnly);

    int main()
    {
        QRegularExpression greedyReguExp(R"(folder_files/(.*)\.)");
        QRegularExpression nonGreedyReguExp(R"(folder_files/(.*?)\.)");
        QString inputText = R"(
    <link href="folder_files/default.css" rel="stylesheet" type="text/css"><link href="folder_files/custom_styles.css" rel="stylesheet" type="text/css">
    )";
        cout << "Using a greedy regular expression:\n" << greedyReguExp.match(inputText).captured();
        cout << "\n\nUsing a non-greedy regular expression:\n" << nonGreedyReguExp.match(inputText).captured() << endl;
    }

and it shows:

    Using a greedy regular expression:
    folder_files/default.css" rel="stylesheet" type="text/css"><link href="folder_files/custom_styles.

    Using a non-greedy regular expression:
    folder_files/default.

So it looks like the problem can be solved with QRegularExpression (non-greedy regular expressions can be used, and Kate doesn't have to disable the buttons for its users).

Git commit cf4dd59f4b16f226b04679703df13efa20edfb67 by Kåre Särs.
Committed on 22/06/2015 at 19:14.
Pushed by sars into branch 'master'.

Port the Search plugin to use QRegularExpression in stead of QRegExp

This makes the search plugin able to use non-greedy regular expressions.
M +17 -12 addons/search/SearchDiskFiles.cpp
M +7 -7 addons/search/SearchDiskFiles.h
M +39 -22 addons/search/plugin_search.cpp
M +2 -1 addons/search/plugin_search.h
M +7 -5 addons/search/replace_matches.cpp
M +3 -3 addons/search/replace_matches.h
M +19 -14 addons/search/search_open_files.cpp
M +6 -6 addons/search/search_open_files.h

http://commits.kde.org/kate/cf4dd59f4b16f226b04679703df13efa20edfb67

Note that this is not the built-in search, but the search plugin. -> not closed

*** This bug has been confirmed by popular vote. ***

Dear user,

this wish list item is now closed, as it wasn't touched in the last year and no contributor stepped up to implement it.

The Kate/KTextEditor team is small and we can just try to keep up with fixing bugs. Therefore wishes that show no activity for a year or more will be closed from now on, to keep at least a bit of an overview of the 'current' wishes of the users.

If you want your feature to be implemented, please step up to provide some patch for it.

If you think it is really needed, you can reopen your request, but keep in mind, if no new good arguments are made and no people get attracted to help out to implement it, it will expire in a year again.

We have a nice website https://kate-editor.org that provides all the information needed to contribute, please make use of it.

Patches can be handed in via https://phabricator.kde.org/differential/

Greetings
Christoph Cullmann |