Version: 2.4 (using KDE 3.3.89 (CVS >= 20041129), compiled sources) Compiler: gcc version 3.3.3 (SuSE Linux) OS: Linux (i686) release 2.6.5-7.111.30-default Type something like \begin{document} and search backwards for the regular expression '\\begin\{([A-Za-z]+)\}'. The LaTeX environment will be found, when it starts in column 0. In all other cases it won't be found. That is it will find \begin{document} but it won't find test\begin{document} Forward search works in all cases. It is very important for Kile to get this to work again, hopefully before KDE 3.4. best, Jeroen
Note that KTextEditor::SearchInterface::searchText(...) suffers from the same problem. So perhaps the bug is in the part.
On Wednesday 23 February 2005 19:11, Jeroen Wijnhout wrote: > It is very important for Kile to get this to work again, hopefully before > KDE 3.4. It works completely fine here. Anyone else? -anders
There is no KDE_3_4 branch yet, right? So we must be using the same version. Unless the Qt version matters. I will do some more testing tomorrow.
I have Qt 3.3.4. in KDE 3.3.2 (Gentoo) it does not work. in HEAD it *works* for me.
On Wednesday 23 February 2005 20:29, Jeroen Wijnhout wrote: > There is no KDE_3_4 branch yet, right? So we must be using the same > version. Unless the Qt version matters. I will do some more testing > tomorrow. I'm using cvs HEAD which is unchanged since I committed my last changes yesterday. The Qt version should not matter. I have no idea what the problem was, or when it was fixed, even it is probably done by me. Maybe I can find it in the commit log. But it should work in cvs HEAD no matter what. -anders
I can confirm that it works in HEAD (my build system was a bit messed up, fixed that now). So the bug is only in KDE 3.3.x. I would appreciate if you could give a hint on how to workaround this, since Kile should work with KDE 3.2,3.3 and 3.4. best, Jeroen
On Thursday 24 February 2005 11:11, Jeroen Wijnhout wrote: > I would appreciate if you could give a hint on how to workaround this, > since Kile should work with KDE 3.2,3.3 and 3.4. Well, it seems to be a bug. I can't see how it can be worked around, except from fixing the code. But we don't release those branches, so even after doing that, it would require your users built from the branch. The change that does the difference is in kdelibs/kate/part/katetextline.cpp. The following diff is the result of me fixing the issue for regex search in hte 3.3 branch locally: Index: katetextline.cpp =================================================================== RCS file: /home/kde/kdelibs/kate/part/katetextline.cpp,v retrieving revision 1.77.4.4 diff -u -b -r1.77.4.4 katetextline.cpp --- katetextline.cpp 28 Sep 2004 08:21:10 -0000 1.77.4.4 +++ katetextline.cpp 24 Feb 2005 11:25:58 -0000 @@ -253,6 +253,9 @@ if (backwards) { int col = startCol; + + // allow finding the string ending at eol + if ( col == m_text.length() ) startCol++; do { index = regexp.searchRev (m_text, col); col--; There is a similar fix for text search, which should be applied as well to backport the whole fix, it's marked with the same comment in katetextline.cpp in HEAD. The methods to fix are both versions of KateTextLine::searchText(). Would it be of any value to you if I backported the fix to both 3_3 and 3_2 branches? -anders
On Thursday 24 February 2005 12:32 pm, Anders Lund wrote: > There is a similar fix for text search, which should be applied as well to > backport the whole fix, it's marked with the same comment in > katetextline.cpp in HEAD. The methods to fix are both versions of > KateTextLine::searchText(). > > Would it be of any value to you if I backported the fix to both 3_3 and 3_2 > branches? No, don't bother. At least now I know where it is coming from. Guess I will have to encourage KDE 3.3 users to upgrade to KDE 3.4 ;-) Thanks for looking into it. best, Jeroen
QRegExp::searchRev() does not _have_ a bug, the function _is_ itself a bug and should be avoided altogether (explaination below, sorry if it is a bit lengthy). QRegExp::searchRev() confuses a lot of people, who wrongly think that is searches a regexp backwards from the end of a string, and therefore is symmetrical to QRegExp::search(). This is *NOT* what QRegExp::searchRev() does. What it does is both complicated and counterintuitive, two reasons making it worse than just useless. Here is how I learned exactly how QRegExp::searchRev() works. I filed a bug to Trolltech about this in December ([Issue N62377]) because of another bug: http://bugs.kde.org/show_bug.cgi?id=94726 > Hello, > > Qt version: qt-3.3.3 > Compiled from sources. > Plarform: Linux, i686 > Compiler: gcc-3.4.2 (Fedora Core 3) and gcc-3.3.3 (Fedora Core 1) > > I think that QRegExp::searchRev() has a bug, demonstrated by the following > code: > > #include <iostream> > #include <qregexp.h> > #include <qstring.h> > > using namespace std; > > int main (int argc, char *argv[]) { > QString name = "foo123.png"; > QRegExp rx("\\d+"); > > int i = rx.search(name); > int j = rx.matchedLength(); > > cout << "search i=" << i << " j=" << j << endl; > > i = rx.searchRev(name); > j = rx.matchedLength(); > > cout << "searchRev i=" << i << " j=" << j << endl; > > } > > $ ./test_regexp > search i=3 j=3 > searchRev i=5 j=1 > > It shows that the QRegExp::searchRev() function matches only the last digit > of the regexp ("3") instead of the whole thing ("123"). > QRegExp::search() behaves correctly (i.e matches "123"). > And here is their answer: > Hi Philippe, > > On Sunday, 12. Dec 2004 00:06 Philippe Rigault wrote: > > [snip] > > > It shows that the QRegExp::searchRev() function matches only the last > > digit of the regexp ("3") instead of the whole thing ("123"). > > QRegExp::search() behaves correctly (i.e matches "123"). > > This is actually by design, QRegExp::searchRev() will still search > forward, but it will just start at the end, and work it's way > backwards. > > Hope this helps. > > -- > Jan Erik Hanssen > Trolltech AS, Waldemar Thranes gate 98, NO-0175 Oslo, Norway You can now figure out what QRegExp::searchRev() does in plain english: Given a string, it searches _forward_ for a regexp in successive substrings consisting first of the last character of the string, then the last two characters, and so on. Contrast it with QRegExp::search(), which also searches forward for a regexp in successive substrings, but these substrings consist first of the entire string, then the substring starting at the second character, and so on. This design of QRegExp::searchRev(), in addition to being counterintuitive, poses a major problem with regards to greediness, as I objected to Trolltech: > Hi Jan, > > Thank you for your rapid response. > > This renders searchRev() directly incompatible with the very notion of > greediness in regular expressions with quantifiers. > What people expect is that: > - with setMinimal(FALSE), the search is greedy > - with setMinimal(TRUE), the search is notgreedy > > Your definition of QRegExp::searchRev() _guarantees_ that the search can > never be greedy, and that setMinimal() has no effect on > QRegExp::searchRev(), which is unlike any other function and absolutely > counterintuitive. > > If this is really the case (and then QRegExp::searchRev() would be a pretty > useless function because it cannot use quantifiers properly), this needs to > be _heavily_ documented. > > Now I think that the design part is the bug. > > Best regards, My opinion is therefore that QRegExp::searchRev() should always be avoided: - because in most cases, a proper use of QRegExp::search() can do the job (for example, if one really wants the symmetrical action of QRegExp::search(), then use it on the reversed string) - because it is incompatible with greediness, as demonstrated above - because each inclusion of this function in the code would require _heavy_ warning comments to remind people that this is not a straightforward function. Best regards, Philippe Rigault
On Friday 25 February 2005 15:33, Philippe Rigault wrote: > What it does is both complicated and counterintuitive, two reasons making > it worse than just useless. I'm not sure why you suddenly attack qregexp. The problem is solved, it does work in current kate. -anders
> I'm not sure why you suddenly attack qregexp. The problem is solved, it does > work in current kate. Please read my comment correctly. I am _encouraging_ people to use qregexp correctly by avoiding the only one of its member functions (searchRev) that is problematic, and I have explained why. I wish people used regexes more, and more correctly :-) The whole problem with this bogus function is that it gives the false impression that it brings useful functionality, whereas in fact it dramatically contrasts with some fundamental properties expected by regexp users, greediness being a _very_ fundamental one. For example, you will never by able with this function to capture something as simple as the last consecutive string of digits in a string (something easily done with search()). How is that for a regexp member function called searchRev() ? Problem solved or not for this bug, in my opinion any use of QRegExp::searchRev() is a time bomb, ready to cause havoc by the mere addition of quantifiers (because it scans forward). People start using it with a trivial regexp just because it avoids a loop or crafting a correct regexp, but when the regexp becomes a real one, with quantifiers and greediness, bugs will inevitably arise because QRegExp::searchRev() by its design does not work as a classical regexp engine, or as its name would suggest. Fortunately, I have seen in the whole KDE only a few uses of QRegExp::searchRev(), all of them easily replaceable with search(), but all of them will cause problems in the future should the quantifiers require greediness.
You need to come up with a suggestion as to what to do instead. And why doesn't trolltech fix the bug in their code instead of us having to work around it?
On Friday 25 February 2005 14:33, Philippe Rigault wrote: > QRegExp::searchRev() does not _have_ a bug, the function _is_ > itself a bug and should be avoided altogether (explaination below, sorry if > it is a bit lengthy). [...] I do agree that Kate's behaviour is incorrect here, but your objection to searchRev() doesn't seem right to me. Imagine I have the following regular expression: abcdef|[b-e]+ And I'm matching against the string: foobarabcdefg Now, where search() match this? Well, it matches at (in the order that the string is found): start:end (matching letters in capitals) 3:4 (fooBarabcdefg) 6:12 (foobarABCDEFg) 7:11 (foobaraBCDEfg) 8:11 (foobarabCDEfg) 9:11 (foobarabcDEfg) 10:11 (foobarabcdEfg) Right? So, what can a reverse search for this regexp possibly do? If you want it symmetric with search(), searchRev() *must* find 10:11 first, like it currently does. Then it must find 9:11. Then 8:11. And so on. These are the latest-starting greedy matches. If you want the latest-finishing greedy matches, it'd have to match in this order: 6:12 (foobarABCDEFg) 7:11 (foobaraBCDEfg) 7:10 (foobaraBCDefg) 7:9 (foobaraBCdefg) 7:8 (foobaraBcdefg) 3:4 (fooBarabcdefg) which is not a reverse search: the starting point of the match is sometimes moving *forwards*: it finds 'abcdef' then finds 'bcde'. And it's certainly not symmetric with search(): it finds different matches! How would you like reverse search with regexps to work at the Qt API level? Now, how Kate deals with regexp searching is another matter. Using the same regexp and search string, it matches: 3:4 (fooBarabcdefg) 6:12 (foobarABCDEFg) and no other places. Reverse search should clearly find the same things in the opposite order, so should match: 6:12 (foobarABCDEFg) 3:4 (fooBarabcdefg) but actually matches: 10:11 (foobarabcdEfg) 6:12 (fooBarabcdefg) Interestingly, in this case, the last-finishing greedy matches are exactly what's wanted here: if we look for nonintersecting matches backwards using last-finishing matching, we match at 6:12 then at 3:4, like we should (and this works in general too). Unfortunately, Qt doesn't provide a function to do this kind of searching, which makes it non-trivial for Kate to implement. There is a way, however (but it is nasty to implement): you can reverse the regexp (which is hard) and reverse the search string (which is much easier): Regexp: [b-e]+|fedcba string: gfedcbaraboof With these, Kate now matches: gFEDCBAraboof gfedcbaraBoof which, when reversed back, give us the correct matches in the correct order. This reversing is equivalent to doing a last-finishing greedy match. Clearly, this isn't going to be implemented for KDE3.4, but does seem to be a valid Kate bug.
There is actually no way to change the greediness of the regex search through the GUI, and it looks like our search code blindly accepts Qts defaults. Of course we'd have to manually retri untill matching failed when searching backwards no matter what. I'm not sure if it's worth it. I see this a a bug in QRegExp mostly, except maybe we do wrong in limiting the string the way we do when searching backward (that must be why we have less matches than a bare regex). In any way, getting it to work would be nice indeed. -anders