Bug 191736

Summary: Twitter More button broken in Konqueror
Product: [Applications] konqueror Reporter: Aaron Williams <aaronw>
Component: generalAssignee: Konqueror Developers <konq-bugs>
Status: RESOLVED FIXED    
Severity: normal CC: aaronw, germain, kde.bugs, kon_chr2000-linux
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: screengrab of "more" button, before and after pressing
workaround
ha, found back that old benchmark Maksim wrote, and tweaked it into a comparison/regression test
new patch, accounting for escapes and (.|\s)*
modified version with some more tests

Description Aaron Williams 2009-05-05 20:57:52 UTC
Version:           4.2.2 (KDE 4.2.2) "release 114" (using 4.2.2 (KDE 4.2.2) "release 114", KDE:KDE4:Factory:Desktop / openSUSE_11.0)
Compiler:          gcc
OS:                Linux (x86_64) release 2.6.25.20-0.1-default

When viewing entries on twitter.com the "more" button does not function. I have no problems with Firefox 3.0.10.
Comment 1 Aaron Williams 2009-05-28 04:12:36 UTC
I am seeing exactly the same behavior as described above. Additionally, the Update button is unresponsive. I have tried setting the user agent to Safari 2.0 and Firefox 2.0.8 and neither fixes the problem.
Comment 2 P B 2009-07-26 18:40:02 UTC
Created attachment 35646 [details]
screengrab of "more" button, before and after pressing
Comment 3 P B 2009-07-26 18:40:48 UTC
Ditto on KDE 4.2.98 (4.3 RC3). The button as displayed switches from saying "more" to a rotating circular graphic, but nothing else happens, even after an hour or more. Tested on multiple Twitter accounts. I was not logged in to Twitter, I don't have an account. Attachment shows, above, un-pressed button, and below, the rotating circular graphical element after the button is pressed.
Comment 4 P B 2010-01-28 13:26:16 UTC
Still broken in 4.3.5 (running on Kubuntu 9.10).

Amazed this is still showing as unconfirmed, it's simplicity itself to test. Go to any Twitter feed page, say http://twitter.com/stephenfry, scroll to the end of the page, and click the "More" button. Watch as nothing loads.

Please fix this, all it takes is one site a user needs regular access to to not work, and they will switch to another browser that does, e.g. Firefox. Twitter is one of the world's most popular websites these days, and so there is potential for losing a lot of users.
Comment 5 Maksim Orlovich 2010-01-28 16:44:23 UTC
If I recall correctly, this issue will go away once Twitter updates jQuery (and when a new version of it will be released). We can't really do anything sane to fix this, since the regexp it uses is simply too inefficient for the regular expression library we use to not run out of stack space on.
Comment 6 P B 2010-01-28 17:53:59 UTC
I see. So is this a problem with kjs or libpcre (or whatever the library is) then, as clearly other browsers are coping with this awful regexp? And the problem with kjs/library too deep rooted to fix?

Since we all know that the sites are never blamed for not working by the user if they know it works on their other computer/browser/at work/etc, it would still be great to fix this (and likely fixing it improves the library in other ways). Twitter aren't going to update without a reason, and this specific issue isn't an issue to them, just people who want to access Twitter via Konqueror.
Comment 7 Germain Garand 2010-01-29 06:12:06 UTC
Created attachment 40337 [details]
workaround

@Maksim : do you remember the workaround we discussed once?
I'll attach the last version of it (it is a bit simpler now), in case you want to consider adding it at some point.

I know this is not pleasing to consider, but I must say I've seen the debug output of it ("Pattern: xxx was rewritten to xxx") pass by while browsing an extremely worrying amount of times - and without it, my stack blows up at a measly 2048 chars :-(
Comment 8 Germain Garand 2010-01-29 09:39:28 UTC
Created attachment 40343 [details]
ha, found back that old benchmark Maksim wrote, and tweaked it into a comparison/regression test
Comment 9 Maksim Orlovich 2010-01-30 21:07:55 UTC
You may be right, this may be the pragmatic thing to do. Are you confident 
that the patch gets the captures exactly the same, though, and that there isn't any escaping to consider or something?

(And man, running this comparison benchmark on Opera 10.10 is hilarious).
Comment 10 Germain Garand 2010-01-31 06:59:34 UTC
> Are you confident that the patch gets the captures exactly the same,

yes, very confident. I removed the handling of the one form that I coudn't find a strict equivalent to, i.e (.|/s)* : "capture last char of zero or more" (I think it is impossible to rewrite in an efficient form, but I may be missing something)

you can see that substrings equivalence is already tested by the benchmark.

ironic thing is those piss poor regexp writters certainly did not intend to capture the last character at all.

>  and that there isn't any escaping to consider or something?

ouch, no, you are right, the patch is broken in that respect... [(.|/s)+] will be happily transformed. Will fix. Fortunately, we already have the parsing logic for that so it should be less painful than poking one's eye out ^^;

> (And man, running this comparison benchmark on Opera 10.10 is hilarious).

Wow! indeed. Older versions did not show such dramatic differences.
Comment 11 Germain Garand 2010-01-31 07:27:40 UTC
ahah, I'm stupid.

(.|\s)* == (?:[\w\W]*([\w\W])|([\w\W])*)

it doesn't matter that the right side of the alternative is inefficient, since it will only match in the 'zero characters' case :-)
Comment 12 Germain Garand 2010-01-31 10:27:59 UTC
Created attachment 40407 [details]
new patch, accounting for escapes and (.|\s)*
Comment 13 Germain Garand 2010-01-31 10:37:30 UTC
Created attachment 40409 [details]
modified version with some more tests

beware that expression R5 is so inefficient it will DOS all js engines.
chrome: Matching time for R5: 2079443ms :D
Comment 14 Chris Espy 2010-02-04 16:40:00 UTC
This is a noob question, but how do I apply to workaround?
Comment 15 Germain Garand 2010-02-08 15:04:33 UTC
so barring any further comments, I'll be queuing the patch for commit in a few days..
Comment 16 Germain Garand 2010-02-12 05:21:10 UTC
SVN commit 1088985 by ggarand:

rewrite a very common and extremely inefficient regular expression pattern
in a form that is equivalent but does not require a recursion.

this is several order of magnitude faster in any RE engine and prevents
libpcre from eating the stack.

BUG: 191736
CCBUG: 189904

 M  +78 -14    regexp.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1088985
Comment 17 Germain Garand 2010-02-12 17:34:05 UTC
SVN commit 1089179 by ggarand:

might want that too... even the dot is using such broken expressions!
----------
automatically merged revision 1088985:
rewrite a very common and extremely inefficient regular expression pattern
in a form that is equivalent but does not require a recursion.

this is several order of magnitude faster in any RE engine and prevents
libpcre from eating the stack.

BUG: 191736

 M  +78 -14    regexp.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1089179