Bug 149191

Summary: Browser closes when hitting page history on wikidot.com
Product: [Applications] konqueror Reporter: Dawid Ciezarkiewicz <dawid.ciezarkiewicz>
Component: kjsAssignee: Konqueror Developers <konq-bugs>
Status: RESOLVED FIXED    
Severity: crash CC: adecorte, bluedzins, bsittler, christian.fontana, debian, jamluca, maksim, ssj2micvm
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Latest Commit: Version Fixed In:

Description Dawid Ciezarkiewicz 2007-08-25 13:36:22 UTC
Version:           3.5.7 (using KDE KDE 3.5.7)
Installed from:    Unlisted Binary Package
OS:                Linux

How to reproduce:
 - go to any wiki hosted on wikidot.com
 - find any nice looking page
 - hit the "history" button at the bottom of the page

You should see history panel appearing and after a 100ms konqueror disappearing.
Comment 1 Maciej Pilichowski 2007-08-25 14:37:28 UTC
I can cofirm it, example link (was that so hard to add this to report?):
http://wiki.polishlinux.org/
Comment 2 Maciej Pilichowski 2007-08-25 14:39:09 UTC
PS. Could someone please change the status to "crash"? Thanks in advance.
Comment 3 Pino Toscano 2007-08-25 14:41:04 UTC
PS. Could someone please provide a backtrace for the "crash"? Thanks in advance.
Comment 4 Maciej Pilichowski 2007-08-25 15:03:33 UTC
Pino, not every crash enables crash log mechanism -- Konqueror just folds and that's it. But it is a crash.
Comment 5 Pino Toscano 2007-08-25 15:11:29 UTC
... and? That does not mean you cannot get one. Read http://techbase.kde.org/Development/Tutorials/Debugging/How_to_create_useful_crash_reports for more info about that.
Comment 6 Maciej Pilichowski 2007-08-25 15:19:39 UTC
Pino, thanks for the link. Is this any help?

(gdb) run
Starting program: /opt/kde3/bin/konqueror
Failed to read a valid object file image from memory.
[Thread debugging using libthread_db enabled]
[New Thread -1232631584 (LWP 9690)]
konqueror: couldn't get KParts::StatusBarExtension

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1232631584 (LWP 9690)]
0xb5da38f7 in pcre_dfa_exec () from /usr/lib/libpcre.so.0
Comment 7 Sune Vuorela 2007-08-25 15:20:38 UTC
hi! I also see this issue when trying. It looks like a bad bug in libpcre3 - unless konqueror feeds it with invalid info.

(gdb) run --nocrashhandler
Starting program: /usr/bin/konqueror --nocrashhandler
[Thread debugging using libthread_db enabled]
[New Thread -1234617536 (LWP 807)]
Qt: gdb: -nograb added to command-line options.
         Use the -dograb option to enforce grabbing.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1234617536 (LWP 807)]
0xb5bd2af7 in ?? () from /usr/lib/libpcre.so.3
(gdb) bt
#0  0xb5bd2af7 in ?? () from /usr/lib/libpcre.so.3
Cannot access memory at address 0xbf5a5f40
(gdb) bt full
#0  0xb5bd2af7 in ?? () from /usr/lib/libpcre.so.3
No symbol table info available.
Cannot access memory at address 0xbf5a5f40
(gdb) exit
Undefined command: "exit".  Try "help".
(gdb) quit

I can try building libpcre with debug info.

/Sune
Comment 8 Sune Vuorela 2007-08-25 15:35:31 UTC
From http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=400121 - people tells the following:

Mark Baker wrote:
> There is a limit recursion feature of PCRE, which the calling program
> could use.

There appears to be two options here:
1) Punt back to Konqueror, and get them (or whatever they're calling 
that uses libpcre3 - note that libpcre3 is *not* in the dependancies of 
konqueror 4:3.5.5a.dfsg.1-2, so this probably needs doing somewhere 
else) and get them to use the recursion limiting of pcre to clip it at a 
few thousand - your stack trace got to 8156 iterations, so clipping at 
say 4000 or so should be plenty for most sane things while not limiting 
most applications. This is a bit of a hack around, but would fix the 
security issue (well, unless some user sets their stack size even smaller)

end paste.

Such a limit would be nice. This is not the only place set of regexps where konqueror can crash because of stack overflows in libpcre3. This could be treated as a security issue because remote sites can deliberately crash your browser.

/Sune
Comment 9 Sune Vuorela 2007-08-25 17:27:39 UTC
Proposed patch. I guess there is some reordering I got wrong, especially the location of the initial hunk in the patch. Does it maybe belong in regexp.h ?

Thanks to eroyf for learning me c++ ;)


 diff -pruN /home/pusling/svn/kde-3.5-branch/kdelibs/kjs/regexp.cpp  regexp.cpp
--- /home/pusling/svn/kde-3.5-branch/kdelibs/kjs/regexp.cpp     2007-04-09 23:04:53.000000000 +0200
+++ regexp.cpp  2007-08-25 17:20:03.000000000 +0200
@@ -31,6 +31,7 @@

 using namespace KJS;

+   struct pcre_extra extras;
 #ifdef PCRE_CONFIG_UTF8
 RegExp::UTF8SupportState RegExp::utf8Support = RegExp::Unknown;
 #endif
@@ -46,6 +47,8 @@ RegExp::RegExp(const UString &p, int f)
     utf8Support = supported ? Supported : Unsupported;
   }
 #endif
+   extras.flags = 0x0010 ;
+   extras.match_limit_recursion = 1000;

   nrSubPatterns = 0; // determined in match() with POSIX regex.

@@ -339,7 +342,7 @@ UString RegExp::match(const UString &s,
     utf8Support == Supported ? PCRE_NO_UTF8_CHECK :
 #endif
     0;
-  if (pcre_exec(pcregex, NULL, buffer, bufferSize, startPos,
+  if (pcre_exec(pcregex, &extras, buffer, bufferSize, startPos,
                 m_notEmpty ? (PCRE_NOTEMPTY | PCRE_ANCHORED | baseFlags) : baseFlags, // see man pcretest
                 ovector ? *ovector : 0L, ovecsize) == PCRE_ERROR_NOMATCH)
   {
@@ -353,7 +356,7 @@ UString RegExp::match(const UString &s,
       fprintf(stderr, "No match after m_notEmpty. +1 and keep going.\n");
 #endif
       m_notEmpty = 0;
-      if (pcre_exec(pcregex, NULL, buffer, bufferSize, nextPos, baseFlags,
+      if (pcre_exec(pcregex, &extras, buffer, bufferSize, nextPos, baseFlags,
                     ovector ? *ovector : 0L, ovecsize) == PCRE_ERROR_NOMATCH)
         return UString::null;
     }
Comment 10 Maksim Orlovich 2007-08-25 17:54:52 UTC
Yeah, this is one option, it means the RegExp matching doesn't work properly.
The -right- thing to do is to compile libpcre to not use the machine stack.
(This is a dupe, BTW)
Comment 11 Maksim Orlovich 2007-08-25 18:06:11 UTC
OK, there -is- a way of using this more sanely:
There is now PCRE_CONFIG_STACKRECURSE, so we can test whether the limit 
is needed or not... OTOH, I really don't like this difference in behavor depending on PCRE config.. But it's clearly better than crashing, and one 
can cripple our JS code by building PCRE w/o utf8 anyway..
Comment 12 Maksim Orlovich 2007-08-25 18:10:32 UTC
*** Bug 137072 has been marked as a duplicate of this bug. ***
Comment 13 Sune Vuorela 2007-08-25 18:35:27 UTC
Hi!

A quick test seems to show that at least debian, gentoo and suse in the default configuration all haven't pcre compiled with --disable-stack-for-recursion, so I think that testing wether it is set or not and then use a limit like in my patch sounds like a really good idea.

/Sune
Comment 14 Sune Vuorela 2007-08-25 18:57:11 UTC
Like this?

As far as I can read from the documentation, if the flags isn't set correctly, it is as if the values was never set, so this seems to be the right solution.

It is not thoroughly tested though.

--- /home/pusling/svn/kde-3.5-branch/kdelibs/kjs/regexp.cpp     2007-04-09 23:04:53.000000000 +0200
+++ kjs/regexp.cpp      2007-08-25 18:46:46.000000000 +0200
@@ -31,6 +31,7 @@

 using namespace KJS;

+   struct pcre_extra extras ;
 #ifdef PCRE_CONFIG_UTF8
 RegExp::UTF8SupportState RegExp::utf8Support = RegExp::Unknown;
 #endif
@@ -46,6 +47,13 @@ RegExp::RegExp(const UString &p, int f)
     utf8Support = supported ? Supported : Unsupported;
   }
 #endif
+    int stackrecurse;
+    pcre_config(PCRE_CONFIG_STACKRECURSE, (void*)&stackrecurse);
+    if (stackrecurse==1)
+    {
+   extras.flags = 0x0010 ;
+   extras.match_limit_recursion = 1000;
+    }

   nrSubPatterns = 0; // determined in match() with POSIX regex.

@@ -339,7 +347,7 @@ UString RegExp::match(const UString &s,
     utf8Support == Supported ? PCRE_NO_UTF8_CHECK :
 #endif
     0;
-  if (pcre_exec(pcregex, NULL, buffer, bufferSize, startPos,
+  if (pcre_exec(pcregex, &extras, buffer, bufferSize, startPos,
                 m_notEmpty ? (PCRE_NOTEMPTY | PCRE_ANCHORED | baseFlags) : baseFlags, // see man pcretest
                 ovector ? *ovector : 0L, ovecsize) == PCRE_ERROR_NOMATCH)
   {
@@ -353,7 +361,7 @@ UString RegExp::match(const UString &s,
       fprintf(stderr, "No match after m_notEmpty. +1 and keep going.\n");
 #endif
       m_notEmpty = 0;
-      if (pcre_exec(pcregex, NULL, buffer, bufferSize, nextPos, baseFlags,
+      if (pcre_exec(pcregex, &extras, buffer, bufferSize, nextPos, baseFlags,
                     ovector ? *ovector : 0L, ovecsize) == PCRE_ERROR_NOMATCH)
         return UString::null;
     }
Comment 15 Maksim Orlovich 2007-10-03 21:13:36 UTC
*** Bug 150454 has been marked as a duplicate of this bug. ***
Comment 16 Tommi Tervo 2007-10-12 08:13:09 UTC
*** Bug 150558 has been marked as a duplicate of this bug. ***
Comment 17 Tommi Tervo 2007-10-12 14:41:43 UTC
*** Bug 150753 has been marked as a duplicate of this bug. ***
Comment 18 Tommi Tervo 2007-11-17 17:52:43 UTC
*** Bug 152468 has been marked as a duplicate of this bug. ***
Comment 19 Christian Fontana 2007-11-25 13:19:57 UTC
Hi All.
The same errors appears with www.corriere.it and the click on a news.

Now I've installed last stable Kubuntu with KDE 3.5.8

thanks
Comment 20 Maksim Orlovich 2007-11-27 20:28:33 UTC
Base for testcase: (?:<script.*?>)((\n|\r|.)*?)(?:</script>) 
Comment 21 Maksim Orlovich 2008-01-13 19:41:39 UTC
SVN commit 760932 by orlovich:

Limit stack usage of libPCRE (and raise an exception when it runs out of 
stack space, for diagnosibility).

Also, do not accept some super old (>4 year old) pcre versions; 
as they can severely cripple regexp support, and intefere with 
this bugfix. Also tweak the message about missing PCRE in configure 
check --- libPCRE doesn't result in "better" regexp support; the support
w/o it is a last-resort fallback...

Based on patch by Sune Vuorela (username debian, hostname pusling, tld com)
BUG:149191
BUG:151477


 M  +10 -1     CMakeLists.txt  
 M  +22 -8     regexp.cpp  
 M  +1 -1      regexp.h  
 M  +19 -3     regexp_object.cpp  
 M  +6 -1      regexp_object.h  
 M  +13 -6     string_object.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=760932
Comment 22 Maksim Orlovich 2008-01-13 20:06:59 UTC
SVN commit 760945 by orlovich:

Regression test for #149191, #151477
CCBUG:149191
CCBUG:151477


 M  +7 -0      RegExp.js  


WebSVN link: http://websvn.kde.org/?view=rev&revision=760945