Summary: | Crash while running query against bugs.kde.org | ||
---|---|---|---|
Product: | [Applications] konqueror | Reporter: | Josh Berry <des> |
Component: | khtml parsing | Assignee: | Konqueror Developers <konq-bugs> |
Status: | RESOLVED WORKSFORME | ||
Severity: | major | CC: | christian_weilbach, finex, kdedevel, maksim, pablo.pita, Regnaron |
Priority: | NOR | ||
Version: | 4.0 | ||
Target Milestone: | --- | ||
Platform: | Compiled Sources | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Attachments: |
Fast patch
Break loop also when the character "<" is the last one of the buffer
Stdout log to verify my previous patch |
Description
Josh Berry
2007-12-11 08:55:07 UTC
*** Bug 153803 has been marked as a duplicate of this bug. ***

*** Bug 153662 has been marked as a duplicate of this bug. ***

Presence of problem confirmed via code inspection: the doctype parsing code can walk outside of the string willy-nilly, and QString in Qt4 aborts on that (while the Qt3 one would return a fallback value). I think this is a borderline showstopper, given that some of the reports involve Wikipedia, and given the potential for wide impact and the nature of the regression. Allan, do you know that code well perchance? If so, it would be nice if you could take a look; otherwise I'll try to dig through it, I guess.

Yes, I know the code and I even have a patch to fix it applied to my local tree. I will see if I can extract it. However, my patch only fixes the crash and creates a new problem: the function never runs to completion.

Created attachment 22483 [details]
Fast patch
The patch probably needs some check in KHTMLPart::onFirstData, so the
determineDocType can be run again when more data is available.
The state post-patch is what it is in 3.5.x though, right?

*** Bug 153925 has been marked as a duplicate of this bug. ***

Created attachment 22514 [details]
Break loop also when the character "<" is the last one of the buffer
Try from command line:
konqueror http://es.wikipedia.org/wiki/Imagen:I_Wikiencuentro_en_la_Bahía_de_Cádiz_\(Asistentes\).jpg
For some reason, the buffer in parseDocTypeDeclaration with that URL is only "<". Therefore, there are no more characters after it and bang! The patch checks for that and it works here. By the way, just to introduce myself, I am the guy in the middle with my little daughter.

Created attachment 22516 [details]
Stdout log to verify my previous patch
This is the log from command line I got to verify my previous patch.
So konqueror loads the image successfully and all is fine.
Look at "my output" in HTMLDocumentImpl::parseDocTypeDeclaration :
konqueror(20349): BUFFER: "<"
konqueror(20349): index: 0 bf.len: 1
This gave me the hint of what was going on in the method. The point is that the
XML header is nonexistent. I mention this in case there is also another
bug somewhere else.
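
To illustrate the bounds check discussed above, here is a minimal standalone sketch in plain C++ (std::string rather than QString). It is not the khtml code and not either attached patch: scanForDoctype and DoctypeScan are hypothetical names, and the sketch assumes the doctype, if present, is the first thing after leading whitespace. It only shows one way a scanner could refuse to read past a buffer that is just "<" and report that it needs more data, so a caller could retry later.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not part of khtml.
enum class DoctypeScan { Found, NotFound, NeedMoreData };

// Checks whether buf starts (after optional whitespace) with "<!DOCTYPE",
// compared case-insensitively. The index is tested against buf.size()
// before every read, so a short buffer such as "<" is never overrun.
DoctypeScan scanForDoctype(const std::string& buf) {
    static const std::string marker = "<!DOCTYPE";
    std::size_t i = 0;

    // Skip leading whitespace without running off the end.
    while (i < buf.size() && std::isspace(static_cast<unsigned char>(buf[i])))
        ++i;

    for (std::size_t m = 0; m < marker.size(); ++m, ++i) {
        if (i >= buf.size())
            return DoctypeScan::NeedMoreData;  // buffer ended mid-marker
        const int c = std::toupper(static_cast<unsigned char>(buf[i]));
        if (c != marker[m])
            return DoctypeScan::NotFound;      // definitely not a doctype
    }
    return DoctypeScan::Found;
}
```

With this shape, a one-byte buffer "<" yields NeedMoreData instead of a crash or a premature "no doctype" decision, which is the retry behaviour suggested for KHTMLPart::onFirstData earlier in the thread.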
Excellent analysis, Pablo. Unfortunately it is a well-known issue; solving it correctly requires putting more responsibility in the HTML parser/tokenizer, and thus a larger rewrite. I think, though, that this is a new instance of the bug, because in KDE 3.5.x the HTTP-slave would never send just 1 byte. I would like to know what causes the HTTP-slave to send such a small buffer. It not only reveals this bug, but is also a waste of resources.

FYI, checking the stdout log I attached, I see in the BUFFER that all the headers are truncated:
pleira@barebone:~$ egrep -C 1 "BUFF|index:" log_from_stdout.log
konqueror(20349)/khtml KHTMLGlobal::ref: s_refcnt= 2
konqueror(20349): BUFFER: "<"
konqueror(20349): index: 0 bf.len: 1
konqueror(20349)/khtml (html) DOM::HTMLDocumentImpl::determineParseMode: using compatibility parseMode
--
konqueror(20349)/khtml KHTMLGlobal::ref: s_refcnt= 2
konqueror(20349): BUFFER: "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transi"
konqueror(20349): index: 0 bf.len: 51
konqueror(20349)/khtml (html) DOM::HTMLDocumentImpl::determineParseMode: using compatibility parseMode
--
konqueror(20349)/khtml KHTMLGlobal::ref: s_refcnt= 2
konqueror(20349): BUFFER: "<"
konqueror(20349): index: 0 bf.len: 1
konqueror(20349)/khtml (html) DOM::HTMLDocumentImpl::determineParseMode: using compatibility parseMode

*** Bug 154312 has been marked as a duplicate of this bug. ***

SVN commit 750614 by carewolf:
Don't crash bugs.kde.org and other places, even if we risk misdetermining doctype
CCBUG: 153827
M +17 -2 html_documentimpl.cpp
WebSVN link: http://websvn.kde.org/?view=rev&revision=750614

Can't reproduce, so I guess this is fixed by the patch sent by carewolf?

This does appear to be fixed in trunk. I can no longer reproduce either.

Cannot reproduce on r797319 either. Other people are confirming that the crash doesn't happen anymore in trunk. Someone should mark this as RESOLVED/FIXED (I don't have permission).

Ok |