Summary: | [test case] getElementsByTagName does not find elements that are not visible | ||
---|---|---|---|
Product: | [Applications] konqueror | Reporter: | Paul Pacheco <paulpach> |
Component: | khtml ecma | Assignee: | Konqueror Developers <konq-bugs> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | faure, germain |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Gentoo Packages | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
Test case showing problem
Patch to fix getElementsByTagName problem with XHTML suggested patch Strict doctype transitional mode |
Description
Paul Pacheco
2004-08-02 19:04:00 UTC
Created attachment 6969 [details]
Test case showing problem
Test case showing the problem.
If opened with mozilla or IE, it finds 3 images,
if opened with konqueror, it finds 0 images.
Hi, The problem is not the invisible images, but the fact xhtml works with lowercase tags and you search with uppercase. I suppose the other browsers have some compatibility mode, so I'll attach a patch that also makes konq/html a bit more fogiving in this case. Cheers, Rob. Created attachment 7664 [details]
Patch to fix getElementsByTagName problem with XHTML
Re #3: Sorry but I don't think the patch is acceptable ; it's a more complicated issue. The behaviour you observe with Mozilla is because they implemented special compatibility rules for XHTML served as media type "text/html", that are defined here: http://www.w3.org/TR/xhtml-media-types/ This is a tricky section, full of "should" or "may", but basically, what we need to implement to follow this is: - if the document is an XHTML document and is served as "text/html" - then the lookups for elements by non-ns aware DOM HTML methods should be case insensitive, and the elements should be returned uppercase. Even some staff from W3C use the XHTML as "text/html" is HTML behavior. Like on this page: http://www.w3.org/DOM/Test/ The behaviour is also defined in http://www.w3.org/TR/xhtml1/#C_11 "user agents that access XHTML documents served as Internet media type text/html via the DOM can use the HTML DOM, and can rely upon element and attribute names being returned in upper-case from those interfaces." From a quick look at the code, this would need an additional param in createHTMLDocument() (coming from args.serviceType, to check whether it's text/html or application/xhtml+xml), passed to HTMLDocumentImpl's constructor, and stored there, and then using it in the line e->setHTMLCompat( htmlMode() != XHtml ) (html/html_documentimpl.cpp:211) or in HTMLDocumentImpl::determineParseMode() itself (depending on whether this should affect the parsing, or only the html-compat mode). Created attachment 7852 [details]
suggested patch
Hi David! the patch looks excellent... I wonder though about the alternative you describe in #6, if it should indeed affect parsing... because if it does, there's not much point in having an xhtml doctype at all? Does Mozilla do that too? Good point. Do you know how I could write a test that checks whether the parsing is done with XHTML or with HTML4-Compat mode? [We tried <script/> but that was a bad idea - htmltokenizer doesn't support it, it always looks for </script>.] BTW with xhtml doctype and .xhtml extension, Mozilla treats the file like XML - it shows the raw tree. This surprises me, I thought the idea behind xhtml was still to render it as HTML :) In fact there are very few differences between parsing modes within the Html parser... it is always very forgiving. The CSS Parser does make more differences - notably case sensitivity.
At the moment, Transitional and Compat behaves identically, so that's really just a difference between Strict and everything else ;(
(but there are very significative differences in _rendering_ between Strict mode and others - i.e the quirk mode).
For instance, given an XHtml document with doctype Strict, but served as "text/html",
the compatibility guidelines would require the DOM interface to be case insensitive, but the stylesheets still would be case sensitive...
> BTW with xhtml doctype and .xhtml extension, Mozilla treats the file like XML - it
> shows the raw tree. This surprises me, I thought the idea behind xhtml was still to
> render it as HTML :)
they are always overdoing it :)
Created attachment 7872 [details]
Strict doctype
Created attachment 7873 [details]
transitional mode
Mozilla follows strictly the guidelines, i.e only attribute and element names
are case insensitive and returned uppercase.
Thanks for the testcases, I have merged them into my regression/tests testcases. The css parsing thing didn't test the actual patch though (they work before and after), to answer whether it's ok to change the parse mode. But I found a testcase that shows that it's in fact not ok to do so. When publicId=strict and systemId=transitional, the code says "for XHTML, trust publicId, so be strict". But with my patch it now choose transitional since hMode != XHtml, whereas Mozilla does choose strict indeed [this particular testcase will be committed as tests/parser/compatmode_xhtml_mixed.html soon]. I also couldn't change the e->setHTMLCompat() line, there are in fact many such lines, all activating htmlcompat from many places. So I set htmlMode to non-xhtml at the end of determineParseMode, when all the rest has been done (so this doesn't influence e.g. the CSS parsing mode). CVS commit by faure: When the document is loaded as text/html, even if xhtml doctype, activate case-insensitive ("htmlCompat") lookup of tags and attributes (but not in CSS parser). (#86446) Regression tests: dom/namespaces.html dom/namespaces_xhtml_strict.html parser/compatmode* CCMAIL: 86446-done@bugs.kde.org M +6 -0 ChangeLog 1.297 M +2 -0 khtml_part.cpp 1.1030 M +5 -0 html/html_documentimpl.cpp 1.168 M +3 -1 html/html_documentimpl.h 1.76 --- kdelibs/khtml/html/html_documentimpl.cpp #1.167:1.168 @@ -71,4 +71,5 @@ HTMLDocumentImpl::HTMLDocumentImpl(DOMIm m_doAutoFill = false; + m_htmlRequested = false; /* dynamic history stuff to be fixed later (pfeiffer) @@ -382,4 +383,8 @@ void HTMLDocumentImpl::determineParseMod if ( hMode == XHtml ) pMode = publicId; + + // This needs to be done last, see tests/parser/compatmode_xhtml_mixed.html + if ( m_htmlRequested && hMode == XHtml ) + hMode = Html4; // make all tags uppercase when served as text/html (#86446) } // kdDebug() << "DocumentImpl::determineParseMode: publicId =" << publicId << " systemId = " << systemId << endl; --- kdelibs/khtml/html/html_documentimpl.h #1.75:1.76 @@ -74,4 +74,5 @@ public: void setAutoFill() { m_doAutoFill = true; } + void setHTMLRequested( bool html ) { m_htmlRequested = html; } HTMLCollectionImpl::CollectionInfo *collectionInfo(int type) { return m_collection_info+type; } @@ -83,4 +84,5 @@ protected: QMap<QString,HTMLMapElementImpl*> mapMap; bool m_doAutoFill; + bool m_htmlRequested; protected slots: --- kdelibs/khtml/khtml_part.cpp #1.1029:1.1030 @@ -1778,4 +1778,6 @@ void KHTMLPart::begin( const KURL &url, } else { d->m_doc = DOMImplementationImpl::instance()->createHTMLDocument( d->m_view ); + // HTML or XHTML? (#86446) + static_cast<HTMLDocumentImpl *>(d->m_doc)->setHTMLRequested( args.serviceType != "application/xhtml+xml" ); } #ifndef KHTML_NO_CARET --- kdelibs/khtml/ChangeLog #1.296:1.297 @@ -1,2 +1,8 @@ +2004-10-14 David Faure <faure@kde.org> + + * html/html_documentimpl.cpp (determineParseMode): + When the document is loaded as text/html, even if xhtml doctype, activate case-insensitive + ("htmlCompat") lookup of tags and attributes (but not in CSS parser). (#86446) + 2004-10-14 Allan Sandfeld Jensen <kde@carewolf.com> * rendering/*.*: WebCore merge/port of layouted->needsLayout |