| Summary: | [test case] getElementsByTagName does not find elements that are not visible | ||
|---|---|---|---|
| Product: | [Applications] konqueror | Reporter: | Paul Pacheco <paulpach> |
| Component: | khtml ecma | Assignee: | Konqueror Bugs <konqueror-bugs-null> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | faure, germain |
| Priority: | NOR | ||
| Version First Reported In: | unspecified | ||
| Target Milestone: | --- | ||
| Platform: | Gentoo Packages | ||
| OS: | Linux | ||
| Latest Commit: | Version Fixed/Implemented In: | ||
| Sentry Crash Report: | |||
| Attachments: | Test case showing problem; Patch to fix getElementsByTagName problem with XHTML; suggested patch; Strict doctype; transitional mode | ||
|
Description
Paul Pacheco
2004-08-02 19:04:00 UTC
Created attachment 6969 [details]
Test case showing problem
Test case showing the problem.
If opened with mozilla or IE, it finds 3 images,
if opened with konqueror, it finds 0 images.
Hi,

The problem is not the invisible images, but the fact that XHTML works with lowercase tags and you search with uppercase. I suppose the other browsers have some compatibility mode, so I'll attach a patch that also makes konq/html a bit more forgiving in this case.

Cheers,
Rob.

Created attachment 7664 [details]
Patch to fix getElementsByTagName problem with XHTML
Re #3: Sorry, but I don't think the patch is acceptable; it's a more complicated issue.

The behaviour you observe with Mozilla is because they implemented special compatibility rules for XHTML served as media type "text/html", which are defined here: http://www.w3.org/TR/xhtml-media-types/

This is a tricky section, full of "should" or "may", but basically, what we need to implement to follow it is:
- if the document is an XHTML document and is served as "text/html",
- then the lookups for elements by non-namespace-aware DOM HTML methods should be case insensitive, and the element names should be returned in uppercase.

Even some W3C staff rely on the "XHTML served as text/html is HTML" behaviour, as on this page: http://www.w3.org/DOM/Test/

The behaviour is also defined in http://www.w3.org/TR/xhtml1/#C_11
"user agents that access XHTML documents served as Internet media type text/html via the DOM can use the HTML DOM, and can rely upon element and attribute names being returned in upper-case from those interfaces."

From a quick look at the code, this would need an additional param in createHTMLDocument() (coming from args.serviceType, to check whether it's text/html or application/xhtml+xml), passed to HTMLDocumentImpl's constructor, and stored there, and then using it in the line e->setHTMLCompat( htmlMode() != XHtml ) (html/html_documentimpl.cpp:211) or in HTMLDocumentImpl::determineParseMode() itself (depending on whether this should affect the parsing, or only the html-compat mode).

Created attachment 7852 [details]
suggested patch
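
The compatibility rule discussed above (case-insensitive lookup and uppercase names when XHTML is served as text/html) can be illustrated with a small standalone sketch. This is not khtml code: `Element`, `reportedTagName` and `elementsByTagName` are hypothetical names invented for the illustration, and the `htmlCompat` flag only stands in for the idea behind setHTMLCompat().

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a DOM element: only the tag name
// as written in the source (lowercase in an XHTML document).
struct Element {
    std::string tagName;
};

// ASCII upper-casing helper.
static std::string toUpper(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}

// Name as reported through the DOM: uppercase in HTML-compat mode,
// unchanged otherwise.
std::string reportedTagName(const Element& e, bool htmlCompat) {
    return htmlCompat ? toUpper(e.tagName) : e.tagName;
}

// Lookup by tag name: case-insensitive in HTML-compat mode,
// exact (case-sensitive) match otherwise.
std::vector<const Element*> elementsByTagName(const std::vector<Element>& doc,
                                              const std::string& query,
                                              bool htmlCompat) {
    std::vector<const Element*> result;
    const std::string wanted = htmlCompat ? toUpper(query) : query;
    for (const Element& e : doc) {
        if (reportedTagName(e, htmlCompat) == wanted)
            result.push_back(&e);
    }
    return result;
}
```

With a document holding three lowercase `img` elements, a query for "IMG" finds all three when `htmlCompat` is true and none when it is false, which matches the difference reported in the test case.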
Hi David! The patch looks excellent... I wonder, though, about the alternative you describe in #6, whether it should indeed affect parsing... because if it does, there's not much point in having an XHTML doctype at all? Does Mozilla do that too?

Good point. Do you know how I could write a test that checks whether the parsing is done in XHTML or in HTML4-Compat mode? [We tried <script/> but that was a bad idea - htmltokenizer doesn't support it, it always looks for </script>.]

BTW with xhtml doctype and .xhtml extension, Mozilla treats the file like XML - it shows the raw tree. This surprises me, I thought the idea behind xhtml was still to render it as HTML :)

In fact there are very few differences between parsing modes within the HTML parser... it is always very forgiving. The CSS parser does make more differences - notably case sensitivity.
At the moment, Transitional and Compat behave identically, so it's really just a difference between Strict and everything else ;(
(but there are very significant differences in _rendering_ between Strict mode and the others, i.e. quirks mode).
For instance, given an XHTML document with a Strict doctype, but served as "text/html",
the compatibility guidelines would require the DOM interface to be case insensitive, but the stylesheets would still be case sensitive...
> BTW with xhtml doctype and .xhtml extension, Mozilla treats the file like XML - it
> shows the raw tree. This surprises me, I thought the idea behind xhtml was still to
> render it as HTML :)
they are always overdoing it :)
Created attachment 7872 [details]
Strict doctype
Created attachment 7873 [details]
transitional mode
Mozilla strictly follows the guidelines, i.e. only attribute and element names
are case insensitive and returned uppercase.
Thanks for the testcases, I have merged them into my regression/tests testcases.

The CSS parsing tests didn't exercise the actual patch though (they work before and after), so they don't answer whether it's OK to change the parse mode. But I found a testcase that shows that it's in fact not OK to do so. When publicId=strict and systemId=transitional, the code says "for XHTML, trust publicId, so be strict". But with my patch it now chooses transitional since hMode != XHtml, whereas Mozilla does indeed choose strict [this particular testcase will be committed as tests/parser/compatmode_xhtml_mixed.html soon].

I also couldn't change the e->setHTMLCompat() line; there are in fact many such lines, all activating htmlcompat from many places. So I set htmlMode to non-xhtml at the end of determineParseMode, when all the rest has been done (so this doesn't influence e.g. the CSS parsing mode).

CVS commit by faure:
When the document is loaded as text/html, even if xhtml doctype, activate case-insensitive
("htmlCompat") lookup of tags and attributes (but not in CSS parser). (#86446)
Regression tests: dom/namespaces.html dom/namespaces_xhtml_strict.html parser/compatmode*
CCMAIL: 86446-done@bugs.kde.org
M +6 -0 ChangeLog 1.297
M +2 -0 khtml_part.cpp 1.1030
M +5 -0 html/html_documentimpl.cpp 1.168
M +3 -1 html/html_documentimpl.h 1.76
--- kdelibs/khtml/html/html_documentimpl.cpp #1.167:1.168
@@ -71,4 +71,5 @@ HTMLDocumentImpl::HTMLDocumentImpl(DOMIm
m_doAutoFill = false;
+ m_htmlRequested = false;
/* dynamic history stuff to be fixed later (pfeiffer)
@@ -382,4 +383,8 @@ void HTMLDocumentImpl::determineParseMod
if ( hMode == XHtml )
pMode = publicId;
+
+ // This needs to be done last, see tests/parser/compatmode_xhtml_mixed.html
+ if ( m_htmlRequested && hMode == XHtml )
+ hMode = Html4; // make all tags uppercase when served as text/html (#86446)
}
// kdDebug() << "DocumentImpl::determineParseMode: publicId =" << publicId << " systemId = " << systemId << endl;
--- kdelibs/khtml/html/html_documentimpl.h #1.75:1.76
@@ -74,4 +74,5 @@ public:
void setAutoFill() { m_doAutoFill = true; }
+ void setHTMLRequested( bool html ) { m_htmlRequested = html; }
HTMLCollectionImpl::CollectionInfo *collectionInfo(int type) { return m_collection_info+type; }
@@ -83,4 +84,5 @@ protected:
QMap<QString,HTMLMapElementImpl*> mapMap;
bool m_doAutoFill;
+ bool m_htmlRequested;
protected slots:
--- kdelibs/khtml/khtml_part.cpp #1.1029:1.1030
@@ -1778,4 +1778,6 @@ void KHTMLPart::begin( const KURL &url,
} else {
d->m_doc = DOMImplementationImpl::instance()->createHTMLDocument( d->m_view );
+ // HTML or XHTML? (#86446)
+ static_cast<HTMLDocumentImpl *>(d->m_doc)->setHTMLRequested( args.serviceType != "application/xhtml+xml" );
}
#ifndef KHTML_NO_CARET
--- kdelibs/khtml/ChangeLog #1.296:1.297
@@ -1,2 +1,8 @@
+2004-10-14 David Faure <faure@kde.org>
+
+ * html/html_documentimpl.cpp (determineParseMode):
+ When the document is loaded as text/html, even if xhtml doctype, activate case-insensitive
+ ("htmlCompat") lookup of tags and attributes (but not in CSS parser). (#86446)
+
2004-10-14 Allan Sandfeld Jensen <kde@carewolf.com>
* rendering/*.*: WebCore merge/port of layouted->needsLayout
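
The ordering constraint in the commit above (pick the parse mode first, downgrade hMode last) can be sketched in isolation. This is a minimal model under stated assumptions, not the khtml implementation: the enums, the `Modes` struct and the function signatures are hypothetical, and the way non-XHTML documents choose their parse mode is a placeholder, since the real rules are not shown in this report.

```cpp
#include <string>

enum class HtmlMode { Html3, Html4, XHtml };
enum class ParseMode { Compat, Transitional, Strict };

struct Modes {
    HtmlMode hMode;   // drives tag/attribute case handling in the DOM
    ParseMode pMode;  // drives strict vs. non-strict behaviour
};

// Mirrors the check added in khtml_part.cpp: anything not served as
// application/xhtml+xml counts as "HTML requested".
bool htmlRequested(const std::string& serviceType) {
    return serviceType != "application/xhtml+xml";
}

// Hypothetical simplification: the inputs are the modes already derived
// from the doctype and from its public and system identifiers.
Modes determineParseMode(HtmlMode doctypeMode, ParseMode publicId,
                         ParseMode systemId, bool requestedAsHtml) {
    HtmlMode hMode = doctypeMode;

    // ASSUMPTION: placeholder rule for non-XHTML documents.
    ParseMode pMode = systemId;

    // For XHTML, trust publicId (as in the original code).
    if (hMode == HtmlMode::XHtml)
        pMode = publicId;

    // Done last, so the downgrade cannot influence the choice of pMode.
    if (requestedAsHtml && hMode == HtmlMode::XHtml)
        hMode = HtmlMode::Html4;

    return Modes{hMode, pMode};
}
```

For the mixed case from the discussion (publicId strict, systemId transitional, served as text/html), this model keeps pMode at Strict while hMode becomes Html4. Performing the downgrade before the XHtml check would instead fall through to the non-XHTML rule, which is the regression described in the comment preceding the commit.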
|