Version: 1.2 (using KDE 3.4.2, Kubuntu Package 4:3.4.2-0ubuntu0hoary2)
Compiler: gcc version 3.3.5 (Debian 1:3.3.5-8ubuntu2)
OS: Linux (i686) release 2.6.10-5-386

In http://www.blogistan.co.uk/qt/atom.xml , <![CDATA[ ... ]]> is used to wrap the article content. The CDATA markers are part of the XML syntax and should therefore not be passed on to KHTML. At the moment they are passed to KHTML, which produces strange rendering results.
This example is not Atom 1.0 compliant. In Atom, CDATA does not seem to be valid in <content type="html">, according to http://www.atomenabled.org/developers/syndication/#text :

"If type="html", then this element contains entity escaped html.
<title type="html"> AT&amp;amp;T bought &lt;b&gt;by SBC&lt;/b&gt;! </title>"

So the feed should use escaped HTML instead of CDATA.
http://www.w3.org/TR/2004/REC-xml-20040204/#sec-cdata-sect says: "[Definition: CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup. CDATA sections begin with the string "<![CDATA[" and end with the string "]]>":]"
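To illustrate the point of the definition quoted above, here is a minimal sketch in plain C++ (no Qt) showing that a CDATA section and entity-escaped text decode to the same character data. The function decodeCharacterData is a hypothetical helper, not part of Akregator or librss, and it handles only CDATA sections and the five predefined XML entities. A real XML parser does this work, and much more.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only: turn the raw content of an XML
// element into character data. Supports CDATA sections and the five
// predefined entities; everything else is copied through unchanged.
std::string decodeCharacterData(const std::string& raw)
{
    static const struct { const char* name; char value; } entities[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
        {"&quot;", '"'}, {"&apos;", '\''}
    };
    const std::string cdataOpen = "<![CDATA[";
    const std::string cdataClose = "]]>";

    std::string out;
    std::size_t i = 0;
    while (i < raw.size()) {
        // CDATA section: copy the enclosed text verbatim, drop the markers.
        if (raw.compare(i, cdataOpen.size(), cdataOpen) == 0) {
            const std::size_t start = i + cdataOpen.size();
            const std::size_t end = raw.find(cdataClose, start);
            if (end == std::string::npos) {
                out.append(raw, i, std::string::npos); // unterminated: keep as-is
                break;
            }
            out.append(raw, start, end - start);
            i = end + cdataClose.size();
            continue;
        }
        // Entity reference: replace it with the character it stands for.
        if (raw[i] == '&') {
            bool matched = false;
            for (const auto& e : entities) {
                const std::string name = e.name;
                if (raw.compare(i, name.size(), name) == 0) {
                    out.push_back(e.value);
                    i += name.size();
                    matched = true;
                    break;
                }
            }
            if (matched)
                continue;
        }
        out.push_back(raw[i]);
        ++i;
    }
    return out;
}
```

Under these assumptions, both spellings of the same content yield identical text, so a consumer working on the parsed text should never see the CDATA markers themselves.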
*** Bug 116051 has been marked as a duplicate of this bug. ***
SVN commit 498704 by osterfeld:

fix atom:content parsing: Don't show tags when for Atom 1.0 feeds with escaped HTML in it

BUG: 112491, 117938

 M  +36 -15    tools_p.cpp

--- branches/KDE/3.5/kdepim/akregator/src/librss/tools_p.cpp #498703:498704
@@ -47,21 +47,42 @@
     QDomElement e = node.toElement();
     QString result;
 
-    if (elemName == "content" && ((e.hasAttribute("mode") && e.attribute("mode") == "xml") || !e.hasAttribute("mode")))
-        result = childNodesAsXML(node);
-    else
-        result = e.text();
-
-    bool hasPre = result.contains("<pre>",false);
-    bool hasHtml = hasPre || result.contains("<"); // FIXME: test if we have html, should be more clever -> regexp
-    if(!isInlined && !hasHtml) // perform nl2br if not a inline elt and it has no html elts
-        result = result = result.replace(QChar('\n'), "<br />");
-    if(!hasPre) // strip white spaces if no <pre>
-        result = result.simplifyWhiteSpace();
-
-    if (result.isEmpty())
-        return QString::null;
-
+    bool doHTMLCheck = true;
+
+    if (elemName == "content") // we have Atom here
+    {
+        doHTMLCheck = false;
+        // the first line is always the Atom 0.3, the second Atom 1.0
+        if (( e.hasAttribute("mode") && e.attribute("mode") == "escaped" && e.attribute("type") == "text/html" )
+            || (!e.hasAttribute("mode") && e.attribute("type") == "html"))
+        {
+            result = KCharsets::resolveEntities(e.text().simplifyWhiteSpace()); // escaped html
+        }
+        else if (( e.hasAttribute("mode") && e.attribute("mode") == "escaped" && e.attribute("type") == "text/plain" )
+            || (!e.hasAttribute("mode") && e.attribute("type") == "text"))
+        {
+            result = e.text().stripWhiteSpace(); // plain text
+        }
+        else if (( e.hasAttribute("mode") && e.attribute("mode") == "xml" )
+            || (!e.hasAttribute("mode") && e.attribute("type") == "xhtml"))
+        {
+            result = childNodesAsXML(e); // embedded XHMTL
+        }
+
+    }
+
+    if (doHTMLCheck) // check for HTML; not necessary for Atom:content
+    {
+        bool hasPre = result.contains("<pre>",false);
+        bool hasHtml = hasPre || result.contains("<"); // FIXME: test if we have html, should be more clever -> regexp
+        if(!isInlined && !hasHtml) // perform nl2br if not a inline elt and it has no html elts
+            result = result = result.replace(QChar('\n'), "<br />");
+        if(!hasPre) // strip white spaces if no <pre>
+            result = result.simplifyWhiteSpace();
+
+        if (result.isEmpty())
+            return QString::null;
+    }
     return result;
 }
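For readers who want to follow the branch logic of the commit without a Qt build, the decision it makes on the mode and type attributes can be sketched in standard C++ as below. This is only an illustration: ContentKind, Attributes, attributeOr and classifyAtomContent are hypothetical names, not librss API, and the attribute map stands in for the QDomElement. Where no branch matches, the commit leaves the result empty; the sketch reports Unknown instead.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical names throughout; this is not the librss API.
enum class ContentKind { EscapedHtml, PlainText, Xhtml, Unknown };

using Attributes = std::map<std::string, std::string>;

// Look up an attribute, returning an empty string when it is absent.
static std::string attributeOr(const Attributes& attrs, const std::string& name)
{
    const auto it = attrs.find(name);
    return it == attrs.end() ? std::string() : it->second;
}

// Decide how the text of an Atom <content> element should be interpreted.
ContentKind classifyAtomContent(const Attributes& attrs)
{
    const bool hasMode = attrs.count("mode") > 0;
    const std::string mode = attributeOr(attrs, "mode");
    const std::string type = attributeOr(attrs, "type");

    if (hasMode) {
        // Atom 0.3: the "mode" attribute is present.
        if (mode == "escaped" && type == "text/html")
            return ContentKind::EscapedHtml;
        if (mode == "escaped" && type == "text/plain")
            return ContentKind::PlainText;
        if (mode == "xml")
            return ContentKind::Xhtml;
        return ContentKind::Unknown;
    }

    // Atom 1.0: no "mode" attribute, "type" alone decides.
    if (type == "html")
        return ContentKind::EscapedHtml;
    if (type == "text")
        return ContentKind::PlainText;
    if (type == "xhtml")
        return ContentKind::Xhtml;
    return ContentKind::Unknown;
}
```

Separating the classification from the conversion keeps the Atom 0.3 and Atom 1.0 rules in one place, which is a design choice of this sketch rather than of the commit.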
This bug has been fixed only for Atom, not for RSS, so I am reopening it.
*** Bug 122857 has been marked as a duplicate of this bug. ***
Same here on Gentoo ~amd64, KDE 3.5.6. Please fix this annoying bug!
Considered fixed in 4.x. If the problem persists, please reopen with a current test feed (attach the XML file, not a link).