Bug 120760 - RSS/Atom parser does not handle namespaces correctly
Summary: RSS/Atom parser does not handle namespaces correctly
Status: RESOLVED WORKSFORME
Alias: None
Product: akregator
Classification: Applications
Component: feed parser
Version: 1.2
Platform: Fedora RPMs Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-01-25 05:19 UTC by Chris Fritz
Modified: 2008-10-26 22:48 UTC
CC List: 1 user

See Also:
Latest Commit:
Version Fixed In:


Attachments
Non-default namespace test (1.32 KB, application/atom+xml)
2006-01-25 05:23 UTC, Chris Fritz
Non-default XHTML namespace test (1.35 KB, application/atom+xml)
2006-01-25 05:25 UTC, Chris Fritz
Prefixed XHTML with unprefixed fake namespace (1.07 KB, application/atom+xml)
2006-01-25 05:30 UTC, Chris Fritz

Description Chris Fritz 2006-01-25 05:19:31 UTC
Version:           1.2 (using KDE 3.5.0)
Installed from:    Fedora RPMs
OS:                Linux

I wasn't sure whether to file these as separate bugs, but they all share the same underlying problem.

Firstly, Akregator fails to load an Atom feed whose Atom namespace is bound to a prefix instead of being declared as the default namespace.

For example, for the following feed...

<a:feed xmlns:a="http://www.w3.org/2005/Atom"
 xmlns="http://www.w3.org/1999/xhtml">
 <a:title>Feed Title</a:title>
</a:feed>

...Akregator does not recognize a:feed and a:title as being in the Atom namespace bound to the a: prefix, even though the default namespace here is XHTML.
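
For illustration only, here is a minimal Python sketch of the expected behavior, using the standard library's namespace-aware xml.etree.ElementTree. It is not Akregator code. The helper name atom_title is hypothetical. It matches elements by namespace URI, so the prefix chosen in the feed does not matter.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# The first feed from this report: Atom bound to the a: prefix,
# XHTML declared as the default namespace.
FEED = """<a:feed xmlns:a="http://www.w3.org/2005/Atom"
 xmlns="http://www.w3.org/1999/xhtml">
 <a:title>Feed Title</a:title>
</a:feed>"""


def atom_title(xml_text):
    """Return the feed title, matching by namespace URI rather than prefix.

    Hypothetical helper for illustration; returns None when the root is
    not an Atom feed element or when no Atom title child exists.
    """
    root = ET.fromstring(xml_text)
    # ElementTree exposes names as "{namespace-uri}localname".
    if root.tag != "{%s}feed" % ATOM_NS:
        return None
    title = root.find("{%s}title" % ATOM_NS)
    return title.text if title is not None else None
```

With this approach, atom_title(FEED) yields the title whether the feed writes a:feed with a prefix or feed with a default Atom namespace.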

Secondly, in this feed fragment...

<feed xmlns="http://www.w3.org/2005/Atom"
 xmlns:h="http://www.w3.org/1999/xhtml">
 <entry>
 <content type="xhtml">
 <h:div>
 <h:ul>
 <h:li>List item.</h:li>
 <h:li>List item.</h:li>
 <h:li><h:a href="http://kde.org/">Link to <h:abbr title="K Desktop Environment">KDE</h:abbr> web site.</h:a></h:li>
 </h:ul>
 </h:div>
 </content>
 </entry>
</feed>

...the elements with the h: prefix are expected to be treated as XHTML, because h: is bound to the XHTML namespace.  Instead, they are not rendered as HTML.

Thirdly, in the following feed...

<feed xmlns="http://www.w3.org/2005/Atom"
 xmlns:h="http://www.w3.org/1999/xhtml">
 <entry>
 <content type="xhtml">
 <h:div>
 <h:ul>
 <h:li>
 This IS in the XHTML namespace, and SHOULD be rendered as a list item. 
 Akregator DOES NOT render it as a list item.
 </h:li>
 </h:ul>
 <ul>
 <li>
 This IS NOT in the XHTML namespace, and SHOULD NOT be rendered as a list item.
 Akregator DOES render it as a list item.</li>
 </ul>
 </h:div>
 </content>
 </entry>
</feed>

...the first list, whose elements carry the h: prefix bound to the XHTML namespace, is wrongly not rendered as XHTML.  The second list, whose elements have no prefix, is wrongly rendered as XHTML.
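
As a rough sketch of the expected classification, the Python snippet below collects only those li elements that are in the XHTML namespace, using the standard library's xml.etree.ElementTree. It is an illustration rather than Akregator's implementation. The helpers split_tag and xhtml_list_items are hypothetical, and the feed is a shortened version of the one above. Note that in this shortened feed the unprefixed ul and li inherit the default Atom namespace declared on feed. Either way they are not XHTML.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Shortened version of the third feed from this report.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
 xmlns:h="http://www.w3.org/1999/xhtml">
 <entry>
 <content type="xhtml">
 <h:div>
 <h:ul><h:li>prefixed item</h:li></h:ul>
 <ul><li>unprefixed item</li></ul>
 </h:div>
 </content>
 </entry>
</feed>"""


def split_tag(tag):
    """Split an ElementTree tag "{uri}local" into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def xhtml_list_items(xml_text):
    """Return the text of every li element in the XHTML namespace."""
    root = ET.fromstring(xml_text)
    items = []
    for element in root.iter():
        uri, local = split_tag(element.tag)
        # Only the namespace URI decides whether this is an XHTML list item.
        if uri == XHTML_NS and local == "li":
            items.append(element.text)
    return items
```

Under these assumptions, only the h:li item is reported as an XHTML list item, which is the behavior requested in this report.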

These examples are based on the XML Namespace Conformance Tests (http://www.intertwingly.net/wiki/pie/XmlNamespaceConformanceTests?action=highlight&value=CategoryConformanceTests).
Comment 1 Chris Fritz 2006-01-25 05:23:40 UTC
Created attachment 14378
Non-default namespace test

In this testcase, the default namespace is XHTML, and a: is the Atom namespace.
Akregator does not understand these namespaces, and is unable to read any data
from the feed.

It's expected that anything in the a: namespace be understood as an Atom element.
Comment 2 Chris Fritz 2006-01-25 05:25:52 UTC
Created attachment 14379
Non-default XHTML namespace test

In this testcase, the h: namespace is used for XHTML elements. Akregator does
not render the h: namespaced elements as XHTML, instead rendering them as
unknown elements.

It's expected that anything in the defined h: namespace be understood as an
XHTML element.
Comment 3 Chris Fritz 2006-01-25 05:30:47 UTC
Created attachment 14380
Prefixed XHTML with unprefixed fake namespace

In this testcase, the h: namespace is used for XHTML elements, and children of
the h:div are given a fake namespace (a non-existent markup language). Akregator
renders all child elements of the h:div incorrectly.

It's expected that any child of h:div without a given namespace is in the fake
namespace applied to the h:div. Akregator instead renders them in the XHTML
namespace, perhaps improperly applying the XHTML namespace by default.

It's expected that any element with the h: namespace renders as XHTML, per the
related namespace declaration. Akregator does not render these elements as
XHTML.
Comment 4 Frank Osterfeld 2006-01-25 08:41:39 UTC
The current parser does not support namespaces at all (namespace processing isn't
enabled when reading the XML with the Qt XML parser), and unfortunately it can't be fixed easily without breaking the parser in other places.
I am currently working on a new parser lib with full namespace support, which will replace the current parser in KDE4.
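
To illustrate the difference such a switch makes, here is a small Python sketch using the standard library's xml.sax, which exposes a comparable namespace-processing feature. It is only an analogy and does not show the Qt XML API or Akregator's parser. The names NameCollector and element_names are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

FEED = (b'<a:feed xmlns:a="http://www.w3.org/2005/Atom">'
        b'<a:title>Feed Title</a:title></a:feed>')


class NameCollector(ContentHandler):
    """Record element names exactly as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called when namespace processing is off: name is the raw
        # qualified name, prefix included.
        self.names.append(name)

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on: name is a
        # (namespace URI, local name) pair.
        self.names.append(name)


def element_names(data, namespaces):
    """Parse data and return element names, with or without namespaces."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespaces)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.names
```

With namespace processing off, the handler sees only the literal names a:feed and a:title, so code that looks for feed or title finds nothing. With it on, each name arrives paired with the Atom namespace URI regardless of the prefix used.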
Comment 5 Frank Osterfeld 2008-10-26 22:48:20 UTC
Works now in >= 4.1.
The XHTML rendering using h: doesn't work with KHTML, but the namespace is correctly set by Akregator. So, as I see it, the remaining problems are KHTML issues.