Bug 118793 - Problematic characters break feed parsing
Summary: Problematic characters break feed parsing
Status: RESOLVED FIXED
Alias: None
Product: akregator
Classification: Applications
Component: general
Version: 1.2
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-12-21 14:27 UTC by Pau Capdevila
Modified: 2006-01-31 23:07 UTC
CC List: 0 users

See Also:
Latest Commit:
Version Fixed In:


Description Pau Capdevila 2005-12-21 14:27:26 UTC
Version:           1.2 (using KDE 3.5.0, Debian Package 4:3.5.0-2 (testing/unstable))
Compiler:          Target: i486-linux-gnu
OS:                Linux (i686) release 2.6.14-2-k7

The following feed is not displayed correctly in Akregator:

http://www.racocatala.com/rss.php

The XML is well formed, but some characters in it seem to break the parsing, and only one news item is shown.

I've tested the feed with other RSS aggregators:

- Liferea breaks the same way as Akregator.
- Google Reader shows everything well except for some strange characters.
- Straw works flawlessly.

Thank you,

Kapde
Comment 1 Frank Osterfeld 2005-12-21 17:21:10 UTC
Try feedvalidator.org; this feed is not valid.

- It doesn't list the items in the <channel> section.
- It contains bad characters.
- Values in rdf:about must be unique, but there are dupes in the feed.

This is a degree of brokenness we will not support in Akregator. Go tell racocatala.com how broken their feed is.
Comment 2 Pau Capdevila 2005-12-21 18:20:11 UTC
Oh... sorry for the annoyance.

I didn't know about that validator. I only validated the feed with

http://www.w3.org/RDF/Validator/

and it passed.

Thanks for the fast response and for Akregator, and sorry again. I tried not
to post the bug hastily.

friendly,

Kapde

Comment 3 Pau Capdevila 2006-01-31 17:20:12 UTC
The feed has been fixed now. The validator says:

"This is a valid RSS feed."

-> It now works in Firefox too.

But it still has encoding problems. The validator reports:

-> Your feed appears to be encoded as "iso-8859-1", but your server is reporting "US-ASCII"

Do you still not support this degree of brokenness?

Thank you.
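
To illustrate why the charset mismatch reported above matters: in ISO-8859-1 every byte from 0x80 upward is a valid character (accented letters, for example), while US-ASCII only defines bytes below 0x80. A reader that trusts the server's "US-ASCII" label has no meaning for those bytes. The following is only a minimal sketch of decoding ISO-8859-1 into UTF-8 by hand; the function name `latin1ToUtf8` is hypothetical and this is not how Akregator or any other reader named in this report handles encodings.

```cpp
#include <string>

// Hypothetical helper (not from Akregator): converts ISO-8859-1 bytes to UTF-8.
// Each ISO-8859-1 byte maps directly to the Unicode code point of the same value.
std::string latin1ToUtf8(const std::string& latin1)
{
    std::string utf8;
    utf8.reserve(latin1.size());
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            // Plain ASCII: identical in US-ASCII, ISO-8859-1 and UTF-8.
            utf8.push_back(static_cast<char>(c));
        } else {
            // Code points 0x80..0xFF need two bytes in UTF-8.
            utf8.push_back(static_cast<char>(0xC0 | (c >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return utf8;
}
```

For instance, the ISO-8859-1 byte 0xE9 ("é") becomes the two UTF-8 bytes 0xC3 0xA9, whereas pure ASCII text passes through unchanged.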
Comment 4 Frank Osterfeld 2006-01-31 18:05:53 UTC
SVN commit 504290 by osterfeld:

RSS parser: ignore unknown or invalid version attribute value in the <rss> tag and
just assume RSS 2.0. The older formats are compatible to 2.0, so this should work.
(at least better than refusing to parse the feeds)
BUG: 118793



 M  +9 -0      ChangeLog  
 M  +2 -1      src/librss/document.cpp  


--- branches/KDE/3.5/kdepim/akregator/ChangeLog #504289:504290
@@ -2,6 +2,15 @@
 ===================
 (c) 2004-2006 the Akregator authors.
 
+Changes after 1.2.1:
+-----------------------------
+
+Bug fixes:
+
+ 2006/01/31 RSS parser: ignore unknown or invalid version attribute value in the <rss> tag and
+            just assume RSS 2.0. The older formats are compatible to 2.0, so this should work.
+            (at least better than refusing to parse the feeds) (#118793) -fo
+
 Changes after 1.2:
 -----------------------------
 
--- branches/KDE/3.5/kdepim/akregator/src/librss/document.cpp #504289:504290
@@ -110,7 +110,8 @@
             d->version = v0_93;
         else if (attr == QString::fromLatin1("0.94"))
             d->version = v0_94;
-        else if (attr.startsWith("2.0") || attr == QString::fromLatin1("2")) // http://www.breuls.org/rss puts 2.00 in version (BR #0000016)
+        else // otherwise, we just assume a RSS2 compatible feed. As rss2 is generally
+             // backward-compatible, this should work
             d->version = v2_0;
     }
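
The effect of the change above can be sketched outside of Qt as follows. This is a simplified illustration, not the actual librss code: the enum `RssVersion` and the function `versionFromAttribute` are hypothetical names, `std::string` stands in for `QString`, the "0.93" comparison is assumed by analogy with the "0.94" one visible in the diff, and only the versions that appear in the diff context are covered.

```cpp
#include <string>

// Hypothetical stand-in for the parser's version constants; only the
// values visible in the diff context are listed here.
enum class RssVersion { v0_93, v0_94, v2_0 };

// Maps the value of the version attribute of the <rss> tag to a version.
RssVersion versionFromAttribute(const std::string& attr)
{
    if (attr == "0.93")   // assumed by analogy with the "0.94" branch
        return RssVersion::v0_93;
    if (attr == "0.94")
        return RssVersion::v0_94;
    // After the fix: any other value ("2.0", "2.00", "2", empty, or
    // something invalid) is treated as RSS 2.0 instead of being rejected.
    return RssVersion::v2_0;
}
```

With this fallback, a feed that declares an unexpected version string is still parsed as RSS 2.0, which is the behavior the commit message describes.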
     
Comment 5 Pau Capdevila 2006-01-31 23:07:36 UTC
I really admire you.

I hope one day I'll be half the programmer you are.

Thank you very much indeed!

Pau
