Summary: | akregator keeps downloading old feed articles -> many duplicates | ||
---|---|---|---|
Product: | [Applications] akregator | Reporter: | Fabio Rossi <rossi.f> |
Component: | general | Assignee: | kdepim bugs <kdepim-bugs> |
Status: | RESOLVED UPSTREAM | ||
Severity: | normal | ||
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Gentoo Packages | ||
OS: | Linux | ||
Description
Fabio Rossi
2011-08-10 17:59:39 UTC
I'm afraid that's not a bug: I added this feed and let Akregator sync it for a few days. The issue comes from the website, which alters the article URLs. For example, with the article "Senior Manufacturing Engineer, Optics, Job Code: 1011" from yesterday, the URL when clicking on "complete story" was http://www.photonicsjobs.com/job//2011-08-28/570 yesterday and today became http://www.photonicsjobs.com/job//2011-08-29/570

It looks like this website changes the GUID value (GUID = Globally Unique Identifier). From Akregator's point of view, a new GUID means a new article, hence the article duplication.

For further details see http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.photonicsjobs.com%2Frss.xml and look at line ~24 (<guid>http://www.photonicsjobs.com/job//2011-08-29/570</guid>).

Only the website owner can fix this issue.

Thanks for the analysis, I'll contact the webmaster!

I have one comment. I have read that the mandatory subelements of <item> are <title>, <link> and <description>. How does Akregator detect old articles in this case (without <guid> elements)?

Do you think it's possible to implement a "delete duplicates" function in Akregator to clean up the mess in this feed? I mean a function which compares all the mandatory elements of the articles (cited above) without considering <guid>. If you think it makes sense, I can open another bug.
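To make the "delete duplicates" idea concrete, here is a minimal Python sketch of comparing items by selected child elements while ignoring <guid>. It is not how Akregator works internally. The names `SAMPLE_RSS`, `item_key` and `find_duplicates` are hypothetical, and the description text in the sample is made up; only the title and the two URLs come from this report.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-day snapshot of the same job posting. Only the date in
# <link> and <guid> differs, as described in the report. The description
# text is invented for illustration.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item>
  <title>Senior Manufacturing Engineer, Optics, Job Code: 1011</title>
  <link>http://www.photonicsjobs.com/job//2011-08-28/570</link>
  <description>Example description</description>
  <guid>http://www.photonicsjobs.com/job//2011-08-28/570</guid>
</item>
<item>
  <title>Senior Manufacturing Engineer, Optics, Job Code: 1011</title>
  <link>http://www.photonicsjobs.com/job//2011-08-29/570</link>
  <description>Example description</description>
  <guid>http://www.photonicsjobs.com/job//2011-08-29/570</guid>
</item>
</channel></rss>"""


def item_key(item, fields):
    """Build a comparison key from the chosen child elements, ignoring <guid>."""
    return tuple((item.findtext(name) or "").strip() for name in fields)


def find_duplicates(rss_text, fields=("title", "link", "description")):
    """Return the items whose key already appeared earlier in the feed."""
    seen = set()
    duplicates = []
    for item in ET.fromstring(rss_text).iter("item"):
        key = item_key(item, fields)
        if key in seen:
            duplicates.append(item)
        else:
            seen.add(key)
    return duplicates


if __name__ == "__main__":
    # Comparing title, link and description: the link differs, so no match.
    print(len(find_duplicates(SAMPLE_RSS)))
    # Comparing title and description only: the second item is a duplicate.
    print(len(find_duplicates(SAMPLE_RSS, fields=("title", "description"))))
```

Note that for this particular feed the <link> changes together with the <guid>, so a comparison that includes <link> would still miss the duplicate; in this sketch only a comparison on <title> and <description> catches it.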