Version: unspecified (using KDE 4.7.0) OS: Linux I have the following feed: http://www.photonicsjobs.com/rss.xml. Akregator 4.7.0 keeps downloading the same feed every day, resulting in tons of duplicates among the feed articles. It is not clear when it downloads duplicate articles, but it happens quite often. The latest article is dated 05 Aug 2011 and I already have 8 copies of it in my feed folder (so more than one copy per day). All the copies have the same date/time. Reproducible: Always Steps to Reproduce: Subscribe to the suggested feed and monitor the behaviour. Actual Results: A lot of duplicate feed articles. Expected Results: No duplicates.
I'm afraid that's not a bug. I added this feed and let Akregator sync it for a few days. The issue comes from the website, which alters the article URLs. E.g. with the article "Senior Manufacturing Engineer, Optics, Job Code: 1011" from yesterday: the URL behind "complete story" was http://www.photonicsjobs.com/job//2011-08-28/570 yesterday and became http://www.photonicsjobs.com/job//2011-08-29/570 today. It looks like this website changes the GUID value (GUID = Globally Unique Identifier) on every update. From Akregator's point of view, a new GUID means a new article, hence the article duplication. For further details see http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.photonicsjobs.com%2Frss.xml and look at line ~24 (<guid>http://www.photonicsjobs.com/job//2011-08-29/570</guid>). Only the website owner can fix this issue.
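To illustrate the mechanism described above, here is a minimal Python sketch (not Akregator's actual code) of GUID-based de-duplication: an article is treated as new exactly when its <guid> has not been seen before, so a daily-changing GUID produces a duplicate every day even though the title never changes.

```python
# Minimal sketch (not Akregator's actual implementation) of
# GUID-based de-duplication: an article is "new" iff its <guid>
# has not been seen before.
seen_guids = set()

def is_new(article):
    guid = article["guid"]
    if guid in seen_guids:
        return False
    seen_guids.add(guid)
    return True

# The same job posting fetched on two consecutive days; only the
# date embedded in the GUID differs.
monday = {"title": "Senior Manufacturing Engineer, Optics, Job Code: 1011",
          "guid": "http://www.photonicsjobs.com/job//2011-08-28/570"}
tuesday = {"title": "Senior Manufacturing Engineer, Optics, Job Code: 1011",
           "guid": "http://www.photonicsjobs.com/job//2011-08-29/570"}

print(is_new(monday))   # True
print(is_new(tuesday))  # True -- treated as a brand-new article: duplicate
```

Both fetches come back True, which is exactly the duplication reported: from the reader's side the posting is one article, but from the GUID's side it is a fresh one each day.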
Thanks for the analysis, I'll contact the webmaster! One comment: I have read that the mandatory sub-elements of <item> are <title>, <link> and <description>. How does Akregator detect old articles in that case (i.e. without <guid> elements)? And do you think it would be possible to implement a "delete duplicates" function in Akregator to clean up the mess in this feed? I mean a function that compares all the mandatory elements cited above, without considering <guid>. If you think it makes sense I can open another bug.
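The "delete duplicates" idea suggested above could be sketched roughly as follows. This is a hypothetical illustration, not an existing Akregator feature: it builds a duplicate key from only the three mandatory <item> sub-elements and keeps the first article per key.

```python
import hashlib

def fingerprint(article):
    # Hypothetical duplicate key built only from the three mandatory
    # <item> sub-elements, deliberately ignoring <guid>.
    raw = "\x00".join((article.get("title", ""),
                       article.get("link", ""),
                       article.get("description", "")))
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()

def delete_duplicates(articles):
    # Keep the first article seen for each fingerprint, drop the rest.
    seen, kept = set(), []
    for article in articles:
        fp = fingerprint(article)
        if fp not in seen:
            seen.add(fp)
            kept.append(article)
    return kept
```

One caveat: if this website rewrites the date inside <link> the same way it rewrites <guid>, the <link> values would also differ between copies, and the comparison would have to be narrowed further (e.g. to <title> and <description> only) to catch these duplicates.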