123424 – identify articles by url and not by text and title

Status:	REPORTED

Alias:	None

Product:	akregator
Classification:	Applications
Component:	general
Version:	unspecified
Platform:	Gentoo Packages Linux

Importance:	NOR wishlist
Target Milestone:	---
Assignee:	kdepim bugs

URL:
Keywords:

Depends on:
Blocks:

Reported:	2006-03-11 12:03 UTC by uran238
Modified:	2021-03-09 04:12 UTC
CC List:	0 users

See Also:
Latest Commit:
Version Fixed In:

Description uran238 2006-03-11 12:03:29 UTC

Version:            (using KDE 3.5.1)
Installed from:    Gentoo Packages
OS:                Linux

Some news sources alter the text or title of an article in the feed, so akregator marks it as new. But it isn't new; most of the time just a typo was fixed. Often you noticed the typo yourself, and if not, you don't want to be informed about it ;)
So I think it would be useful to identify an article by its URL and not by its title and text. (I don't know how akregator works internally, but I'm quite sure it does so.)
The articles should otherwise be treated as usual: the updated article should be marked as read if the old one was already read, and as new if the old one was not read.
In most cases it isn't useful to keep the old article, because the text on the website has already changed.
Maybe it would be useful to mark an article as new again if the user marked the old one as important?
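
The status rule proposed above could be sketched roughly as follows. This is only an illustration in Python, not akregator code: the `Article` class, the `merge_update` function and the `renew_important` switch are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical, simplified article record (not akregator's data model).
    url: str
    title: str
    text: str
    status: str = "new"       # either "new" or "read"
    important: bool = False


def merge_update(old: Article, incoming: Article,
                 renew_important: bool = False) -> Article:
    """Replace `old` with `incoming` (same URL) and carry over the status.

    The old title and text are dropped, since the website has changed anyway.
    """
    status = old.status
    # Optional behaviour from the last point above: an article the user
    # flagged as important becomes "new" again when the feed changes it.
    if renew_important and old.important:
        status = "new"
    return Article(incoming.url, incoming.title, incoming.text,
                   status, old.important)
```

With this rule, a corrected typo in an article that was already read leaves it marked as read, unless it was flagged as important and `renew_important` is set.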

What do you think?

Comment 1 Frank Osterfeld 2006-03-11 12:32:07 UTC

> Some news sources alter the text or title of an article in the feed, so
> akregator marks it as new. But it isn't new; most of the time just a typo
> was fixed. Often you noticed the typo yourself, and if not, you don't want
> to be informed about it ;)
I agree that resetting them to "New" isn't what you want in most cases. There *might* be updates to the item, but most of the time it's just typos.

> So I think it would be useful to identify an article by its URL and not by
> its title and text. (I don't know how akregator works internally, but I'm
> quite sure it does so.)

If the article has a <guid> (RSS) or <id> (Atom), we use that. If not, we use title + content. Using the URL is not a good idea in general, as there are feeds where the link always points to the same site, or where there is no link at all. Admittedly, these cases are rare. We might evaluate heuristics like "if time stamp and link are equal, it's the same article", though.
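
The identification order described here, plus the suggested heuristic, might look roughly like the following sketch. It is Python for illustration only and does not show akregator's actual implementation: the function names, the key prefixes and the choice of SHA-1 are assumptions.

```python
import hashlib
from typing import Optional


def article_key(guid: Optional[str], title: str, content: str) -> str:
    """Return an identity key: <guid>/<id> if present, else title + content."""
    if guid:
        return "id:" + guid
    # Fallback: any edit to the title or content yields a different key,
    # which is why a fixed typo shows up as a new article.
    # The hash function is an arbitrary choice for this sketch.
    digest = hashlib.sha1((title + "\0" + content).encode("utf-8")).hexdigest()
    return "hash:" + digest


def same_by_heuristic(link_a: Optional[str], date_a: Optional[int],
                      link_b: Optional[str], date_b: Optional[int]) -> bool:
    """Heuristic: equal time stamp and equal link mean the same article.

    Articles without a link or time stamp never match, which covers
    feeds that provide no link at all.
    """
    if not link_a or date_a is None:
        return False
    return link_a == link_b and date_a == date_b
```

Note that the heuristic still treats all items as one article in feeds where every item links to the same site and carries the same time stamp, so it would need to be combined with other checks.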

Comment 2 Justin Zobel 2021-03-09 04:12:11 UTC

Thank you for the bug report.

As this report hasn't seen any changes in 5 years or more, we ask if you can please confirm that the issue still persists.

If this bug no longer persists or is no longer relevant, please change the status to resolved.