Bug 243289 - Akregator does not handle accented letters in urls
Summary: Akregator does not handle accented letters in urls
Status: RESOLVED NOT A BUG
Alias: None
Product: akregator
Classification: Applications
Component: internal browser
Version: 1.6.5
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
Reported: 2010-06-30 22:32 UTC by tnemeth
Modified: 2010-11-26 20:51 UTC

Description tnemeth 2010-06-30 22:32:05 UTC
Version:           1.6.5 (using Devel) 
OS:                Linux

When trying to display an article in a separate tab, Akregator cannot display
web pages that have accented letters in their URLs. Here is the error message
displayed (translated into English):

---8<--- CUT HERE ---8<---
Request details :
URL : http://www.libelyon.fr/info/2010/06/le-pr�sident-de-la-cci-de-lyon-en-garde�vue.html
Protocol : http
Date and time : wednesday 30 june 2010 22:05
Addon informations : www.libelyon.fr
Description :
The given file or folder /info/2010/06/le-pr�sident-de-la-cci-de-lyon-en-garde�vue.html does not exist.
---8<--- CUT HERE ---8<---

   The URL should be : http://www.libelyon.fr/info/2010/06/le-président-de-la-cci-de-lyon-en-gardeàvue.html
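
A minimal Python sketch of one encoding mismatch that would produce this symptom. It is an assumption, not a confirmed diagnosis: if the path arrives as Latin-1 bytes but is decoded as UTF-8, each accented letter turns into U+FFFD (displayed as "�"), as in the error message quoted in this report. The last lines show the UTF-8 percent-encoded form of the expected path for comparison.

```python
from urllib.parse import quote

# Expected path, taken from the report.
path = "/info/2010/06/le-président-de-la-cci-de-lyon-en-gardeàvue.html"

# Assumption: the accented letters arrive as single Latin-1 bytes ...
latin1_bytes = path.encode("latin-1")
# ... and are then decoded as UTF-8, where a lone 0xE9 or 0xE0 byte is invalid
# and gets replaced by U+FFFD.
garbled = latin1_bytes.decode("utf-8", errors="replace")
print(garbled)

# For comparison: UTF-8 percent-encoding of the same path.
encoded = quote(path)
print(encoded)
```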


Reproducible: Always

Steps to Reproduce:
01. Have a news feed that offers web pages with accented letters in their URLs (e.g. at www.liberation.fr)
10. Click on the "full article" link so that it is opened in a tab
11. See the error message

Actual Results:  
An error message is displayed in the tab instead of the web page. The error message says that the URL could not be found.

Expected Results:  
The web page should be displayed as in konqueror.

OS: Linux (x86_64) release 2.6.32-23-generic
Compiler: cc
Comment 1 Christophe Marin 2010-06-30 23:11:49 UTC
Can't reproduce, the URL is encoded as expected.
Comment 2 tnemeth 2010-07-01 08:07:18 UTC
(In reply to comment #1)
> Can't reproduce, the URL is encoded as expected.

How can you say that?
One can clearly see that the URL displayed in the error message does not use
the correct charset. Maybe you're not using the 4.4.5 Akregator version? Here,
even with Konqueror, if I click on the URL given by Akregator, the page cannot
be displayed. But the correct URL can be.
Comment 3 Frank Osterfeld 2010-07-04 21:14:45 UTC
Can't reproduce either, using 4.5 branch. If there was a bug, it would be most probably in Qt Xml.
If something like this happens, usually the encoding in the feed is wrong. Which feed do you use? http://www.libelyon.fr/info/rss.xml ? If yes, please retry if it still happens (for new items). If no, please tell us the feed you use.
Comment 4 tnemeth 2010-07-05 08:02:34 UTC
(In reply to comment #3)
> Can't reproduce either, using 4.5 branch. If there was a bug, it would be
> most probably in Qt Xml.

    Ok :( It's a pity that I'm the only one seeing this... And it didn't
    happen just once.


> If something like this happens, usually the encoding in the feed is wrong.

    Unfortunately I did not check the XML file :(


> Which feed do you use? http://www.libelyon.fr/info/rss.xml ? If yes, please
> retry if it still happens (for new items). If no, please tell us the feed you
> use.

    I'm using <URL: http://www.liberation.fr/rss/la-une,9>. I'll try to have
    a look at the xml file next time I see this problem.

Thanks.
Comment 5 Frank Osterfeld 2010-07-05 18:35:02 UTC
For which links in the current feed does it not work for you? Does it happen for all links with accented letters, or only for some?
I tried with your feed, but I don't see any accents in the links there. E.g.,

http://rss.feedsportal.com/c/32268/f/438243/s/bb21f2d/l/0L0Sliberation0Bfr0Cterre0C0A10A16452770Ede0Enouveaux0Esystemes0Ed0Ealarme0Epour0Emieux0Edetecter0Eles0Eintemperies/story01.htm
contains no accents, although e.g. "systemes" and "detecter" are normally written with accents.
And the link works.
Comment 6 tnemeth 2010-07-06 09:48:36 UTC
(In reply to comment #5)
> For which links in the current feed does it not work for you? Does it happen
> for all links with accented letters, or only for some?

    Only for some.
    For example, the one you gave worked perfectly but this one:

    http://rss.feedsportal.com/c/32268/f/438243/s/ba42ba9/l/0Lalters0Bblogs0Bliberation0Bfr0Cistanbul0C20A10A0C0A70Cla0Eculture0Eturque0Eest0E0Je0A0Evendre0Bhtml/story01.htm

    doesn't work from within akregator. Clicking on the "Full article" link
    to open it in an external browser does not work either: the url is
    malformed.
Comment 7 Frank Osterfeld 2010-07-06 18:58:54 UTC
The feed file doesn't contain any special characters in the URLs; they just contain those 0E, 0B, etc. escape sequences.
It's not proper percent-encoding either, as the "%" characters are missing. So I think this is some server-side mechanism that apparently doesn't work for some URLs. I tested the link in "Un sous-officier français tué en Afghanistan": it doesn't work in Akregator, Liferea, or Google Reader. So I consider this a bug in the server or the feed-generating software.
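
A rough Python sketch of how the escape sequences in the two feedsportal links quoted in this thread might be read. The mapping is hypothetical: it is inferred only from those two links and is not taken from any documentation, and the reading of 0J as "%" in particular is an assumption. Under that reading, the failing link would carry a Latin-1 percent-escape (%e0) where UTF-8 would be expected.

```python
from urllib.parse import unquote

# Hypothetical mapping, inferred from the links in this report (an assumption).
ESCAPES = {
    "0A": "0", "0B": ".", "0C": "/", "0E": "-",
    "0J": "%", "0L": "http://", "0S": "www.",
}

def decode_segment(segment):
    """Expand the two-character escapes in a feedsportal path segment."""
    out = []
    i = 0
    while i < len(segment):
        pair = segment[i:i + 2]
        if pair in ESCAPES:
            out.append(ESCAPES[pair])
            i += 2
        else:
            out.append(segment[i])
            i += 1
    return "".join(out)

# Segment from the link reported as failing in comment 6.
failing = ("0Lalters0Bblogs0Bliberation0Bfr0Cistanbul0C20A10A0C0A70C"
           "la0Eculture0Eturque0Eest0E0Je0A0Evendre0Bhtml")
decoded = decode_segment(failing)
print(decoded)

# The same escape decodes differently depending on the assumed charset.
as_latin1 = unquote(decoded, encoding="latin-1")
as_utf8 = unquote(decoded, encoding="utf-8", errors="replace")
print(as_latin1)
print(as_utf8)
```

If this reading is right, the difference between working and failing links would come down to whether the original article URL contained a non-ASCII character, which fits the "only for some" observation in comment 6.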
Comment 8 Michal Wazgird 2010-11-26 20:51:10 UTC
I can reproduce this bug with this RSS: http://www.optyczne.pl/rss.xml

Each link has "nowo&#347;&#263;" in its URL in the XML file. Firefox correctly resolves it to "nowość", but Akregator tries to open "nowo??".

Checked in Akregator version 1.6.5
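
A small Python sketch of what the numeric character references in that feed resolve to, and of one way a "nowo??" result could arise. The Latin-1 step is an assumption about the cause, not something confirmed in this report: ś and ć have no Latin-1 representation, so a lossy conversion replaces each with "?".

```python
from html import unescape
from urllib.parse import quote

raw = "nowo&#347;&#263;"

# Resolve the numeric character references (U+015B and U+0107).
word = unescape(raw)
print(word)

# Assumption: a lossy conversion to Latin-1 somewhere along the way.
lossy = word.encode("latin-1", errors="replace").decode("latin-1")
print(lossy)

# UTF-8 percent-encoding of the resolved word, for comparison.
encoded = quote(word)
print(encoded)
```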