Version: (using KDE 4.4.1)
Installed from: Debian testing/unstable Packages

This bug has been copied over from http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=471930 and has been verified to still exist in KDE SC 4.4.1.

----

When you open a page whose URL contains characters that must be urlencoded, Konqueror will let you enter the URL properly encoded (with % escapes, etc.) and visit the page correctly. However, it will decode the URL and display the decoded URL in the address bar (e.g. + will be changed to space, %2B will be changed to +, etc.)

This often causes the "URL" in the address bar to not actually be a valid URL. For example, if you select the address bar and press return, you will receive an error, or at the very least not go to the same page whose URL you originally entered.

The decoded URL is also saved in the history. If you use the up and down arrow keys in the address bar to select a previously visited page, you will not go there, because the URL that has been saved for it is not the right URL, but the urldecoded version of it.

This behavior annoys me. I can see the point of wanting to display the address in decoded form for some users in some situations (e.g. when using Konqueror as a file manager, which I never do, by the way). However, I would, at the very least, want to be able to turn off this functionality, so that the URLs I enter will not be mangled.

Steps to reproduce:
- Visit any website with a URL that contains characters that need escaping. For example: http://slashdot.org/~RAMMS%2BEIN/
- Konqueror will correctly open the page, but mangle the URL. E.g. http://slashdot.org/~RAMMS+EIN/
- If you try to open the same page again, e.g. by selecting the address bar and pressing return, you will not go to the same page you originally visited.
- If you visit another page, then select the address bar and use the up arrow to navigate back to the original page, then press return to select it, you will get the mangled URL and you will not visit the page whose URL you originally entered.
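The round trip described in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `display_form` is a hypothetical stand-in for whatever decoding the address bar performs, not Konqueror's actual code.

```python
from urllib.parse import unquote, urlsplit


def display_form(url: str) -> str:
    """Percent-decode a URL for display.

    Hypothetical helper: it only mimics the decoding described in the
    report and is not taken from Konqueror or KDE libraries.
    """
    return unquote(url)


entered = "http://slashdot.org/~RAMMS%2BEIN/"
shown = display_form(entered)

# The path that would be requested differs between the two forms,
# so re-submitting the displayed text asks the server for another path.
entered_path = urlsplit(entered).path
shown_path = urlsplit(shown).path
paths_match = entered_path == shown_path
```

Whether a server treats the two paths as the same resource is up to the server; the sketch only shows that the strings sent on the wire would differ.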
We follow the RFC to the letter on this point. Slashdot is broken. This would need to be a common bug in web servers before we start violating the RFC.

RFC 3986:

6.2.2.2. Percent-Encoding Normalization

The percent-encoding mechanism (Section 2.1) is a frequent source of variance among otherwise identical URIs. In addition to the case normalization issue noted above, some URI producers percent-encode octets that do not require percent-encoding, resulting in URIs that are equivalent to their non-encoded counterparts. These URIs should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character, as described in Section 2.3.
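For reference, the normalization rule quoted above can be sketched roughly as follows in Python. This is a minimal illustration, not code from KDE or Qt; the function name is made up, and uppercasing the hex digits of escapes that stay encoded follows the case normalization in RFC 3986 section 6.2.2.1.

```python
import re
import string

# Unreserved characters per RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

_PCT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_percent_encoding(uri: str) -> str:
    """Decode percent-escapes of unreserved characters only.

    Hypothetical helper name. Escapes for any other octet are kept,
    with their hex digits uppercased.
    """
    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()

    return _PCT_ESCAPE.sub(replace, uri)


# "~" and "A" are unreserved, so their escapes are decoded.
tilde_example = normalize_percent_encoding("/%7Euser/%41bc")
# "+" is a sub-delim rather than an unreserved character,
# so this particular rule leaves its escape in place.
plus_example = normalize_percent_encoding("/a%2bb")
```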
I confirm, we send a different HTTP GET when typing %2B or + in the location bar, because KUrl/QUrl keeps it as is. On the other hand I can't say if that's a bug or not. (Surely '+' in a path is not ambiguous, '+' has a special meaning only in queries)

Thiago: should QUrl encode '+' in paths?

This bug has 3 possible outcomes, I don't know enough to decide:
1) QUrl::setEncodedUrl(TolerantMode) should encode '+' in paths
2) KUrl::prettyUrl shouldn't make '+' pretty
3) slashdot is indeed broken

I made a local patch (+unittest) for 2), but it breaks the prettiness somewhat.
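To make outcome 2) more concrete, here is a rough Python sketch of a display function that leaves escapes of reserved characters alone. It is purely hypothetical: it is not the local patch mentioned above and not KUrl's code, and the exact set of characters to keep encoded is an assumption.

```python
import re

# Reserved characters per RFC 3986 section 2.2 (gen-delims + sub-delims).
RESERVED = set(":/?#[]@!$&'()*+,;=")

_PCT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def pretty_url(url: str) -> str:
    """Decode escapes for display, but keep reserved ones encoded.

    Hypothetical helper. Escapes for reserved characters, '%', space,
    control characters and non-ASCII octets are left untouched, so the
    displayed text still refers to the same resource.
    """
    def replace(match: re.Match) -> str:
        code = int(match.group(1), 16)
        char = chr(code)
        if char in RESERVED or char == "%" or not 0x20 < code < 0x7F:
            return match.group(0)
        return char

    return _PCT_ESCAPE.sub(replace, url)


kept = pretty_url("http://slashdot.org/~RAMMS%2BEIN/")
decoded = pretty_url("http://example.org/%7Euser/")
```

The cost is the one noted above: a URL such as the Slashdot one stays in its escaped, less pretty form.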
From qurl.cpp, which follows RFC 3986:

    #define ABNF_sub_delims "!$&'()*+,;="
    #define ABNF_pchar ABNF_sub_delims ":@"
    static const char pathExcludeChars[] = ABNF_pchar "/";

So + doesn't have to be encoded in path components. In other words, treating %2B differently from + in path components is a bug in the server.

Note that the slash is special. %2F is not the same as /.

Finally, in the query, from the URI spec's point of view, %2B and + *are* the same. Any difference between them there comes from the HTML FORM convention, which is not part of the RFC.

URLs should internally always keep their encoded forms. QUrl already does that, which is proven by the fact that the original report says you can visit those pages.

Since I disagree with the reporter's assertion that the URL displayed is not valid, I would close this bug as INVALID.
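The two points about path components can be sketched in Python as follows. This is an illustration only: the character sets are copied from the RFC 3986 grammar, while the helper names are invented here and do not exist in Qt.

```python
import string
from urllib.parse import unquote

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"  (RFC 3986 section 3.3)
SUB_DELIMS = "!$&'()*+,;="
UNRESERVED = string.ascii_letters + string.digits + "-._~"
PATH_LITERALS = set(UNRESERVED + SUB_DELIMS + ":@/")


def needs_encoding_in_path(char: str) -> bool:
    """Return True if char cannot appear literally in a path (hypothetical helper)."""
    return char not in PATH_LITERALS


def split_then_decode(path: str) -> list:
    """Split on literal slashes first, so %2F stays inside its segment."""
    return [unquote(segment) for segment in path.split("/")]


def decode_then_split(path: str) -> list:
    """Decode first: %2F becomes a separator and the structure changes."""
    return unquote(path).split("/")


plus_needs_encoding = needs_encoding_in_path("+")
space_needs_encoding = needs_encoding_in_path(" ")
segments_kept = split_then_decode("/a%2Fb/c")
segments_changed = decode_then_split("/a%2Fb/c")
```

The second pair of functions shows why the slash is special: decoding before splitting turns one segment into two.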
Okay, thank you for having a look at this.