Summary: | UTF-8 encoding not used for XMLHttpRequest | ||
---|---|---|---|
Product: | [Applications] konqueror | Reporter: | Adam Peller <adam+kdebugs> |
Component: | khtml | Assignee: | Konqueror Developers <konq-bugs> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | maksim |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Ubuntu | ||
OS: | Linux | ||
Attachments: | Proposed patch |
Description
Adam Peller
2006-07-04 05:32:08 UTC
Created attachment 18068 [details]
Proposed patch

I confirm the same behaviour. Unfortunately, for speakers of languages with non-Latin alphabets this can be a bit of a problem, since AJAX replies are now rendered using ISO-8859-1 by default and end up garbled on screen.

UTF-8, or UTF-16 marked by a BOM, are in general the standard encodings for XML documents when no explicit encoding specification is present. Apart from that, the W3C specification of XML 1.0 (http://www.w3.org/TR/xml/#charencoding) mandates that UTF-16 encoded XML documents always be marked with a BOM, whereas UTF-8 documents may optionally have one.

IMHO it should default to UTF-8, since this is the behaviour most web applications expect. Since khtml::Decoder::decode always looks for a BOM at the beginning of the stream, setting the default encoding of XMLHttpRequest replies to UTF-8 guarantees that it will always work with UTF-8, UTF-8 w/ BOM and UTF-16 w/ BOM.

I'm not familiar with the internals of KDE, but the attached patch fixes the issue for me. Still, I'm not sure about the use of the Decoder::DefaultEncoding constant, or whether something else should be used instead.

Cheers,
Apollon

*** This bug has been confirmed by popular vote. ***

Also, please note that content other than XML may be passed over XHR. In Dojo's case, we pass JS which we eval, so putting a BOM at the top is not an option. We did something far uglier for a workaround...

"it seems like would be able to get away with:

/* <?xml version="1.0" encoding="UTF-8" ?> */

in the top of your translation files"

This appears to work as a side effect of the parser sniffing for encoding headers.

Can the patch get reviewed and approved for 3.5.7?

I've just tried Kubuntu 7.04 and this bug seems to be fixed there, but not on my Arch Linux installation.

r718830 | adawit | 2007-09-29 16:20:38 -0400 (Sat, 29 Sep 2007) | 5 lines

* Default to "UTF-8" per section 2 of the draft W3C "The XMLHttpRequest Object" specification.
  Fixes BR# 130234

BUG:130234

    Index: xmlhttprequest.cpp
    ===================================================================
    --- xmlhttprequest.cpp (revision 657077)
    +++ xmlhttprequest.cpp (revision 718830)
    @@ -674,7 +674,8 @@
           if (!encoding.isNull())
             decoder->setEncoding(encoding.latin1(), Decoder::EncodingFromHTTPHeader);
           else {
    -        // FIXME: Inherit the default encoding from the parent document?
    +        // Per section 2 of W3C working draft spec, fall back to "UTF-8".
    +        decoder->setEncoding("UTF-8", Decoder::DefaultEncoding);
           }
         }
         if (len == 0)
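
For readers who want to see the fallback order in isolation, here is a minimal standalone sketch of the decision discussed in this report: look for a BOM, otherwise use the charset from the HTTP header, otherwise fall back to UTF-8. It is not khtml code. The function name `pickEncoding` and its parameters are hypothetical, and letting a BOM win over the HTTP header is an assumption of this sketch rather than something the report confirms about khtml::Decoder.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, NOT part of khtml: returns the name of the encoding
// to use for an XMLHttpRequest response body.
// Assumed precedence: BOM first, then the HTTP charset, then UTF-8.
std::string pickEncoding(const std::string& body, const std::string& httpCharset)
{
    // Read bytes as unsigned so values above 0x7F compare correctly.
    auto byteAt = [&body](std::size_t i) {
        return static_cast<unsigned char>(body[i]);
    };

    // UTF-8 BOM: EF BB BF
    if (body.size() >= 3 && byteAt(0) == 0xEF && byteAt(1) == 0xBB && byteAt(2) == 0xBF)
        return "UTF-8";

    // UTF-16 BOMs: FE FF (big endian) or FF FE (little endian)
    if (body.size() >= 2 && byteAt(0) == 0xFE && byteAt(1) == 0xFF)
        return "UTF-16BE";
    if (body.size() >= 2 && byteAt(0) == 0xFF && byteAt(1) == 0xFE)
        return "UTF-16LE";

    // No BOM: honour the charset from the Content-Type header, if any.
    if (!httpCharset.empty())
        return httpCharset;

    // Nothing specified: fall back to UTF-8 instead of ISO-8859-1.
    return "UTF-8";
}
```

With this ordering, a reply that carries neither a BOM nor a charset (the Dojo case above, where JS is fetched and eval'ed) is treated as UTF-8, which is the behaviour the committed change introduces for that branch.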