Summary: | ajax in UTF-8 doesn't parse cyrillic encoding properly (google example) | ||
---|---|---|---|
Product: | [Applications] konqueror | Reporter: | Anton <anton.bugs> |
Component: | khtml ecma | Assignee: | Konqueror Developers <konq-bugs> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | CC: | maksim |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Gentoo Packages | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: |
Description
Anton
2007-01-13 05:22:59 UTC
I have this behaviour in Zooomr when I try to name the photos. So I can confirm this bug (ArchLinux, Slackware). But in Kubuntu it works, as far as I can remember.

First of all, you want to be using at least 3.5.6, since that fixes problems with unicode support in regular expressions (and you need to make sure your libpcre has utf-8 support --- unfortunately it's possible for its support to be missing at runtime, even ...). That's quite likely the difference. The bugtraq.ru widget works for me, so I can't confirm. Of course, if you have a testcase, that would let me be sure...

I have tried KDE 3.5.7 and the problem has gone. Igor, please try the latest version too.

I have the latest stable version, 3.5.7. BugTraq.ru and GMail work for me also, but Zooomr doesn't work properly. But after refreshing the page the text is in the proper encoding and I can read it. Here is a screenshot: http://img451.imageshack.us/img451/9082/snapshot1ez6.png

Can you describe the steps before refreshing the page as well, so we can reproduce it and create a test case, please? Do I need to have a Zooomr account? Do I need to upload a picture?

You need to register on Zooomr.com, then upload a picture, then you simply give it a name or tag (in Russian), and you will see this behaviour. Then you refresh the page and see the proper text.

OK, I managed to reproduce it. Here is the JSON response from the server:

    HTTP/1.0 200 OK
    Connection: keep-alive
    Status: 200 OK, 200 OK
    Content-Language: en
    x-zmr-token: 238256
    Vary: Accept-Language, Cookie
    server: ZAPI/0.9r3, lighttpd/1.4.18
    date: Tue, 18 Sep 2007 07:54:37 GMT, Tue, 18 Sep 2007 07:54:37 GMT
    Content-Type: text/javascript
    Content-length: 1364
    Keep-Alive: timeout=30, max=100

    {"photo": {"sizemax": 16, "description": {"_content": "тест ТЕСТ ТЕСТ"}, <the rest of the long response in unicode skipped>

As you can see, the webserver does not return an encoding spec like ";charset=UTF-8" in the Content-Type header. Maksim, shouldn't it be UTF-8 by default if not specified?..
Seems like it. Except I can't even figure out how the heck it sets it for xml by default. Encoding detection is icky.

So is it a bug in konqueror or in the server?

I was involved in fixing another ajax encoding bug and know what you mean. But I guess it's time to fix the FIXME lines :) kdelibs-3.5.7/khtml/ecma/xmlhttprequest.cpp:

    decoder = new Decoder;
    if (!encoding.isNull())
        decoder->setEncoding(encoding.latin1(), Decoder::EncodingFromHTTPHeader);
    else {
        // FIXME: Inherit the default encoding from the parent document?
    }

Not 100% sure if it's the right place. My konqueror's encoding setting is "system's default" and the system uses UTF8.

Igor: I'm not sure exactly. I think both, because the server should tell your browser what the encoding is, and the browser should use the default from settings->fonts->default encoding.

I am not sure the comment is correct, since the proposed XMLHttpRequest spec gives a whole honking algorithm for determining things, one that depends on content-type and everything. Lots of stuff in that module needs cleanup badly.. :(

hm... since my original issue has been fixed I'm marking it as "Resolved". The issue with Zooomr is a duplicate of bug http://bugs.kde.org/show_bug.cgi?id=130234. You are right about the major cleanup and I'll leave it with you. Thanks for the help ;-)

ps FYI. I reported a similar encoding bug about 2 years ago in bug http://bugs.kde.org/show_bug.cgi?id=110768 and that's why we have the extra detection code (mess) over there. |
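
For readers following the charset question above, here is a minimal standalone sketch of the fallback being discussed: read the `charset` parameter from a Content-Type value and fall back to a default when the server sends none, as Zooomr did with plain `text/javascript`. It is not khtml code and does not use the `Decoder` class. The helper names `charsetFromContentType` and `effectiveEncoding` are hypothetical, and the UTF-8 default is only the assumption raised in the thread. As noted above, the proposed XMLHttpRequest spec describes a more involved algorithm.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lower-case an ASCII string; parameter names in Content-Type are case-insensitive.
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Strip surrounding spaces, tabs and double quotes from a parameter value.
static std::string trimValue(const std::string& s) {
    const char* drop = " \t\"";
    const std::string::size_type b = s.find_first_not_of(drop);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(drop);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: returns the charset parameter of a Content-Type
// value, or an empty string when the header carries none.
std::string charsetFromContentType(const std::string& contentType) {
    const std::string lower = toLower(contentType);
    std::string::size_type pos = 0;
    while ((pos = lower.find(';', pos)) != std::string::npos) {
        ++pos;  // step past the ';'
        while (pos < lower.size() &&
               std::isspace(static_cast<unsigned char>(lower[pos]))) {
            ++pos;
        }
        if (lower.compare(pos, 8, "charset=") != 0) continue;
        pos += 8;
        const std::string::size_type end = lower.find(';', pos);
        const std::string::size_type len =
            (end == std::string::npos) ? std::string::npos : end - pos;
        return trimValue(contentType.substr(pos, len));
    }
    return "";
}

// Hypothetical helper: the header's charset if present, else the fallback.
// Using UTF-8 as the fallback is an assumption taken from the discussion.
std::string effectiveEncoding(const std::string& contentType,
                              const std::string& fallback = "UTF-8") {
    const std::string cs = charsetFromContentType(contentType);
    return cs.empty() ? fallback : cs;
}
```

With the response quoted above, `effectiveEncoding("text/javascript")` yields the fallback, whereas a header such as `text/javascript; charset=windows-1251` yields the declared charset. A real fix would also have to decide whether the fallback should come from the parent document or the user's default encoding setting, which is the open question in the FIXME.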