Version: (using KDE 3.5.5)
Installed from: Gentoo Packages
Compiler: gcc version 4.1.1 (Gentoo 4.1.1-r3) CFLAGS="-O2 -march=pentium4m -pipe -msse2 -mfpmath=sse"
OS: Linux
See subject. A JavaScript alert message box showing received AJAX data should be a very simple testcase. This is different from bug 130234 because here the server gives a proper UTF-8 encoding HTML header, and the text is still not displayed properly. I experience it with the Google widgets available from the personal Google page http://www.google.com/ig (both the BugTraq.ru and GMail ones, see screenshot) http://img145.imageshack.us/img145/3959/konqgoogleiglt7.png and also with my own application. Let me know if you need a real testcase.
I see this behaviour in Zooomr when I try to name photos, so I can confirm this bug (ArchLinux, Slackware). In Kubuntu it works, as far as I can remember.
First of all, you want to be using at least 3.5.6, since that fixes problems with unicode support in regular expressions (and you need to make sure your libpcre has utf-8 support --- unfortunately it's possible for its support to be missing at runtime, even ...). That's quite likely the difference. The bugtraq.ru widget works for me, so I can't confirm. Of course, if you have a testcase, that would let me be sure...
I have tried KDE 3.5.7 and the problem is gone. Igor, please try the latest version too.
I have the latest stable version, 3.5.7. BugTraq.ru and GMail work for me too, but Zooomr doesn't work properly. After refreshing the page, though, the text is in the proper encoding and I can read it. Here is a screenshot: http://img451.imageshack.us/img451/9082/snapshot1ez6.png
Can you also describe the steps before refreshing the page, so that we can reproduce it and create a test case? Do I need to have a Zooomr account? Do I need to upload a picture?
You need to register on Zooomr.com and upload a picture, then simply give it a name or tag (in Russian), and you will see this behaviour. Then refresh the page and you will see the proper text.
OK, I managed to reproduce it. Here is the JSON response from the server:
HTTP/1.0 200 OK
Connection: keep-alive
Status: 200 OK, 200 OK
Content-Language: en
x-zmr-token: 238256
Vary: Accept-Language, Cookie
server: ZAPI/0.9r3, lighttpd/1.4.18
date: Tue, 18 Sep 2007 07:54:37 GMT, Tue, 18 Sep 2007 07:54:37 GMT
Content-Type: text/javascript
Content-length: 1364
Keep-Alive: timeout=30, max=100
{"photo": {"sizemax": 16, "description": {"_content": "тест ТЕСТ ТЕСТ"}, <the rest of the long response in unicode skipped>
As you can see, the web server does not return an encoding spec such as ";charset=UTF-8" in the Content-Type header. Maksim, shouldn't it be UTF-8 by default if not specified?
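To make the missing piece concrete, here is a minimal sketch of pulling the charset parameter out of a Content-Type value. The function name `charsetFromContentType` is hypothetical and is not KHTML code; it also ignores corner cases a real HTTP parser would handle. For the header in this comment it returns an empty string, which is the case the browser then has to guess about.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of KHTML): returns the lower-cased value of
// the charset parameter in a Content-Type header value, or an empty string
// when the header carries no charset parameter.
std::string charsetFromContentType(const std::string& contentType)
{
    // Parameter names are case-insensitive, so work on a lower-cased copy.
    std::string lower(contentType);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::string::size_type pos = lower.find(key);
    if (pos == std::string::npos)
        return std::string();
    pos += key.size();

    // The value runs up to the next ';' or to the end of the header value.
    const std::string::size_type end = lower.find(';', pos);
    const std::string value =
        lower.substr(pos, end == std::string::npos ? std::string::npos : end - pos);

    // Strip surrounding whitespace and optional quotes.
    const char* strip = " \t\"'";
    const std::string::size_type first = value.find_first_not_of(strip);
    if (first == std::string::npos)
        return std::string();
    const std::string::size_type last = value.find_last_not_of(strip);
    return value.substr(first, last - first + 1);
}
```

With the response above, `charsetFromContentType("text/javascript")` yields an empty string, while `"text/javascript; charset=UTF-8"` would yield `"utf-8"`.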
Seems like it. Except I can't even figure out how the heck it sets it for xml by default. Encoding detection is icky.
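For illustration only: the garbled text in the screenshots is what one would expect if UTF-8 bytes were decoded with a single-byte fallback encoding. The thread does not say which encoding Konqueror actually fell back to; the sketch below assumes ISO-8859-1, and `decodeAsLatin1` is a hypothetical helper.

```cpp
#include <string>

// Hypothetical helper: treats every input byte as one ISO-8859-1 character
// and returns the resulting text re-encoded as UTF-8.
std::string decodeAsLatin1(const std::string& bytes)
{
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII is identical in both encodings.
            out += static_cast<char>(b);
        } else {
            // Code points 0x80..0xFF take two bytes in UTF-8.
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
    return out;
}
```

The Cyrillic letter "т" is the two bytes D1 82 in UTF-8. Decoded byte by byte under this assumption, it becomes two unrelated characters instead of one letter, which is why ASCII text survives and Russian text does not.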
So is it a bug in Konqueror or in the server?
I was involved in fixing another AJAX encoding bug and know what you mean. But I guess it's time to fix the FIXME lines :)
kdelibs-3.5.7/khtml/ecma/xmlhttprequest.cpp:
    decoder = new Decoder;
    if (!encoding.isNull())
        decoder->setEncoding(encoding.latin1(), Decoder::EncodingFromHTTPHeader);
    else {
        // FIXME: Inherit the default encoding from the parent document?
    }
Not 100% sure if it's the right place. My Konqueror's encoding setting is "system's default" and the system uses UTF-8.
Igor: I'm not sure exactly. I think both, because the server should tell your browser what the encoding is, and the browser should use the default from settings->fonts->default encoding.
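One possible way to fill in that FIXME is a simple fallback chain: use the charset from the HTTP header if present, otherwise the parent document's encoding, otherwise UTF-8. The sketch below only shows that policy in isolation. `EncodingChoice` and `chooseResponseEncoding` are hypothetical names, not KHTML API, and this ordering is an assumption rather than what any spec or the KHTML authors settled on.

```cpp
#include <string>

// Where the chosen encoding came from (hypothetical, for illustration).
enum class EncodingSource { HttpHeader, ParentDocument, Default };

struct EncodingChoice {
    std::string name;
    EncodingSource source;
};

// Hypothetical policy sketch: prefer the charset announced by the server,
// then the encoding of the document that issued the request, then UTF-8.
// Empty strings mean "not specified".
EncodingChoice chooseResponseEncoding(const std::string& headerCharset,
                                      const std::string& parentDocumentEncoding)
{
    if (!headerCharset.empty())
        return {headerCharset, EncodingSource::HttpHeader};
    if (!parentDocumentEncoding.empty())
        return {parentDocumentEncoding, EncodingSource::ParentDocument};
    return {"utf-8", EncodingSource::Default};
}
```

For the Zooomr response, which has no charset in its header, this would pick whatever the parent page uses, and UTF-8 only if that is unknown too.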
I am not sure the comment is correct, since the proposed XMLHttpRequest spec gives a whole honking algorithm for determining things, that's dependent on content-type and everything. Lots of stuff in that module needs cleanup badly.. :(
Hm... since my original issue has been fixed, I'm marking it as "Resolved". The issue with Zooomr duplicates bug http://bugs.kde.org/show_bug.cgi?id=130234. You are right about the major cleanup, and I'll leave it with you. Thanks for the help ;-) PS, FYI: I reported a similar encoding bug about 2 years ago in bug http://bugs.kde.org/show_bug.cgi?id=110768, and that's why we have the extra detection code (mess) over there.