Bug 110768

Summary: incorrect encoding using AJAX (xmlhttprequest)
Product: [Applications] konqueror Reporter: Anton <anton.bugs>
Component: khtml ecma Assignee: Konqueror Developers <konq-bugs>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Other   
Attachments: 108401.png
the bug is kde 3.4.3

Description Anton 2005-08-14 17:01:20 UTC
Version:            (using KDE 3.4.1)
Installed from:    Compiled From Sources
Compiler:          gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1) gentoo use flags: (-altivec) -bootstrap -boundschecking -build +fortran -gcj +gtk -hardened -ip28 (-multilib) -multislot (-n32) (-n64) +nls -nocxx -nopie -nossp -objc -static -vanilla
OS:                Other

Hi,

My web server serves content in windows-1251 encoding.
I'm trying to request a new HTML page using the AJAX method:
var req = new XMLHttpRequest();
The requested page (just plain text without any tags) is in windows-1251 as well, but Konqueror tries to display it as koi8-r.

It might be the same issue as on http://www.google.com/ig, where I'm trying to read Russian news (bugtraq.ru) but it is displayed as "??????" only.

I tried to specify the encoding manually everywhere, but it didn't help.
Am I doing something wrong, or is it a bug?

kde 3.4.1 compiled from sources/Linux 2.6.12-gentoo-r4

Regards,
Anton
Comment 1 Thiago Macieira 2005-08-15 01:35:11 UTC
We have recently fixed similar issues. Can you provide us with a test case, or can you test KDE 3.5.0 alpha 1?
Comment 2 Anton 2005-08-15 13:54:59 UTC
Here is the test case. Please also try adding bugtraq.ru at
http://www.google.com/ig.

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">
<script type='text/javascript'>
var myAjax;

function myprocess(){
    document.getElementById("status").innerHTML = "status: "+ myAjax.readyState;
    if (myAjax.readyState == 4) {
      if ( (myAjax.status == 304) || (myAjax.status == 200) ) {
        document.getElementById("status").innerHTML = "ResponceText: "+ myAjax.responseText;
      }else 
         alert("There was a problem retrieving the XML data:\n" + myAjax.statusText);
    }
}

window.onload = function(){

  // branch for native XMLHttpRequest object
  if(window.XMLHttpRequest) {
    try {
      myAjax = new XMLHttpRequest();
      } catch(e) {
        myAjax = false;
    }
  // branch for IE/Windows ActiveX version
  } else alert("IE sux");

  if(!myAjax) { alert("XMLHttpRequest failed."); return; }

  myAjax.onreadystatechange = myprocess;
  myAjax.open("GET","http://195.38.160.45/ajax.html",true);
  myAjax.send("");
}

</script>
</head>

<body>
Body windows-1251 text: тест<hr>
 <em><div id="status">Loading page...</div></em>

</body>
</html>
Comment 3 Thiago Macieira 2005-08-15 16:09:20 UTC
I get the following with your test case:

Body windows-1251 text: ???µ????
ResponceText: windows-1251: òåñò

If I load the webpage in Konqueror, I see:
windows-1251: ????

The header reply was:
kio_http: (2060) ============ Received Response:
kio_http: (2060) "HTTP/1.1 200 OK"
kio_http: (2060) "Server: Netscape-Enterprise/4.1"
kio_http: (2060) "Date: Mon, 15 Aug 2005 13:18:05 GMT"
kio_http: (2060) "Content-type: text/html; charset=windows-1251"
kio_http: (2060) "Etag: "a14843d9-72d0-12-4300752e""
kio_http: (2060) "Last-modified: Mon, 15 Aug 2005 10:57:50 GMT"
kio_http: (2060) "Content-length: 18"
kio_http: (2060) "Accept-ranges: bytes"

So I can confirm your bug in 3.5 r449049
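
Note on the output above: "òåñò" is what the windows-1251 bytes for "тест" look like when each byte is read as Latin-1, which is consistent with the charset from the Content-type header not reaching the decoder. The following is only a minimal sketch of that byte arithmetic, not KHTML code. The function names are hypothetical, and the windows-1251 decoder is deliberately partial (ASCII plus the basic Cyrillic range only).

```cpp
#include <cassert>
#include <string>

// Append one code point below U+0800 to a UTF-8 string.
static void appendUtf8(std::string &out, unsigned cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Latin-1: every byte maps to the code point with the same value.
std::string decodeLatin1(const std::string &bytes) {
    std::string out;
    for (unsigned char b : bytes)
        appendUtf8(out, b);
    return out;
}

// Partial windows-1251 decoder (assumption: only ASCII and bytes
// 0xC0..0xFF, which map to U+0410..U+044F, are handled).
// Any other byte is replaced with '?'.
std::string decodeCp1251Subset(const std::string &bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80)
            appendUtf8(out, b);
        else if (b >= 0xC0)
            appendUtf8(out, 0x0410 + (b - 0xC0));
        else
            out += '?';
    }
    return out;
}
```

With the four bytes F2 E5 F1 F2 as input, decodeLatin1() yields the UTF-8 for "òåñò" and decodeCp1251Subset() yields the UTF-8 for "тест". The same bytes give two different strings depending only on which charset the decoder is told to use.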
Comment 4 Anton 2005-09-18 01:09:14 UTC
Yeah,

this is in the file xmlhttprequest.cpp:

  if ( decoder == NULL ) {
    decoder = new Decoder;
    if (!encoding.isNull())
      decoder->setEncoding(encoding.latin1(), Decoder::EncodingFromHTTPHeader);
    else {
      // FIXME: Inherit the default encoding from the parent document?
    }
  }

Can somebody tell me how to make a quick fix? I just want to have one static encoding.
Comment 5 Anton 2005-09-22 17:17:46 UTC
I found the bug report where you actually fixed the same issue:
http://bugs.kde.org/show_bug.cgi?id=108400

But I believe the patch is not complete, since it doesn't work with the latest version.

You might have forgotten to replace the FIXME line with:

decoder->setEncoding(encoding, Decoder::EncodingFromHTTPHeader);

or something like that.
Please double check.
Comment 6 Dawit Alemayehu 2005-10-13 02:02:30 UTC
Hi Anton,

On Wednesday 12 October 2005 12:01, you wrote:
> I guess I'm wrong and can't apply the patch for 3.4.x version.
> The snapshot looks alright.


Yes. But your bug might be caused by something else. For example, I do not use 
the same kio_http as you or Thiago. 

> I can't try 3.5 with the patch yet. Sorry for bothering.


No problem. I suspect the bug might be elsewhere and not in the XMLHttpRequest 
implementation. I will try to locally setup the test case you provided on the 
bug report and see if it works. It would be even easier for me to test if you 
put up the whole test case online. :) That way I can test it on your setup. 
Anyways, if I click on the GET link "http://195.38.160.45/ajax.html" you 
provided in the test case, I think I get the correct rendering of the page. See 
the attached snapshot.


Created an attachment (id=12964)
108401.png
Comment 7 Anton 2005-10-14 08:19:37 UTC
Created attachment 12981
the bug is kde 3.4.3

I uploaded the test case here:
http://195.38.160.45/ajax_tc.html
and tried it with the just-released 3.4.3:
The bug is still there.
See attachment.
Comment 8 Dawit Alemayehu 2005-10-15 02:11:50 UTC
SVN commit 470753 by adawit:

- Handle HTTP response headers case insensitively. Fix for bug 110768.

BUG: 110768


 M  +10 -12    xmlhttprequest.cpp  


--- branches/KDE/3.5/kdelibs/khtml/ecma/xmlhttprequest.cpp #470752:470753
@@ -551,7 +551,6 @@
       int codeEnd = responseHeaders.find("\n", codeStart+3);
       if (codeEnd != -1)
         responseHeaders.replace(codeStart, (codeEnd-codeStart), "200 OK");
-      // qDebug("Response Header: %s", responseHeaders.latin1());
     }
 
     changeState(Loaded);
@@ -563,18 +562,17 @@
 #endif
 
   if ( decoder == NULL ) {
-     int pos = responseHeaders.find("Content-Type:");
-     if ( pos > -1 )
-     {
-        int index = responseHeaders.find('\n', pos+13);
-        QString type = responseHeaders.mid(pos+13, index);
-        // qDebug("XMLHttpRequest::slotData: 'content-type = %s'", type.latin1());
-        index = type.find (';');
-        if (index > -1)
-          encoding = type.mid( index+1 ).remove(QRegExp("charset[ ]*=[ ]*", false)).stripWhiteSpace();
-        // qDebug("XMLHttpRequest::slotData: 'encoding = %s'", encoding.latin1());
-     }
+    int pos = responseHeaders.find("content-type:", 0, false);
 
+    if ( pos > -1 ) {
+      pos += 13;
+      int index = responseHeaders.find('\n', pos);
+      QString type = responseHeaders.mid(pos, (index-pos));
+      index = type.find (';');
+      if (index > -1)
+        encoding = type.mid( index+1 ).remove(QRegExp("charset[ ]*=[ ]*", false)).stripWhiteSpace();
+    }
+
     decoder = new Decoder;
     if (!encoding.isNull())
       decoder->setEncoding(encoding.latin1(), Decoder::EncodingFromHTTPHeader);
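
Note on the change above: the server in Comment 3 sent "Content-type:" with a lowercase "t", so a case-sensitive search for "Content-Type:" finds nothing and the encoding stays unset. The old code also appears to pass the absolute position of the newline as the length argument to mid(), where the new code passes (index-pos). The following is a rough standalone sketch of the same idea using only the C++ standard library instead of QString. The helper names are hypothetical and this is not the KHTML implementation. Like the patched code, it simply takes the first match of the header name anywhere in the header block.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Lower-case an ASCII string (header names and charset labels are ASCII).
static std::string toLowerAscii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Strip leading and trailing whitespace.
static std::string trim(const std::string &s) {
    const char *ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: return the lower-cased charset named in the
// Content-Type header of a raw header block, or "" if there is none.
std::string charsetFromHeaders(const std::string &headers) {
    const std::string lower = toLowerAscii(headers);  // case-insensitive match
    const std::string name = "content-type:";
    std::string::size_type pos = lower.find(name);
    if (pos == std::string::npos)
        return "";
    pos += name.size();
    std::string::size_type end = lower.find('\n', pos);
    if (end == std::string::npos)
        end = lower.size();
    // The length is (end - pos), not the absolute position of the newline.
    const std::string value = lower.substr(pos, end - pos);
    const std::string key = "charset";
    const std::string::size_type c = value.find(key);
    if (c == std::string::npos)
        return "";
    const std::string::size_type eq = value.find('=', c + key.size());
    if (eq == std::string::npos)
        return "";
    std::string cs = value.substr(eq + 1);
    const std::string::size_type semi = cs.find(';');
    if (semi != std::string::npos)
        cs.erase(semi);  // drop any parameters after the charset
    return trim(cs);
}
```

Given the headers from Comment 3, charsetFromHeaders() returns "windows-1251", while a plain case-sensitive find("Content-Type:") on the same text returns npos.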