Bug 79065 - External CSS style-sheets default to wrong charset
Summary: External CSS style-sheets default to wrong charset
Status: RESOLVED FIXED
Alias: None
Product: konqueror
Classification: Applications
Component: khtml parsing
Version: unspecified
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Konqueror Developers
URL:
Keywords:
Duplicates: 100993
Depends on:
Blocks:
 
Reported: 2004-04-04 21:20 UTC by Thiago Macieira
Modified: 2005-03-22 00:15 UTC

Attachments
Test page, HTML 4.01 Transitional, UTF-8 (397 bytes, text/html)
2004-04-04 21:21 UTC, Thiago Macieira
Test CSS stylesheet, UTF-8 encoded (66 bytes, text/css)
2004-04-04 21:22 UTC, Thiago Macieira
Attempt at fixing the problem (2.91 KB, patch)
2004-04-05 00:01 UTC, Thiago Macieira
Second attempt at fixing (3.00 KB, patch)
2004-04-14 05:52 UTC, Thiago Macieira

Description Thiago Macieira 2004-04-04 21:20:45 UTC
Version:           3.2.0 (using KDE 3.2.90 (CVS >= 20040117), compiled sources)
Compiler:          gcc version 3.3.3
OS:          Linux (i686) release 2.6.3

When a web page (HTML or XHTML) references an external style sheet through a <LINK> element, the charset of the loaded file is set incorrectly: it defaults to ISO-8859-1 (Latin 1), even when metadata from the server specifies a different encoding.

The attached test page (valid HTML 4.01 Transitional) demonstrates this error. When the external stylesheet is loaded like this:
    <link rel="StyleSheet" type="text/css" href="test.css">

the text appears garbled in Konqueror, with the UTF-8 bytes shown as Latin 1 characters:
	This should appear Â« quoted Â». And this is a test of UTF-8: â¬. 

Changing the load line to the following:
    <link rel="StyleSheet" type="text/css" charset="utf-8" href="test.css">

Causes the text to appear as it should (and as it does in Mozilla):
	This should appear « quoted ». And this is a test of UTF-8: €. 

Note: my locale is UTF-8, so all files are supposed to be loaded as UTF-8 (as the web page showing the Euro symbol demonstrates). Also, when retrieving the stylesheet from a server, the debug output shows:
kio_http: (918400) "Content-Type: text/css; charset=utf-8"
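
The symptom described above is the usual result of decoding UTF-8 bytes as ISO-8859-1: every byte of a multi-byte sequence becomes a character of its own. A minimal standalone sketch in plain C++ reproduces the effect. It does not use Qt or khtml, and the function name latin1BytesToUtf8 is made up for this illustration:

```cpp
#include <string>

// Illustration only, not khtml code.
// Treat every input byte as one ISO-8859-1 character (U+0000..U+00FF) and
// re-encode the resulting characters as UTF-8, so the mis-decoded text can
// be shown on a UTF-8 terminal.
std::string latin1BytesToUtf8(const std::string& bytes)
{
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII is identical in Latin 1 and UTF-8.
            out += static_cast<char>(b);
        } else {
            // U+0080..U+00FF needs two bytes in UTF-8: 110000xx 10xxxxxx.
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
    return out;
}
```

For example, the two UTF-8 bytes of « (0xC2 0xAB) come back as the four bytes 0xC3 0x82 0xC2 0xAB, which a UTF-8 terminal displays as Â«.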
Comment 1 Thiago Macieira 2004-04-04 21:21:32 UTC
Created attachment 5531 [details]
Test page, HTML 4.01 Transitional, UTF-8
Comment 2 Thiago Macieira 2004-04-04 21:22:11 UTC
Created attachment 5532 [details]
Test CSS stylesheet, UTF-8 encoded
Comment 3 Thiago Macieira 2004-04-04 22:25:44 UTC
The functions at fault are:
	CachedObject::codecForBuffer (khtml/misc/loader.cpp)
	DocLoader::requestStyleSheet (same)

Nowhere does misc/loader.cpp try to get the charset from the KIO metadata.
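
To make the missing step concrete, here is a rough standalone sketch of pulling the charset out of a Content-Type value like the one kio_http logged in the description. It is plain C++ and does not show how KIO exposes its metadata. The helper charsetFromContentType is hypothetical, and the parsing is deliberately simplistic:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of khtml or KIO.
// Extracts the charset parameter from a Content-Type value such as
// "text/css; charset=utf-8". Returns the charset in lower case, or an
// empty string when no charset parameter is present.
// Simplification: the first occurrence of "charset=" is taken as the parameter.
std::string charsetFromContentType(const std::string& contentType)
{
    // Parameter names are case-insensitive, so work on a lower-cased copy.
    std::string lower(contentType.size(), '\0');
    std::transform(contentType.begin(), contentType.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::size_t pos = lower.find(key);
    if (pos == std::string::npos)
        return std::string();
    pos += key.size();

    // The value runs up to the next ';' or the end of the string.
    const std::size_t end = lower.find(';', pos);
    std::string value = lower.substr(
        pos, end == std::string::npos ? std::string::npos : end - pos);

    // Drop surrounding whitespace and optional double quotes.
    auto isJunk = [](unsigned char c) { return std::isspace(c) != 0 || c == '"'; };
    value.erase(std::remove_if(value.begin(), value.end(), isJunk), value.end());
    return value;
}
```

With the header from the description, charsetFromContentType("text/css; charset=utf-8") gives "utf-8"; a bare "text/css" gives an empty string, which is the case where a fallback is needed.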
Comment 4 Thiago Macieira 2004-04-05 00:01:35 UTC
Created attachment 5535 [details]
Attempt at fixing the problem

The attached patch fixes the problem for me, for both remote and local files.
It does the following:

- moves the m_charset member from khtml::CachedCSSStyleSheet and
khtml::CachedScript into khtml::CachedObject. It is not used, of course, for
images (khtml::CachedImage).

- in khtml::Loader::slotFinished, queries the metadata from the job before
calling r->object->data. For local files, it uses the charset from
QTextCodec::codecForLocale.
Comment 5 Thiago Macieira 2004-04-14 05:52:00 UTC
Created attachment 5632 [details]
Second attempt at fixing

The previous patch made the server charset parameter override the user's. This
one inverts that logic.
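
The intended order of precedence could be sketched as follows. This is only an illustration in plain C++, not the patch itself. It assumes "the user's" means a charset requested on the client side (for instance the charset attribute on <link>), and it adds the locale fallback mentioned in comment 4. The function and parameter names are invented:

```cpp
#include <string>

// Hypothetical helper, not taken from the patch.
// Chooses the charset for an external resource:
//   1. a charset explicitly requested on the client side, if any;
//   2. otherwise the charset announced by the server, if any;
//   3. otherwise the locale default.
std::string chooseCharset(const std::string& requested,
                          const std::string& fromServer,
                          const std::string& localeDefault)
{
    if (!requested.empty())
        return requested;
    if (!fromServer.empty())
        return fromServer;
    return localeDefault;
}
```

The first attempt would correspond to swapping the first two checks, so that a server-supplied charset always won.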
Comment 6 Thiago Macieira 2005-03-06 22:14:30 UTC
*** Bug 100993 has been marked as a duplicate of this bug. ***
Comment 7 Allan Sandfeld 2005-03-22 00:15:04 UTC
CVS commit by carewolf: 

Make charset in <link> actually mean something. Patch is a simplified version 
of one by Thiago Macieira
BUG: 79065


  M +4 -0      ChangeLog   1.408
  M +2 -2      misc/loader.cpp   1.181


--- kdelibs/khtml/ChangeLog  #1.407:1.408
@@ -1,2 +1,6 @@
+2005-03-22  Allan Sandfeld Jensen <kde@carewolf.com>
+
+        * misc/loader.cpp: Do not override existing charset with an empty one.
+
 2005-03-21  Allan Sandfeld Jensen <kde@carewolf.com>
 

--- kdelibs/khtml/misc/loader.cpp  #1.180:1.181
@@ -968,5 +968,5 @@ CachedCSSStyleSheet *DocLoader::requestS
 
     CachedCSSStyleSheet* s = Cache::requestObject<CachedCSSStyleSheet, CachedObject::CSSStyleSheet>( this, fullURL, accept );
-    if ( s ) {
+    if ( s && !charset.isEmpty() ) {
         s->setCharset( charset );
     }
@@ -981,5 +981,5 @@ CachedScript *DocLoader::requestScript( 
 
     CachedScript* s = Cache::requestObject<CachedScript, CachedObject::Script>( this, fullURL, 0 );
-    if ( s )
+    if ( s && !charset.isEmpty() )
         s->setCharset( charset );
     return s;
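
The effect of the added !charset.isEmpty() check can be shown with a small stand-in. This is a sketch in plain C++: CachedResource and applyRequestedCharset are invented names, not the khtml classes, and std::string::empty() plays the role of QString::isEmpty():

```cpp
#include <string>

// Stand-in for a cached stylesheet or script; not the real khtml class.
struct CachedResource {
    std::string charset;
    void setCharset(const std::string& c) { charset = c; }
};

// Mirrors the guarded call in the diff above: a request that carries no
// charset leaves the value already stored on the cached object untouched.
void applyRequestedCharset(CachedResource* r, const std::string& charset)
{
    if (r && !charset.empty())
        r->setCharset(charset);
}
```

Without the guard, a later request for the same cached object that carries no charset would reset a previously stored "utf-8" to an empty string.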