Bug 39185 - Lack of per-script font selection makes reading pages in script not covered by the main font difficult.
Status: CONFIRMED
Alias: None
Product: systemsettings
Classification: Applications
Component: kcm_fonts
Version: unspecified
Platform: Compiled Sources Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: rik
URL:
Keywords:
Duplicates: 14496 56338 62804 79814 108201 314805
Depends on:
Blocks:
 
Reported: 2002-03-10 17:33 UTC by Neil Stevens
Modified: 2021-05-30 06:42 UTC (History)
CC List: 15 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Comparing Konqi and Mozilla (133.28 KB, image/png)
2004-04-25 16:07 UTC, jeff pitman
Not only konqi, but this is a system-wide problem. (28.89 KB, image/png)
2004-05-03 02:12 UTC, jeff pitman
Konqueror 3.3.1 display Chinese page (87.96 KB, image/png)
2005-02-10 17:39 UTC, Nan Zou
Screen shot Google-HK (89.31 KB, image/png)
2005-02-10 21:16 UTC, James Richard Tyrer

Description Neil Stevens 2002-03-10 17:27:53 UTC
(*** This bug was imported into bugs.kde.org ***)

Package:           konqueror
Version:           KDE 2.9.2 CVS/CVSup/Snapshot
Severity:          normal
Installed from:    Compiled sources
Compiler:          gcc 2.95.3
OS:                Other
OS/Compiler notes: Not Specified

I run my desktop in English with iso8859-1 character set.  However I often have the need to view pages in other languages in other character sets.

In KDE 2 I could configure what fonts were used for those others which is nice as different fonts scale differently and are differently legible.

In KDE 3 this setting seems to be completely gone and often I'm left totally unable to view say euc-jp encoded pages.  Sometimes Konqueror seems to be picking some other random font that supports the relevant codec but sometimes it doesn't.

What's going on?

(Submitted via bugs.kde.org)
Comment 1 shu 2002-04-24 00:42:24 UTC
The ability to choose different fonts for different character sets seems
to have been removed from KDE 3.x

This is a major detriment to the i18n effort for KDE and if anything
regresses instead of progresses.

Simple example:
Japanese stock fonts in X are not exactly good and they are not
TrueType which makes them worse. A user decides to view
http://www.apple.co.jp konqueror decides to choose those stock fonts
for it making it not exactly to the user's liking. The user finds out
he cannot choose Arial Unicode MS or MS Gothic or similar Japanese
TrueType fonts for his charset such as Shift JIS.

It is *logical* for the user to be able to choose what fonts he wants
and how he wants them.

-- 
"c'est la sel fantasie ici pour toujours"
Shu-yu Guo <shu@rufuran.org>
Comment 2 Neil Stevens 2002-06-01 00:28:09 UTC
So is this a "won't fix?" or what?
-- 
Neil Stevens - neil@qualityassistant.com
"I always cheer up immensely if an attack is particularly wounding
because I think well if they attack one personally it means they
have not a single political argument left." - Margaret Thatcher
Comment 3 Lars Siebold 2002-06-28 12:20:03 UTC
This feature was really helpful, and without it many international pages
simply are not viewable, or are viewable only by adjusting settings all the time.
At least some limited replacement for the feature would be really helpful,
or at least better picking of the right font for an encoding.
Comment 4 Dirk Mueller 2002-09-25 04:55:20 UTC
Qt should pick the right font automatically. Are there cases where this
doesn't work?
Comment 5 Neil Stevens 2002-09-25 05:03:22 UTC
It never worked for me, for Japanese pages.  To read them, I have to go and  
manually change my Konqueror font!  
  
And as Shu points out, what is the "right" font?  A font that's fine for Latin
letters might be inferior for Kanji.  So even if Qt did do anything, there's
no way it could know which fonts are more legible under which circumstances, so
the functionality lost from KDE 2 is still missed.
Comment 6 Chris Carlen 2002-11-27 06:41:17 UTC
Here are some clips from my posts about this subject.  I think there is a
serious case for adding back in to KDE/Qt3 the capability for the user to
manually select encodings for fonts.

My problem is that Thai file names and Thai text files can be typed in Thai, but
when displayed (filenames) in Konqueror, show as ?????????? and when read back
from disk (text files) show as ?????????????
------------------------------------------------------------------------
In responding to someone saying that Qt3 handles this font encoding business
automagically, I said:

I don't see how it can be automatic since 8-bit encodings are ambiguous as to
which character set the upper group should be mapped to.  Web pages and certain
text-based media forms might specify this in a special field, but what about
something like a filename in the filesystem?

(I have since learned about the KDE_UTF8_FILENAMES=1, which I haven't tried yet,
but I don't think this will solve the text file problem, nor does it help the
general issue of a reduction of user control over his/her computing experience.)

That is the problem we're having right now, is that Thai filenames in Konqueror
are showing as some other language.  We have chosen Tahoma font for all KDE
fonts except fixed, and for the Konqueror browser font.  But Tahoma contains
several glyph sets from other languages besides Thai.  So how does Qt know which
one we want?  There is nothing in the 8-bit information in the filename to give
it a clue which encoding to use.
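
The ambiguity described above can be illustrated with a minimal Python sketch. The sample word and the two codecs are assumptions chosen for illustration; the report only says that the filenames were Thai 8-bit text.

```python
# The same 8-bit bytes turn into different text depending on which
# encoding the reader assumes. Nothing in the bytes says which is right.
thai_word = "ไทย"  # sample text (assumption, not taken from the report)

raw = thai_word.encode("iso8859_11")  # one byte per character, all in the upper half
as_thai = raw.decode("iso8859_11")    # the reader guesses Thai: correct text
as_latin = raw.decode("iso8859_1")    # the reader guesses Latin-1: valid but wrong text

print(raw)
print(as_thai)
print(as_latin)
```

Both decodes succeed without any error, which is why a program cannot detect the wrong guess on its own.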

In KDE 2, we told it to use Thai encoding for Tahoma font, which made the choice
clear. 
-------------------------------------------------------------------------
I set my font to Tahoma for all KDE fonts except fixed width, and in Konqueror.
 Next I "create new text file" in Konqueror.  When the dialog appears to type in
the file name, I switch to Thai keyboard.  Ok so far.  I type in a filename. 
The characters are Thai!  (That is good.)

Now I hit enter to make the file, and konqueror shows the filename as
"???????????????"  What's even stranger is I can right click it and select
"Properties" and the question marks appear in the Properties dialog.

At this point I can type Thai Ok in most input fields within KDE.

Now here's another interesting observation that I think confirms what I am
suspecting to be the problem here:

I created a text file in kwrite.  I set the Tahoma font for kwrite.  I can type
in Thai!  Then I saved the file.  Then I opened the file in kwrite.  What
happens next is predictable:

The characters show up all as "?????????????????????????"

Qt has no way of knowing what encoding to use for this text.  Thus the removal
of the ability for the user to select the encoding to use for a font, and to
remove a nice GUI selection box for this, is quite a silly thing to have been
done.  It is Windows like.  "We will decide what you need, we know better than
you..." 
-----------------------------------------------------------------------
I have come to the understanding at this point that there isn't really a bug,
there is just a problem of the developers thinking they could make a program
smart, and didn't realize that this would impede the user from doing something
that is completely logical once you realize that someone would want to do it,
but you might not have imagined that it would affect anyone when you decided to
make the software smart.

I would have hoped that we had learned enough from Windows about what happens
with smart software.

But anyway, as I see it now:

Selecting the Thai keyboard causes KDE/Qt3 to use the Thai encoding (I suppose
it is using unicode internally, which is Ok, since our favorite font Tahoma
works with that).  So I can type Thai fine in KDE dialogs and applications.

But a filename or a text file written in Thai, only displays as ??????????
because Qt has no way of knowing which character set to use for 8-bit text, or
the 8-bit characters of a filename.  Thus it displays ???????????.  This problem
was neatly avoided in KDE 2 by simply setting the encoding for the font.  If we
chose iso8859-11 then Thai characters would appear for the upper group of 8-bit
text data.  Choosing another encoding would display another character set.

There must be some way to tell KDE what character set to use to display 8-bit
text correctly.  For media like email and HTML, this is no problem as there are
special headers or fields that can specify the character set.  But for a plain
text file and for filenames in the filesystem, there must be a way for the user
to control the encoding selection.

Even still there are times when one wants to override the character set choices
made in a web browser or email client, and tell the application to use an
encoding selected by the user.  Basically, reducing the amount of manual choices
available to a user is *always* a bad thing.  It seems like we are headed in
that direction though.  If automagic features are being added to KDE, they
should be included as a new choice in the manual choices list.  The auto can be
made default.  But to take away the ability to manually override, is just un-Linux! 
Comment 7 James Richard Tyrer 2002-11-27 13:26:34 UTC
Part of Chris' problem appears to be that something is broken.  I tried saving
files with other than iso8859-1 (specifically iso8859-11) and the files contain
only "?" and " ", but this needs to be another bug which I will open.
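
One way a file can end up containing only "?" and " " is a lossy conversion into an encoding that has no slot for the characters. The following Python sketch shows that mechanism only; it is not a diagnosis of the actual bug, and the sample text is an assumption.

```python
text = "ไทย ไทย"  # sample Thai text with one space (assumption)

# ISO 8859-1 cannot represent Thai letters; with errors="replace" each
# unrepresentable character is written as "?", while the space survives.
lossy = text.encode("iso8859_1", errors="replace")
print(lossy)

# ISO 8859-11 has a slot for every Thai letter, so nothing is lost.
kept = text.encode("iso8859_11")
print(kept.decode("iso8859_11") == text)
```

Once the "?" bytes are on disk, no choice of font or encoding at load time can bring the original text back.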

The Apple Japan page displays perfectly for me.  Both Kana and Kanji
characters are anti-aliased (AA) and scalable.  But which font am I using?  I installed both
Arial Unicode and CyberBit so I have two Unicode fonts available.  One is serif
and one sans-serif so I presume that the choice is made on that basis.  But what
if I had more Unicode fonts but still chose a font that had many encodings?  I
am using Arial MT for my default font so there is no problem.  Qt is probably
choosing Arial Unicode for the Kanji characters.  But in other combinations it
might not look very good.

I suggest that although it would be good to have everything in Unicode (if it
worked), we are not there yet.  And even if we were, users might prefer to be able to choose
different fonts for different encodings.  Automatic is a good idea as a default,
but it should also be possible to override the default and choose different
fonts for different character encodings if the user wishes to do so.

Second: MAJOR PROBLEM, what are you supposed to do if somebody sends you an
8-bit file in an encoding different from the default for the language you are
using?  How are you supposed to open it?  There is no way to set the encoding
in the editors or in their parts that appear in Konqueror.

And if the editors support 16 bit characters, then there needs to be somewhere
to turn it on and off.

Product changed to KDELibs since this is a problem that affects the entire desktop.
Comment 8 James Richard Tyrer 2002-11-27 13:28:58 UTC
I'm just guessing that this is a KLocale problem.
Comment 9 Neil Stevens 2002-11-27 16:26:12 UTC
Subject: Re: Why was this feature removed?

On Wednesday November 27, 2002 01:11, Lars Knoll wrote:
> The report has some valid points, but the wrong solutions in mind.
> Character encodings as tis-620 or 8859-1 should die in the user
> interface where not needed (ie for everything else than reading and
> writing data streams). What we're really talking about is scripts. Eg.
> specifying a default font for the Japanese script or the Thai script.
> The user should not have to care about the fact that the font is encoded
> in tis-620 or iso10646-1.

Fine, make it configurable by script instead of by encoding.  That's 
perfectly fine by me.  In fact, that's likely an improvement over the KDE 
2 way.

> Back to the font problem: People _do_ however know about the difference
> between Thai and Latin as scripts, and would like to be able to select a
> set of default/fallback fonts for a certain script.

Why be so limiting?  In KDE 2 khtml, I could pick *every* font for *every* 
language.  The same ought to be done for KDE 3.  (And, in fact, for Chris 
Carlen, we should consider doing it for all KDE fonts, but I think that 
would additionally require some Qt support in the text-displaying widgets, 
so I won't count on it).

> This is a problem of Qt and it's known to us. We are currently trying to
> find a fix for these issues. Unfortunately X11s broken font system makes
> it not exactly easy to do so, especially on old commercial Xservers.

No, the bug I reported is discussing the removal of a khtml feature.  That 
is not a Qt matter.  You, as a Trolltech developer, may see possibilities 
for improvement in Qt in this area, and that's great.  But khtml managed 
to get this working in KDE 2, so it ought to be able to get it right in 
KDE 3 sometime.

Comment 10 James Richard Tyrer 2002-12-03 14:41:23 UTC
I'm closing this because the original title is not a valid problem.

I have experimented with Kwrite/Kate and found that it mostly works with
"alternate character sets" in 3.1 Release but there are specific minor problems
for which I will open a bug for Kate.

One suggestion: There should be some i18n help somewhere as this is not a simple
thing to understand.
Comment 11 Neil Stevens 2002-12-04 00:08:28 UTC
Subject: Re:  choosing alternate character set fonts doesn't work

Good for you guys, to take my advice and ignore the problem. Should save 
you a lot of hassle.

Have fun!

Comment 12 James Richard Tyrer 2002-12-05 01:11:37 UTC
Bug reopened at the request of: Andy Fawcett.

However, since I am unable to replicate the original report with the 3.1 release,
it will be closed again soon.

Specifically, http://www.apple.co.jp renders perfectly if you have a
scalable Japanese (Kanji) font installed.  Such fonts now come with XFree86 --
the 'Luxi' fonts.

This is NOT a Konqueror problem since KDELibs calls Qt to select the fonts.  And,
I understand that you CAN configure Qt to use different fonts as you want.
There isn't a configuration widget for this and perhaps that is a valid wish list item.

Would you please supply the URLs of 2 pages where you have the problem, and I
will look into it further.
Comment 13 Neil Stevens 2002-12-05 01:16:51 UTC
Subject: Re:  choosing alternate character set fonts doesn't work

On Wednesday December 04, 2002 04:11, James Richard Tyrer wrote:
> This is NOT a Konqueror problem since KDELibs calls Qt to select the
> fonts.  And, I understand that you CAN configure Qt to use different
> fonts as you want. There isn't a configure widget for this and perhaps
> that is a valid wish list item.

It is a feature that was deliberately removed in favor of the Qt 
mechanisms.  That decision cannot be blamed on Qt.  Lars explained the 
motivation.  Please go read it, instead of denying everything.

Comment 14 Stephan Kulow 2003-03-10 20:46:13 UTC
TT (Trolltech) is reported to be working on a solution for it.
Comment 15 Ken Deeter 2003-11-07 11:18:05 UTC
bug 56338 and bug 62804 are dups of this one.

There is more discussion on 62804 as to why we need to go back to a font dialog as before.

In short, the Qt mechanism is just not enough. One problem is that it relies on Unicode. In Unicode, Han character unification has resulted in the same code points for characters that *should* look different, depending on whether you are looking at a Chinese page or a Japanese page. The fonts for the different languages have different glyphs for the *same* code point, so there is not much hope of picking the right font automatically, because it might be a user preference.
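
A small Python sketch of the unification point: the same ideograph, taken from three legacy encodings, ends up at one Unicode code point, so the code point alone no longer says which regional glyph shape was intended. The character chosen here is an assumption for illustration (it is commonly cited as one whose printed shape differs between Chinese and Japanese fonts).

```python
ch = "直"  # sample ideograph (assumption, not taken from the report)

codecs_by_script = [
    ("Japanese", "shift_jis"),
    ("Simplified Chinese", "gb2312"),
    ("Traditional Chinese", "big5"),
]

decoded = []
for script, codec in codecs_by_script:
    raw = ch.encode(codec)          # bytes as they would appear in a legacy page
    cp = ord(raw.decode(codec))     # code point after conversion to Unicode
    decoded.append(cp)
    print(f"{script:20} {codec:10} bytes={raw.hex()} -> U+{cp:04X}")

# All three sources collapse to a single code point.
same_code_point = len(set(decoded)) == 1
print(same_code_point)
```

The legacy encoding (or a language tag) carries the information about which script the page is in; once only code points remain, the renderer has to get that information from somewhere else, such as a per-script font setting.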

Also, Qt's font replacement mechanism is fundamentally flawed in that it only checks ranges of characters to be replaced instead of checking each character individually. This means that if you select a Japanese font (and set up a Chinese font as its replacement) and then view a Chinese page, the Japanese font claims to provide some characters in the Han range, so when a character is in the Han range but NOT provided by the Japanese font, the Chinese font is still ignored.

Even if Qt did do replacement by character, the output would still be flawed, because some of the characters would come from the Japanese font and some from the Chinese one. Although they might logically be the same character, in the simple case it's like mixing Verdana with Arial mid-word, and in the worst case, since in Chinese/Japanese/Korean the same character can be written differently, it can be like having random Webdings mixed into your Western text.
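
The two behaviours described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not Qt's actual code: the font names `jp` and `zh`, their coverage sets, and both functions are hypothetical.

```python
HAN_RANGE = range(0x4E00, 0xA000)  # CJK Unified Ideographs block

# Hypothetical coverage: the Japanese font lacks the simplified character.
fonts = {
    "jp": {ord("中")},
    "zh": {ord("中"), ord("们")},
}
order = ["jp", "zh"]  # selected font first, then its replacement

def pick_by_range(text, fonts, order):
    """Range-based: the first font claiming ANY Han glyph owns the whole range."""
    owner = next(
        (name for name in order if any(cp in HAN_RANGE for cp in fonts[name])),
        None,
    )
    return [(ch, owner if owner and ord(ch) in fonts[owner] else None) for ch in text]

def pick_by_char(text, fonts, order):
    """Per-character: each character takes the first font that has its glyph."""
    return [(ch, next((n for n in order if ord(ch) in fonts[n]), None)) for ch in text]

print(pick_by_range("中们", fonts, order))  # second character has no font: an empty box
print(pick_by_char("中们", fonts, order))   # no box, but two fonts mixed in one word
```

In this model neither strategy gives the result a reader of a Chinese page would want, which is `zh` for both characters; that needs a preference keyed on the script or language of the page.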

You could claim that it's Qt's problem, but as long as Qt is based on Unicode, fundamental problems will exist, and so they should be handled at the KDE level. Standardizing on Unicode is great, but it is very important to remember that Unicode DOESN'T SOLVE ALL THE I18N PROBLEMS.

I would say KDE really needs its own font picking scheme. The default kde control center dialog should let you specify per-language font preferences, and applications should be given an API to query which fonts a user prefers for a particular language. This would provide a framework to fix the problem once and for all, but adding back a dialog similar to mozilla to the konqueror settings dialog is a necessary start.
Comment 16 Maksim Orlovich 2003-11-07 15:22:22 UTC
*** Bug 62804 has been marked as a duplicate of this bug. ***
Comment 17 Maksim Orlovich 2003-11-07 15:25:10 UTC
*** Bug 56338 has been marked as a duplicate of this bug. ***
Comment 18 Maksim Orlovich 2003-11-07 15:41:33 UTC
Let me add another case in which the automatic substitution fails:

I use an English locale, with Vera Sans as my default font. I also have MS fonts installed. As I am also a Russian speaker, I occasionally read Russian-language pages. On them, as Vera does not have Cyrillic coverage, Qt tries to find a similar font to substitute. But since a lot of fonts have basically the same properties, it just chooses whichever one is first in the alphabet. So common but special-purpose (i.e. hard to read) fonts like Arial Black or Arial Narrow are often selected automatically, in place of something more readable like Arial Unicode. Worse, there are some fonts for Asian scripts that have Cyrillic coverage, are numeric in name -- so they're chosen before even Arial variants -- and are unhinted and utterly unreadable. IMHO, until we have only fonts that provide glyph coverage solely for areas where they have high-quality glyphs, and that make their 'decorative' quality obvious to Qt's font-matching code, we need per-script configuration, as automatic font-matching simply doesn't have the conditions to work acceptably.
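
A rough Python model of the behaviour described above, with a per-script override added. The alphabetical tie-break is taken from the description in this comment, not from Qt's source; the font list (including the numerically named font), `script_of` and `choose_font` are hypothetical.

```python
import unicodedata

# Hypothetical installed fonts that all claim Cyrillic coverage.
installed = ["Arial Unicode MS", "Arial Black", "Arial Narrow", "2Asian Sample"]

def script_of(text):
    """Crude script guess: first word of the Unicode name of the first letter."""
    for ch in text:
        if ch.isalpha():
            return unicodedata.name(ch).split()[0]
    return "UNKNOWN"

def choose_font(text, candidates, per_script):
    """Use the user's per-script choice if there is one, else the first name alphabetically."""
    preferred = per_script.get(script_of(text))
    if preferred in candidates:
        return preferred
    return sorted(candidates)[0]

print(choose_font("Привет", installed, {}))
print(choose_font("Привет", installed, {"CYRILLIC": "Arial Unicode MS"}))
```

With no preference set, the numerically named font wins because digits sort before letters; with a one-line per-script entry the readable font is used instead.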

Comment 19 Maksim Orlovich 2003-11-07 15:44:23 UTC
Re-titling to be more accurate
Comment 20 Ken Deeter 2003-11-07 22:17:05 UTC
Maksim, have you tried manually specifying a replacement font using qtconfig? Your comment #18 seems like a slightly different problem. You can make Qt prefer certain fonts over others. What I was talking about is situations where even if you do set up a preference scheme, it still won't work.
Comment 21 Nicolas Goutte 2004-04-02 00:29:25 UTC
You can change/influence the font substitutions in qtconfig.

Have a nice day!
Comment 22 Nan Zou 2004-04-02 02:23:40 UTC
As previous comments have indicated, Qt's font substitution is not sufficient in many cases.  I have run into this problem many times browsing Chinese language web sites.  Some parts of the page are rendered correctly using the substituted simplified Chinese font, but other parts are rendered using a traditional Chinese font that is missing glyphs for many simplified Chinese code points, resulting in a page filled with "holes", ki d of l ke t is.

This is the problem with having Qt pick the "right" font based on the Unicode character ranges instead of giving the user control over the appropriate font for a given script.

Another problem arises when editing a Chinese-encoded text file (or any other text file with characters in the UniHan range) in kate: font substitution only happens for certain characters, I assume those that the default kate text font (Vera Sans Mono in my case) does not have a glyph for.  For example, the Unicode opening quote character (U+201C) is rendered using the default font, while the UniHan characters are rendered using the MS SimSun font.  However, the glyph in Vera Sans Mono is half width while it is a full-width character in SimSun. This results in a "jagged" display of lines, whereas if all characters were rendered using the SimSun font the lines would line up perfectly.
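
The width mismatch for U+201C is consistent with how Unicode classifies it. A short Python check, using only the standard `unicodedata` module, prints the East Asian Width property of the quote next to an ideograph and a plain Latin letter (the ideograph and the letter are sample characters chosen for illustration).

```python
import unicodedata

samples = ["\u201c", "中", "A"]

widths = {}
for ch in samples:
    widths[ch] = unicodedata.east_asian_width(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {widths[ch]}")
```

The quote is reported as "A" (ambiguous): it is drawn narrow in Western fonts and wide in East Asian ones, so which width appears depends entirely on which font supplies the glyph. The ideograph is "W" (wide) and the Latin letter is "Na" (narrow).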

I very much agree with the suggestion of Ken Deeter in comment #15: give the user the ability to define font preferences for different scripts, and have the application choose the right font based on the script used.  Auto-detection needs to be improved as well.
Comment 23 James Richard Tyrer 2004-04-02 07:28:59 UTC
IIUC, what is needed is something similar to the current release of Mozilla:

Edit -> Preferences -> Appearance -> Fonts

This is a more user-friendly system than the one KDE used to have for Konqueror, which was based on 8-bit character encoding codes.  It bases the font selection on names for encodings.  I would suggest that we simply borrow the Mozilla widget.  The underlying code might have to be slightly different because Qt/KDE uses Unicode.  But the same concept of names for Unicode pages (or groups of pages in the case of ideograms) applies to Unicode.

Font substitution does occur in Qt and there is no way to control it when its occurrence is dependent on which Unicode page the glyph is on.

It appears that this needs to be for all of KDE, not just Konqueror.  As explained in comment #22, this feature is needed not because Unicode is the problem but rather to correct problems with fonts.  If all fonts had all glyphs, there would be no problem.  But, till that happens, we need this feature.

--
Comment 24 Ken Deeter 2004-04-02 09:26:35 UTC
It's actually not the 'names of the encodings' but rather 'scripts', as some languages (such as Japanese) can have multiple encodings (SJIS, EUC-JP, iso-2022-jp). Also, a Unicode font can't really have 'all' glyphs IIUC. This is because of the "Han unification", which maps traditional Chinese/simplified Chinese/Japanese kanji all to the same code points, making it impossible for one font to provide several or all variations.

As far as I can tell, there are several ways to solve this. I agree with the previous comment that the solution needs to be provided for all KDE applications, as this problem is not specific to konqueror or khtml. One solution is to solve it at the Qt level. IMHO Qt needs to use fontconfig/Xft to the fullest, similar to the way that gtk/pango does. IIRC xft/fontconfig has a way to specify the language in a font request, so that it can choose appropriate substitute fonts automatically.

In an app like epiphany, for example, this means that you can set your fonts for chinese and japanese both to "verdana", and fontconfig will choose a Chinese font and a Japanese substitute font appropriately, based on the page you were looking at.

The problem with fontconfig is that specifying which fonts you want it to use for replacement is difficult, because you have to write the XML fonts.conf style file in your home directory.. a solution that isn't so great for end users.

The ideal combination would be for Qt to use fontconfig's replacement scheme instead of its own and for KDE to provide a UI so that users can specify Language->font mapping preferences, perhaps one that could generate the correct fontconfig XML.
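
As a rough illustration of the "generate the fontconfig XML" idea, the Python sketch below turns a language-to-font mapping into `<match>` rules. It is a sketch under assumptions, not a tested configuration: the function name `build_fonts_conf` is hypothetical, "Example Mincho" is a made-up family, the output omits the XML declaration and DOCTYPE a real fonts.conf carries, and whether a given fontconfig version accepts `compare="contains"` on `lang` should be checked against its documentation.

```python
import xml.etree.ElementTree as ET

def build_fonts_conf(prefs):
    """Build a fontconfig fragment that prepends a family for each language tag."""
    root = ET.Element("fontconfig")
    for lang, family in prefs.items():
        match = ET.SubElement(root, "match", target="pattern")
        # Apply the rule only to font requests that carry this language.
        test = ET.SubElement(match, "test", name="lang", compare="contains")
        ET.SubElement(test, "string").text = lang
        # Put the user's chosen family at the front of the family list.
        edit = ET.SubElement(match, "edit", name="family", mode="prepend", binding="strong")
        ET.SubElement(edit, "string").text = family
    return ET.tostring(root, encoding="unicode")

prefs = {
    "zh-tw": "AR PL KaitiM Big5",  # family name mentioned elsewhere in this report
    "ja": "Example Mincho",        # made-up family name
}
xml_text = build_fonts_conf(prefs)
print(xml_text)
```

A settings dialog would only need to maintain the `prefs` mapping; the application side would still have to pass the page or document language along with each font request for the rules to apply.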

Another approach would be for KDE to provide its own font look-up mechanism, but I fear that without proper support from Qt, in the end, it won't work.
Comment 25 James Richard Tyrer 2004-04-02 09:41:07 UTC
Re: comment #24:
<<
its actually not the 'names of the encodings' but rather 'scripts', as some languages (such as Japanese) can have multiple encodings (SJIS, EUC-JP, iso-2022-jp)
>>
Yes, you are correct (except for Chinese).  All that I meant was that it uses names rather than the numbers, but yes, the Mozilla widget does group them by script so that (except for Chinese) you can use it to choose a font independent of which encoding the font uses for that script.  This is much better than having to use the actual encoding numbers, and much simpler for scripts that have multiple encodings, like Japanese and Cyrillic.  Chinese remains a problem as stated.

--
JRT
Comment 26 jeff pitman 2004-04-02 10:09:01 UTC
Except Chinese?

Try Traditional and Simplified chinese character sets with a slew of encodings with the most popular as Big5 (Traditional) and GB18030 (Simplified).

I have installed KDE 3.2.1 on Redhat 9 with XFree86 4.3.0-2.90.55, which provides improved support for this, but there are remaining problems. The main font replacement mechanism is actually fontconfig, although what has been mentioned about pango/gtk is very important, because the same website "http://tw.yahoo.com/" looks starkly different in Mozilla and Konqueror.  Mozilla renders it quite well, whereas Konq renders a pile of melting traditional characters on the screen.

I've cleaned this up quite a bit by modifying fontconfig (/etc/fonts/fonts.conf) to alias the taipei, mingliu and PMingLiU fonts to AR PL KaitiM Big5. These fonts appear to cause QT quite the heartburn in rendering madness.  There remain additional problems that I have tracked down to this:

* QT or KDE (probably QT) cannot properly render fonts that have been specified in CSS with the name of the font in the locale dialect.  In other words, the font name isn't just English text anymore, but actually a double-byte Unicode name (i.e. Chinese).

In summary, what I personally want is a Linux system that displays in English, but where I can surf to a Chinese website and *input* *in* *chinese* without changing LANG, LC_ALL, and a bunch of other insanity.  Preferably, I hope KDE/QT gets something together to iron this out.
Comment 27 James Richard Tyrer 2004-04-03 01:13:56 UTC
This depends on which fonts you choose.

If you use Arial Unicode as the default font in both Konqueror and Mozilla - for Traditional Chinese (Taiwan) -- the page will render well in both browsers.

I say this to emphasize that the problem is configuration.  There is no way to use these Chinese only fonts in Konqueror, but they work correctly with Mozilla.

I also noticed another problem.  Although I have Arial Unicode installed, if I choose Arial MT (which I also have installed) as my "Standard font", then it misses some of the glyphs -- I get boxes.  This appears to be an additional bug -- why doesn't Qt substitute Arial Unicode for Arial MT?

--
JRT  
Comment 28 Ken Deeter 2004-04-03 01:33:16 UTC
I don't have the Arial Unicode font, but does it contain different glyphs for simplified/traditional/japanese versions of han characters?

When you say 
>>
If you use Arial Unicode as the default font in both Konqueror and Mozilla - for Traditional Chinese (Taiwan) -- the page will render well in both browsers. 
<<

Do Japanese/Simplified Chinese pages look ok? There may be two problems that you are combining.

The first is that Qt's font substitution scheme is not great. It tends to produce empty boxes even when the combination of fonts you have specified does cover all the characters you want to display.

Another is that there is no way to specify per-script font settings (which is what this bug is really about). The problem is different from "My font substitution setup doesn't work" -- rather, it means: I want my Japanese page to show up with a Japanese font using Japanese glyphs for the Han character range, and I want my Simplified Chinese page to show up with a font containing Simplified Chinese glyphs for the Han range.

Just to be clear.. from comment #23
>>
If all fonts had all glyphs, there would be no problem. But, till that happens, we need this feature. 
<<
My understanding is that this is actually not correct. Because of Han unification, IIUC, a font cannot possibly provide all variations of glyphs (Traditional/Simplified/Japanese) for the Han range. It is up to the application to pick the right font to display the Han range.
 

As for comment #26, there is actually an effort to incorporate multi-input method support into Qt. The patch is already done, but we are trying to get Qt to use it. For some reason they don't seem very interested at this point, although it is a MUST HAVE for multilingual usage scenarios. Gtk2 on the other hand has taken on this problem in the form of the immodule and Pango and it's producing pretty reasonable results (IMHO).
Comment 29 Nan Zou 2004-04-03 06:25:55 UTC
Actually Arial Unicode MS (as well as Bitstream Cyberbit) will work as a general substitute font.  According to this page http://www.alanwood.net/unicode/fonts.html#arialunicodems it has 51180 glyphs supporting all major scripts defined in the Unicode standard, which means it will render almost all Web pages with no missing glyphs if the font substitution happens correctly (but as James indicated this may not always be the case).

However, this doesn't take away from the fact that a Mozilla-style per-script font selection is still preferable.  One reason is the problem that Ken describes of rendering the same character differently for each script (Simplified Chinese, Traditional Chinese, Japanese kanji and Korean hanja).  Unicode may have unified the character into a single code point, but it says nothing about what the glyph should look like in each script.  That's up to the particular font.  In general the same character should look roughly the same in each script, but there are subtle differences that warrant giving the user control over choosing the right font.

I'm also very interested in the multi-input method support in Qt.  I absolutely agree with Ken that this is needed to make KDE a truly multi-lingual environment.  I feel we already have pretty solid foundations in place with Qt based on Unicode and Linux using UTF-8 filenames.  What I want is something similar to the Windows 2000/XP approach, a little applet next to the system tray that lets you select a language and automatically switch to that keyboard layout/input method.  This should be all transparent to the current application.

Now I have to manually set the LANG/LC_ALL/XMODIFIERS environment variable before launching an application, not ideal at all.
Comment 30 jeff pitman 2004-04-03 07:39:00 UTC
Considering it appears that many of us are in consensus that something needs to be done about our little quandary, is there a more appropriate forum to redirect the conversation?  This particular bug has been opened for over two years in hopes that 3.0.x, 3.1.x, and 3.2.1 with the associated upgrades in QT would alleviate our concerns.  However, little boxes continue to appear and in some instances little dots where the Unicode rendering in QT is unable to make proper translation of characters.  

Heavily configuring qtconfig and fontconfig to patch up these problems seems a little absurd and does not work 100%.  The disturbing thing is that gtk/pango and mozilla do work, tweaking fontconfig or not.  I know that the comparison is difficult for free software people to swallow; I am a free software developer on several projects myself.  The fact of the matter is that there is a problem, that it might not be Konqueror, and that in order to make continued in-roads with KDE into Asia (China, Hong Kong, Taiwan, Japan, Korea), a KDE installation needs to come working out of the box instead of requiring users to spend countless hours poking and prodding at random fonts and configurations.

I have many case examples here in Taiwan where the adoption rate of Linux on the desktop has slowed because of this very problem.  In fact, it's caused many random Linux distro splinters to come off only to die within six to twelve months.  I fear the so-called China/Japan combined distro effort will not see that much success either.

Can we go beyond this bug report and pass the information elsewhere?
Comment 31 Ken Deeter 2004-04-03 10:12:22 UTC
I think ideally we would get Trolltech involved. A lot of this stuff needs to be (and is most conveniently) handled at the toolkit level.

I wonder whether it is a high priority with them. A few months ago when there was some news about what would be in Qt4, there was no mention of better i18n stuff, though I have seen some improvement with things like QMultilineEdit in the recent Qt versions.

It's easy to ask TT for help, but if there's not much incentive for them (as in the current feature set is enough for their customers) then it's hard to expect them to address the problem fully, although maybe some pressure from KDE could help a little.
Comment 32 jeff pitman 2004-04-25 15:55:58 UTC
Some glyphs from standard Chinese character sets are mishandled by Qt unicode and/or font translation facilities:
根據中文字的筆畫分為五大類, 分別是"橫", "豎", "撇", "點", "折"

There should be no dots in the above sentence.  Some of the characters in quotations are showing up as dots. In addition, some characters are misrendered, such as the 7th character from the left, where one of the Chinese character radicals is missing.

My default font is Sans in Konqueror, SimHei (from Windows) in qtconfig and it appears that the rendering system cuts in when it detects another encoding and replaces a related font.

This rendering problem goes away entirely if you are able to choose the exact Chinese font in Konqueror.  Konqueror will then render correctly.  But, unfortunately, the Chinese fonts that look good in Chinese do not look that great for Western scripts. For example, SimHei, Arphic, et al. are not the best fonts for viewing kde.org.  But they are essential for viewing Chinese websites (e.g. tw.yahoo.com, www.hinet.net).

I'm gonna post an attachment with an example.  I'll continue to play with qtconfig some more, but I really hope that this issue receives a bit more attention.
Comment 33 jeff pitman 2004-04-25 16:07:27 UTC
Created attachment 5778 [details]
Comparing Konqi and Mozilla

This attachment shows the explained behavior from Comment #32.  The background
is Mozilla rendering using its per-script rendering facilities. The foreground
is Konqueror rendering using its pseudo-smart automagic rendering replacer
engine, which we all agree has some issues.

Maybe, if Konqueror cannot do per-script font configuration, how can we tap
into the magic of the "pseudo-smart automagic rendering replacer engine" in a
way that would allow Konqi to render more intelligently?

Food for thought: Konqi and Mozilla (I think it's using gtk/pango) are running
under the same X (4.3.0), ft2 (2.1.4).
Comment 34 Ken Deeter 2004-04-25 21:37:57 UTC
Everyone please vote this one up if you can. We need to draw attention to the remaining i18n bugs in kde/qt.
Comment 35 jeff pitman 2004-05-03 02:12:43 UTC
Created attachment 5859 [details]
Not only konqi, but this is a system-wide problem.

Most applications that do not use a specified font, will end up breaking the
output in one way or the other.
Comment 36 James Richard Tyrer 2004-05-03 04:32:44 UTC
First, I note that through discussing this issue, that I have educated myself on the subject and now know that UniCode will not solve this problem -- in fact, it is part of the problem.  So, I suggest that if developers do not understand why this is needed that they also do the necessary reading to educate themselves.

Yes.  Due to the fact that Unicode is not perfect, and the fact that Han equivalences tend to screw up CJK ideographic fonts, Qt needs a system similar to the one which Mozilla uses -- not just for Konqueror (as part of the browser) but as a part of: "qtconfig" (Which IIUC also has, or would have, a KDE frontend).  We also need this because the automatic substitution for non-CJK fonts is sometimes not as good as could be achieved with some configuration control -- all fonts do not have all glyphs and this situation needs to be better addressed.

We need to be able to configure the: "Default Proportional" font and: "Serif",  "Sans-serif", "Cursive", "Fantasy" & "Monospace" fonts for each script -- as listed in the Mozilla Fonts dialog (including three kinds of Chinese).  This will meet the needs of the web browser.  If these pseudo-fonts are available as additional choices wherever fonts can be chosen, this would appear to solve the problem -- but there is still an issue.  How is it going to be determined which font substitution to use?  There is no problem with Konqueror in many cases (currently almost all cases) since the web page should include the encoding.  But what happens if it uses UniCode?  This is going to take some thought to figure out exactly what we want to happen and how to accomplish it -- that is how the user interface should operate (I don't expect non-engineers to figure out the code).  Specifically, you are looking at something in CJK that is encoded in UniCode UTF8.  IIUC, there is no way to know whether to use Chinese, Japanese, or Korean and if Chinese which of the three to use.  Mozilla lists Simplified and TWO kinds of traditional.  It is possible that using the Locale setting could determine which of the five CJK (3*C + J + K) to use, but what if you are multilingual or for whatever reason do not want to use your Locale's glyphs for the Han ideograms?  This is clearly not a simple problem.

A larger issue is Qt's font substitution.  In applications that allow multiple font choices like KOffice, is the automatic font substitution a good thing?  Should it be possible for an application to turn it off?  And if such an application does not turn it off, should it be possible to determine which glyphs are being substituted and from which font?

This leaves the question of size adjustment unanswered.  This can be addressed in the browser by having the default size for Proportional and Monospaced for each of the scripts.

Perhaps we need to implement some of this first to see how the size issue works in other applications.

Clearly, we have reached the point where we have defined the problem -- people are getting the wrong fonts with automatic font substitution and we know that the traditional way of doing this in a web browser appears to work.  We now need to start thinking about the solution (do the engineering design work) and we will probably find that this is much more difficult.

--
JRT
Comment 37 Ken Deeter 2004-05-03 06:44:50 UTC
I wish there was a kde-i18n list or similar where we could discuss things like this. The i18n related messages on devel and core-devel tend to get lost.
Comment 38 Stian Haklev 2004-05-22 21:17:49 UTC
I was very happy to find this bug (by random, doing something else), finding that I was not the only one. Thus, I removed my previous Bug 79814. I completely agree with the above that KDE needs per script font selection and a built in extendable IME system. The point is not only to have font substitution work well, but also what you want - my system has many Chinese fonts, and the one I prefer for reading webpages cannot be determined by "which Chinese font looks the most like the one he selected for primary latin-based font". Also, the existing XIMs for Chinese are often very hard to configure (or at least it can easily happen that you can't get it work - I still haven't gotten Chinese input to work on my machine, even following all the documentation), and is obviously _obligatory_ for people in concerned countries. In fact I was thinking loosely about switching my girlfriends mother over to Linux, she only does chatting, emailing and webbrowsing, and downloads tons of viruses -- but before Chinese support in all KDE applications is absolutely fail-free (ie you open a unicode Chinese text file in Kwrite, and it displays in Chinese without you having to do anything etc), it will have to wait. This is one area where we are far behind Microsoft, and it doesn't suit us!
Comment 39 Datschge 2004-05-24 00:24:45 UTC
*** Bug 79814 has been marked as a duplicate of this bug. ***
Comment 40 James Richard Tyrer 2004-06-21 00:18:22 UTC
About implementation:

GNOME (I don't use GNOME, but I do have GNUMeric and The GIMP installed) appears to have three pseudo-fonts:

	Monospace
	Sans
	Serif

These appear to default to the corresponding BitStream Vera fonts.

It appears to me that if we could have some configuration option that would:

1.  Allow users to add additional pseudo-font names.

2.  Allow users to choose which fonts to use for these pseudo-font names for the various scripts (as listed in Mozilla).

That this would solve the problem.

Would it be possible for FontConfig to do this?  That is, if it has the necessary stuff added to the: "/etc/fonts/local.conf"?  It appears that the issue is that Qt is going to be starting with the UniCode character code for a glyph because these pseudo-fonts are going to have to be specified as UniCode.  So, something is going to have to intercept this call and substitute the desired font name and encoding for the glyph.  And, this is going to have to be on a glyph by glyph basis since if you mix Roman and Japanese you might have specified a different font for each of them (that is, after all, the purpose of this).

Perhaps we should look into how GNOME does this and expand on that so that it could be a FreeDeskTop standard.

Changing the component to general because this would be part of KDE.

--
JRT

	
Comment 41 James Richard Tyrer 2004-06-21 01:46:39 UTC
Addition to comment #40

According to FontConfig, the alias name: "sans" is deprecated and the correct one should be: "sans-serif".  Strange, GNUMeric still uses: "sans".

The FontConfig documentation seems to indicate that it *can* do this.  It refers to:

charset   FC_CHARSET        CharSet                    Unicode chars encoded by the font

But there is nothing in the documentation to indicate how it works. :-(

--
JRT
Comment 42 Ken Deeter 2004-06-21 03:05:16 UTC
My understanding of fontconfig is that there is some way in which one can specify a language when making a font request. The font selection mechanism will take this into account when picking fonts. This lets you say "sans" but have a Japanese font in Japanese situations and Chinese font in Chinese situations.
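To make the idea of a language-tagged font request concrete, here is a minimal Python sketch. It only builds fontconfig-style pattern strings of the form family:lang=xx. The helper name fontconfig_pattern and the list of language tags are assumptions made for illustration, and family names containing characters that are special in fontconfig patterns would need escaping, which this sketch does not attempt.

```python
def fontconfig_pattern(family, lang=None):
    """Build a fontconfig-style pattern such as 'sans-serif:lang=ja'.

    Hypothetical helper: it does no escaping of special characters.
    """
    pattern = family
    if lang:
        pattern += f":lang={lang}"
    return pattern


# The same generic alias, requested once per language context.
for lang in ("ja", "zh-tw", "zh-cn", "ko"):
    print(fontconfig_pattern("sans-serif", lang))
```

A string like sans-serif:lang=ja can then be handed to fontconfig, for example as the argument to fc-match, to see which font the current configuration picks for that language.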

I think the corresponding thing to do is to make Qt have the same kind of API, where a program can say give me "sans" for language X, and the returned QFont or whatever would encapsulate the whole combination of various fonts.

As far as functionality goes, fontconfig has almost everything that is needed, except for a way of listing available alias names (other than serif, sans-serif, and monospace, which are hardcoded), but I think keithp was mentioning something about changing this.

The problem with using fontconfig, as far as I can see, is the configuration may be "too flexible" as the XML stuff can at times seem like a full programming language. One thing we need to keep in mind is how we can provide a sensible UI for this that doesn't have a bazillion options. The last thing we want is for users to have to write the XML themselves. Maybe one of the mailing lists would be better for discussing this?
Comment 43 jeff pitman 2004-06-21 06:18:49 UTC
On Monday 21 June 2004 09:05, Ken Deeter wrote:
> I think the corresponding to do is to making qt have the same kind of
> API, where a program can say give me "sans" for language X, and the
> returned QFont or whatever would encapsulate all the combination of
> various fonts.

In my experience (running KDE 3.2.3/QT 3.3.2), this is exactly what's 
happening if you set the LANG to the encoding in question.  For 
example, LANG=zh_TW.Big5 with a default font of Sans actually replaces 
it with a Chinese equivalent.

There are two areas of possible font substitutions.  

First, in fontconfig, which you can manipulate in ~/.fonts.conf 
or /etc/fonts/local.conf for system-wide settings.  Here you can setup 
<alias> fonts that pick the best font based on the selected font name.  
This is probably the best place to add all those funky fonts that 
utilize Chinese names to alias them.  I'm not sure if fontconfig can 
handle this.

Second, in qtconfig, which is found in ~/.qt/qtrc.  There is a tagline 
called, surprisingly, [Font Substitutions].  This can also be modified 
in the "Fonts" tab inside qtconfig.  However, beware that the "Fonts" 
tab is a poorly designed layout so it is very confusing to use.

The above font substitution is a primitive one-to-one mapping without 
regard to any detected encoding system.

Although we have a very good foundation to perform good, consistent 
rendering of CJK fonts, there's absolutely tons of room for 
improvement.  Why hasn't there been much discussion about this before?  
Well, I think the audience that has wanted this has been a small 
portion of the user base.  Really, the audience for this bug request 
are those people who are multi-lingual, running the system in their 
Native Language (eg. LANG=en_US), and would like a consistent way of 
using other languages, such as Chinese in my case.

One possibility is to put the above request not in Konqueror, but 
actually in the Fonts tab of qtconfig (and, of course, Control Center). 
This will enforce a system-wide font substitution configuration based 
on the encoding auto-detection facilities already extant within the Qt 
library.  I'm not sure if this needs to be pushed all the way down to 
Fontconfig or not, because, I hate to be redundant: substitution works 
well and consistently in Mozilla, GNOME (Gtk/Pango, etc.).

> Maybe one of the mailing lists would be better for discussing this?

Yes, this is a good question.  I'd hate to start from square one on each 
list as we poke each one to figure out which one needs to be involved.  
It is most likely well-worth it to commit the cardinal cross-post sin 
on this one and see which mailing list it finally converges to; and, 
I'm not just talking about those within the kde community.  We're 
probably talking about the following lists:

kde-core-devel list
	Some trolls lurk on this, so they'd be in the know or could communicate 
direct to Qt about these issues.

http://www.qtforum.org/
	From a pure Qt app standpoint, this issue needs resolving too.

openi18n.org's openi18n-sa, openi18n-im lists
	What integrations need to happen at the LANG (locale) level to make 
multi-lingua useful.

freedesktop.org's fontconfig list
	It may be useful to make them aware of the issues.

Personally, I am willing to make a formal request out to each of these 
parties to see what action can be taken to make this issue better. I do 
okay on formulating emails, since the politics at my job dictates that 
I should. :P  Before engaging a bunch of people though, it's really 
necessary that we first list everyone involved, what their role is, if 
the issue has been brought up before, what have they done to resolve 
it, and what are they working on now to resolve it.

If any brave soul is willing to Google for a few hours, with a focus on 
the above lists, then that would be a boon to communicating this to the 
right people.  Again, I am willing to write up a couple of emails based 
on that information; but, I hope that some of you might gather it in on 
this bug note first before spewing out a response.

What do you think?

Comment 44 Ken Deeter 2004-06-21 07:10:27 UTC
> In my experience (running KDE 3.2.3/QT 3.3.2), this is exactly what's 
> happening if you set the LANG to the encoding in question.  For 
> example, LANG=zh_TW.Big5 with a default font of Sans actually replaces 
> it with a Chinese equivalent.
>
> There are two areas of possible font substitutions.  
> 
> First, in fontconfig, which you can manipulate in ~/.fonts.conf 
> or /etc/fonts/local.conf for system-wide settings.  Here you can setup 
> <alias> fonts that pick the best font based on the selected font name.  
> This is probably the best place to add all those funky fonts that 
> utilize Chinese names to alias them.  I'm not sure if fontconfig can 
> handle this.
>

There are actually two problems. We want to have a system that works out
of the box.. for this it is the distributor's job to do the configuring
so requiring them to do the fontconfig XML is not much of a problem.

For the user however, I think saying "edit your fonts.conf" is
unacceptable. We need a UI, and to have this UI, we need to think about
what we should have in that UI, and whether the fontconfig system can be
adapted to provide the required backend functionality.


Also, I should note that simply operating based on the current locale
does not work for a multi-lingual application. Take the case of Mozilla,
if you have it set up to use sans for both Japanese and Chinese pages,
fontconfig will choose the right font and the page will be displayed in
the right font, even though they are both set to sans. What this means
is that the application has to be aware of the language, and tell the
font system about this information. It's the really unfortunate side
effect of the Han unification scheme. An application can't just say 'oh
it's Unicode, so it covers everything' (I like to call these
things 'language contexts' but I'm not sure if there is a more common
term. An IRC application may have different languages for each channel,
for example, and these would be different language contexts. Fonts
should be chosen differently for each channel).

 
> Second, in qtconfig, which is found in ~/.qt/qtrc.  There is a tagline 
> called, surprisingly, [Font Substitutions].  This can also be modified 
> in the "Fonts" tab inside qtconfig.  However, beware that the "Fonts" 
> tab is a poorly designed layout so it is very confusing to use.
> 
> The above font substitution is a primitive one-to-one mapping without 
> regard to any detected encoding system.
>

There are many many problems with the current Qt scheme.

1) Substitution happens over ranges, and is not per character. So for
example, if Bitstream Vera Sans is substituted by a Japanese font and then
a Chinese font, the Han characters that are in the Chinese font but not in
the Japanese one will not get displayed, because Qt sees the Japanese font
as covering the Han range and therefore never goes beyond that.

2) There is no way to say "for all fonts that don't provide chars for
this script, use this font".. many times I want to say, "use Kochi
Gothic (a Japanese font) as a fallback for all fonts". There's no way to
accomplish this w/o going thru each font and adding it as a 
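The difference between per-range and per-character substitution in point 1 can be modelled in a few lines of Python. This is only a toy model of the behaviour described above, not Qt's actual code: the font names, their coverage tables, and the function names are all made up for illustration.

```python
# Hypothetical coverage tables: the characters each font can draw.
FONTS = {
    "Latin Sans": set("abc"),
    "Japanese Gothic": set("日本語桜"),
    "Chinese Ming": set("日本語們"),
}
# Substitution order: Latin font first, then Japanese, then Chinese.
CHAIN = ["Latin Sans", "Japanese Gothic", "Chinese Ming"]


def is_han(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def pick_per_range(ch):
    """The first font with any glyph in the character's range owns that range."""
    for name in CHAIN:
        if any(is_han(c) == is_han(ch) for c in FONTS[name]):
            # The search stops here even if this font lacks the glyph.
            return name if ch in FONTS[name] else None
    return None


def pick_per_char(ch):
    """The first font that actually contains this character wins."""
    for name in CHAIN:
        if ch in FONTS[name]:
            return name
    return None


for ch in "a桜們":
    print(ch, pick_per_range(ch), pick_per_char(ch))
```

With these made-up tables, 們 gets no font under the per-range rule, because the Japanese font already claimed the Han range, while the per-character rule finds it in the Chinese font.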

 
> Although we have a very good foundation to perform good, consistent 
> rendering of CJK fonts, there's absolutely tons of room for 
> improvement.  Why hasn't there been much discussion about this before?  
> Well, I think the audience that has wanted this has been a small 
> portion of the user base.  Really, the audience for this bug request 
> are those people who are multi-lingual, running the system in their 
> Native Language (eg. LANG=en_US), and would like a consistent way of 
> using other languages, such as Chinese in my case.
>

There is also a chicken-and-egg problem. Without good language support,
many users are discouraged from using the system. They don't get so far
as filing bugs or voting for bugs. Qt/KDE has always had font rendering
problems as far as Japanese users are concerned, and the only solution
has been to just set a Japanese font for everything, as the substitution
doesn't work too well. There have been patches for qt floating around,
but they tend to be language specific... so they are not known outside
of the communities from which they originated. And they get lost when
the qt version upgrades..

 
> One possibility is to put the above request not in Konqueror, but 
> actually in the Fonts tab of qtconfig (and, of course, Control
> Center). This will enforce a system-wide font substitution
> configuration based on the encoding auto-detection facilities already
> extant within the Qt library.  I'm not sure if this needs to be pushed
> all the way down to Fontconfig or not, because, I hate to be
> redundant: substitution works well and consistently in Mozilla, GNOME
> (Gtk/Pango, etc.).
> 

I agree with the general principle that we do not want to re-invent the
wheel w/ respect to the functionality that fontconfig provides. However,
we have to think of a layered scenario, and we also always have to
remember that the user may have a preference.

From the thinking that I've done so far we need several levels of
configuration.

1) App-level configuration. So konqueror can look however the user
wants, w/o affecting the rest of the system. Word processors or any other
text-based application will probably need this.

2) System-level configuration. This would correspond to the font dialog
in the control center right now. This affects all apps unless they
choose to have their own preferences.

3) Default fontconfig fallback configuration.

IMHO 1 and 2 need a UI. 3 can be editing text files. I think the key
approach as far as KDE is concerned is to provide some kind of software
component that can be re-used for 1 and 2, since the general idea is the
same.
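As a rough illustration of the three levels, the lookup could be sketched in Python with collections.ChainMap, where the application settings shadow the system settings, which in turn shadow the fontconfig defaults. The dictionaries, their keys, and the font names below are placeholders chosen for the example, not a proposal for real defaults.

```python
from collections import ChainMap

# Placeholder settings keyed by (generic alias, language tag).
fontconfig_defaults = {
    ("sans-serif", "ja"): "Kochi Gothic",
    ("sans-serif", "zh-tw"): "PMingLiU",
}
system_settings = {("sans-serif", "zh-tw"): "Example System Ming"}
app_settings = {("sans-serif", "ja"): "Example App Mincho"}

# Earlier maps win: application, then system, then fontconfig defaults.
layers = ChainMap(app_settings, system_settings, fontconfig_defaults)


def resolve(alias, lang):
    """Return the configured font, or the bare alias if no layer has an entry."""
    return layers.get((alias, lang), alias)


print(resolve("sans-serif", "ja"))
print(resolve("sans-serif", "zh-tw"))
print(resolve("serif", "ko"))
```

The same component could serve levels 1 and 2 by swapping in a different top layer, which is the kind of reuse mentioned above.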

We also have to remember that configuring substitution fonts is actually
a separate 'dimension' of font configuration (just as font substitution
configuration and font selection are done in different places in kde
right now)


> If any brave soul is willing to Google for a few hours, with a focus
> on the above lists, then that would be a boon to communicating this to
> the right people.  Again, I am willing to write up a couple of emails
> based on that information; but, I hope that some of you might gather
> it in on this bug note first before spewing out a response.
> 

Not sure if I have the time, but the fontconfig list does sound good to
me. It also seems like core-devel would be a good place, since this is
a far-reaching issue.

The other thing to note is that Qt4 is coming out with a new text
rendering architecture that might address some of these problems.
No one has seen the code so far, so it's speculation for now, but since the
tech preview is due out in a few weeks, it might be worth it to wait and
see what they do. 

Comment 45 James Richard Tyrer 2004-06-21 21:01:52 UTC
I would suggest that preliminary discussions be on: kde-quality

The current documentation for FontConfig:

http://pdx.freedesktop.org/~fontconfig/fontconfig-user.html

states that it will handle the properties: 

  charset         CharSet Unicode chars encoded by the font
  lang            String  List of RFC-3066-style languages this font supports

I still have not been able to find anything that explains: "charset" but the: "lang" property is fairly obvious if you look at a cache file.  It appears that using: "lang" that FontConfig could operate as a backend to select the fonts using: "lang" to separate Japanese and Chinese.  I found a site that has a list of these codes:

http://www.w3schools.com/vbscript/func_getlocale.asp

However, to use these in a back end, the front end (Qt) would have to supply this information and I don't think that it currently would do that -- if it did anything, it would just supply the locale code from the users environment.  

Web pages should have this code, and there needs to be some way to use it to select the correct font -- something in the front end has to pass this to FontConfig.

OTOH, the property: "charset" would appear to be generated by FontConfig (I am guessing) based on the range (or page) in UniCode.  I am presuming that Qt does call FontConfig with a 16 bit character code.  But, this will not distinguish between Japanese and Chinese (or between different types of Chinese).  So, this is why I said that it needs to be possible to add additional alias font names.  If the user needs both Japanese and Chinese (s)he is going to have to have an alias: "sans-serif-jp" and an alias: "sans-serif-zh".  I see no other way to do this since the software can not be psychic.
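A small Python sketch of why the character code alone is not enough: a unified Han character reports the same Unicode name whatever language the text is in, so the language has to come from somewhere else (for example a web page's language tag) and be mapped to an alias. The alias names follow the "sans-serif-jp" and "sans-serif-zh" suggestion above; the mapping table and the function name are assumptions for illustration.

```python
import unicodedata

# Hypothetical per-language aliases, following the naming suggested above.
LANG_ALIASES = {"ja": "sans-serif-jp", "zh": "sans-serif-zh"}


def alias_for(lang_tag, default="sans-serif"):
    """Map a language tag such as 'zh-TW' to an alias via its primary subtag."""
    primary = lang_tag.lower().split("-")[0]
    return LANG_ALIASES.get(primary, default)


ch = "直"  # drawn differently in Japanese and Chinese typography
# The name depends only on the code point, not on the language of the text.
print(unicodedata.name(ch))
for tag in ("ja", "zh-TW", "en-US"):
    print(tag, alias_for(tag))
```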

--
JRT
Comment 46 Rick Graves 2005-02-10 14:22:09 UTC
KDE 3 broke Chinese.

I started with a test bench PC and a blank hard drive.  I installed CentOS 3 (like RHEL 3), and while doing so I installed additional languages, both traditional and simplified Chinese.  I set the session to KDE, went in, and took a screenshot of Konqueror:

http://www.advanced-app.com.hk/MiscJunk/KonquererB4.png

(Sorry for the name misspelling in my file name.)

Then I upgraded KDE, went back in, and took a screenshot of Konqueror:

http://www.advanced-app.com.hk/MiscJunk/KonquererAfter.png

KDE 3 broke Chinese.

I need Chinese.  As a work around, I can go into Gnome, or use a Windows box.  
Comment 47 Nan Zou 2005-02-10 17:39:40 UTC
Created attachment 9540 [details]
Konqueror 3.3.1 display Chinese page

Re: #46

Looks like a font substitution problem, it couldn't find the correct glyphs
from the current font.  Check the font substitution section in your ~/.qt/qtrc,
here's mine:

[Font Substitutions]
Bitstream Vera Sans=SimSun^ePMingLiU^e
Bitstream Vera Sans Mono=SimSun^eMingLiU^e
LucidaTypewriter=SimSun^e
Verdana=SimSun^ePMingLiU^e
arial=helvetica^eSimSun^ePMingLiU^e
helv=helvetica^e
tms rmn=times^e

That same page looks fine on my KDE 3.3.1.
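
For anyone who wants to inspect such a section programmatically, here is a minimal Python sketch that reads it with the standard configparser module. It assumes, based only on the sample above, that each value is a list of font names and that every name is terminated by ^e; the function name is made up for this example.

```python
import configparser

SAMPLE = """\
[Font Substitutions]
Bitstream Vera Sans=SimSun^ePMingLiU^e
Bitstream Vera Sans Mono=SimSun^eMingLiU^e
LucidaTypewriter=SimSun^e
Verdana=SimSun^ePMingLiU^e
arial=helvetica^eSimSun^ePMingLiU^e
helv=helvetica^e
tms rmn=times^e
"""


def read_substitutions(text):
    """Return {family: [substitute, ...]} from a qtrc-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of font family names
    parser.read_string(text)
    section = parser["Font Substitutions"]
    # Assumption: "^e" terminates each entry, so drop the empty tail.
    return {
        family: [name for name in value.split("^e") if name]
        for family, value in section.items()
    }


subs = read_substitutions(SAMPLE)
for family, substitutes in subs.items():
    print(family, "->", substitutes)
```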
Comment 48 James Richard Tyrer 2005-02-10 21:16:41 UTC
Created attachment 9543 [details]
Screen shot Google-HK

Re: Comment #46

There are problems, but this isn't the cause of your problem.

The attached screen shot uses Arial as the "Standard font" in Konqueror.

-- 
JRT
Comment 49 Rick Graves 2005-02-10 21:33:40 UTC
Hello Nan Zou,

The change you suggested for ~/.qt/qtrc helped a lot
-- I can see Chinese characters on the HK Google page
again.

One small problem remains, however -- Chinese
characters do not display on the English HK Google
page.

Thanks,

Rick



Comment 50 Rick Graves 2005-02-11 08:27:18 UTC
Hello again Nan Zou,

There must be more to it than just updating the qtrc
file.  As I wrote before, it worked for me.  That was
on my home computer.  Here at the office, it has not
helped.  If anything, it hurt -- not only does Chinese
not display on the Google page, but the proportional
fonts look terrible.  

So here I still have a problem.

Thanks,

Rick


Comment 51 Nan Zou 2005-02-11 17:30:25 UTC
Rick,
    Without knowing the details of your system configuration, I can't offer much help.  Email me privately with your setup information if you want to pursue this further.
Comment 52 Rick Graves 2005-03-03 10:08:35 UTC
Hey,

With Nan Zou's help, I was able to put Chinese back on my computers.  

I offer this information in the hope that it will help KDE better handle multiple languages in the future.  

From a Windows computer that can display Chinese, I grabbed all the *.ttc fonts:

batang.ttc
gulim.ttc
mingliu.ttc
msmincho.ttc
simsun.ttc

I put them in /usr/X11R6/lib/X11/fonts/TTF.  

I added this line to fonts.conf in /etc/fonts:

<dir>/usr/X11R6/lib/X11/fonts/TTF/</dir>

Voila! Chinese was back.

But fonts.conf says at the top:

DO NOT EDIT THIS FILE.
IT WILL BE REPLACED WHEN FONTCONFIG IS UPDATED.
LOCAL CHANGES BELONG IN 'local.conf'.

So I put this line in local.conf:

  <dir>/usr/X11R6/lib/X11/fonts/TTF/</dir>

There were already other directory lines in fonts.conf, but there were not any in local.conf before I added the one above.
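
For reference, a minimal local.conf holding just that directory line can be generated with a short Python sketch like the one below. It assumes the usual fontconfig file layout (an XML declaration, the fonts.dtd DOCTYPE, and a fontconfig root element); the function name is made up, and the output is printed rather than written to /etc/fonts so nothing on the system is changed.

```python
import xml.etree.ElementTree as ET

HEADER = '<?xml version="1.0"?>\n<!DOCTYPE fontconfig SYSTEM "fonts.dtd">\n'


def local_conf_with_dirs(font_dirs):
    """Return the text of a minimal local.conf listing extra font directories."""
    root = ET.Element("fontconfig")
    for path in font_dirs:
        ET.SubElement(root, "dir").text = path
    return HEADER + ET.tostring(root, encoding="unicode") + "\n"


text = local_conf_with_dirs(["/usr/X11R6/lib/X11/fonts/TTF/"])
print(text)
```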

I hope it keeps working.

My apologies if my problem and the workaround do not fix this bug exactly.  Somehow, I got into this one when I had a problem.

Rick
Comment 53 Maksim Orlovich 2005-06-27 15:45:30 UTC
*** Bug 108201 has been marked as a duplicate of this bug. ***
Comment 54 Mohd Asif Ali Rizwaan 2005-07-01 07:47:56 UTC
Manual Override: I-Robot anyone?

Even in 2035 Cars have manual-override, Will Smith would have died if the car did not allow manual override ;) But KDE is not having manual override for locale fonts, which is killing KDE unicode users by Qt-NS5 ugly fonts ;)

I just can't have many Hindi fonts (decorative, etc.) installed because Qt chooses what it deems best for me! That's outrageous. And testing new Hindi fonts is also troublesome (I have to uninstall all other fonts and restart KDE).
Comment 55 Youssef Chahibi 2006-07-03 02:30:19 UTC
GTK does it, so why wouldn't Qt? This problem is a general Qt problem.
Comment 56 Carl Tzeng 2007-03-04 07:00:06 UTC
Substitute fonts in qtconfig don't work well on my system.

I have to use traditional Chinese and Japanese at the same time.

I configured Chinese fonts before Japanese fonts in the alias list of fonts.conf. Then I set Japanese fonts to substitute for sans-serif, serif, and my Chinese fonts.

Something strange happened: my traditional Chinese characters are displayed quite well, but Japanese characters such as 涙 and 桜 are displayed as squares in text that contains both Chinese and Japanese.

The Japanese characters I typed with gcin (an input method) are displayed as squares too.

Then I configured Japanese fonts before Chinese fonts in fonts.conf. I set Chinese fonts to substitute for sans-serif, serif, and my Japanese fonts.

My Japanese characters are displayed quite well, while some traditional Chinese characters are displayed as squares in text that contains both Chinese and Japanese.


To sum up, the qtconfig seems to work on only one substitute font.
Comment 57 Carl Tzeng 2007-03-04 07:09:12 UTC
Will this problem be solved in KDE4? I don't want to change my environment to GNOME at all. This problem has existed for about five years. I hope it won't exist for the next five.
Comment 58 Vincent Petry 2008-06-25 16:43:28 UTC
This bug was opened in 2002, it is now 2008, and it's still "NEW". Has anything been done so far? If it has been fixed in Qt4, please close the bug and refer to the fix. Thanks.
Comment 59 James Richard Tyrer 2008-06-25 23:42:29 UTC
This problem isn't going to be fixed in Qt because Trolltech thinks that Unicode is the answer to such issues.

1.  For the Generic font names, the problem needs to be handled in FontConfig and the solution would be a KDE applet to set this up.  You need to be able to choose a default font for each of the 5 generic font names and then for a selected encoding, or Unicode page, choose an override.

2.  For HTML (etc.), the answer would appear to be what Firefox uses.  You can choose a default font for three of the generic fonts for each of the listed languages *and* it lets you choose to override the fonts specified for these in the HTML code.

3.  For KDE: where you choose a font, you need to also be able to choose an override for a selected encoding or Unicode page.

The problem is that this doesn't always solve the problem.  The reason is that the encoding, or unicode page, does not always determine the language being used.  Specifically, Japanese, Korean, 3 types of Chinese (and then there is  Canadian [yes Canadian]) can not always be determined by the encoding, or Unicode page. 

HTML pages contain a locale code, so that should solve the problem there.  But, how do we solve it for other cases?  The above would appear to be a great improvement, but there may still be problems.  Specifically with CJK languages there is the Han unification issue -- these characters have the same Unicode character code and the same meaning but the actual glyphs are going to be different in Japanese, Korean, and the 3 types of Chinese.
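To make the ambiguity concrete, here is a minimal Python sketch. The block ranges are a small, incomplete subset, and the names `describe`, `script_of` and `guess_languages` are made up for illustration; this is not how Qt, Pango or FontConfig decide anything. It only shows that a run containing kana or Hangul gives the language away, while a run of pure Han characters does not.

```python
import unicodedata

# A few Unicode block ranges (deliberately incomplete; illustration only).
RANGES = {
    "Hiragana": (0x3040, 0x309F),
    "Katakana": (0x30A0, 0x30FF),
    "Hangul": (0xAC00, 0xD7A3),
    "Han": (0x4E00, 0x9FFF),
}


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def script_of(ch):
    """Map a character to one of the blocks above, or 'Other'."""
    cp = ord(ch)
    for name, (lo, hi) in RANGES.items():
        if lo <= cp <= hi:
            return name
    return "Other"


def guess_languages(text):
    """Return the set of language tags the text could plausibly be in."""
    scripts = {script_of(ch) for ch in text}
    if scripts & {"Hiragana", "Katakana"}:
        return {"ja"}  # kana only occurs in Japanese text
    if "Hangul" in scripts:
        return {"ko"}
    if "Han" in scripts:
        # Unified Han: the code points alone cannot tell these apart.
        return {"ja", "ko", "zh-cn", "zh-tw"}
    return set()
```

For a string like "桜の花" the kana narrows the guess to Japanese, but for a lone "骨" every CJK language remains possible, so the language has to come from somewhere else (an HTML lang attribute, the locale, or a user setting).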

Word processing documents could contain locale codes.  Documents that are text/plain do not even contain an encoding and therefore require the user to tell the system what the character encoding is.  Email appears to still be based on character encoding although it could contain a locale code.

It was claimed that GTK (or actually Pango) fixes this, how does it do this?

So, IMHO, this is still an issue that can be partially resolved but still needs some additional work to totally resolve it.
Comment 60 Abel Cheung 2008-06-29 20:58:30 UTC
In reply to comment #59:

> It was claimed that GTK (or actually Pango) fixes this, how does it do this?

No, this is not true. The situation is, Behdad used a "mathematical" proof to prove that the bug is completely impossible to solve. His offered choices are:

1. set system locale to non-english one (irrelevant to this bug)
2. set $PANGO_LANGUAGE variable to force single locale within pango (and pango only)
3. Remove all crappy fonts picked up by fontconfig 
4. fuck off, bastard
Comment 61 Kjang Kwreuug-Kuq 2008-07-05 10:34:48 UTC
@James Richard Tyrer:
  You said, "Specifically with CJK languages there is the Han unification issue -- these characters have the same Unicode character code and the same meaning but the actual glyphs are going to be different in Japanese, Korean, and the 3 types of Chinese." I don't agree with this. AFAIK, if two characters have the same shape, they are encoded as one even if they have different pronunciations or meanings. I don't see why the display of a character should be tied to the locale, since a code stands for a shape.

@Abel Cheung:
  I will give it a try. What I want to say is that Pango at least gives a partial solution. Do any users pay attention to $PANGO_LANGUAGE? If everything is okay, nobody will touch LC_ALL for setting fonts, let alone $PANGO_LANGUAGE!

Hope this can be fixed soon.
Comment 62 Kjang Kwreuug-Kuq 2008-07-05 10:50:48 UTC
Here (http://www.unicode.org/faq/han_cjk.html#0), the Unicode FAQ says:
Q: Does the Unified Han character encoding in Unicode mean that I only need one CJK font for Asia, or do I have to allow for choices between different styles of CJK fonts for different countries?
A: Broadly speaking, there are four traditions for character shapes in East Asia: traditional Chinese (used primarily in Taiwan, Hong Kong, and overseas Chinese communities), simplified Chinese (used primarily in mainland China and Singapore), Japanese, and Korean. Using a single font for all four locales allows the characters to be legible, but means that some characters may look odd. For optimal results a system localized for use in Japan, for example, should use a font designed explicitly for use with Japanese, rather than a generic Unihan font.

Users may have their locale set, and this might help. But Unicode thinks that using one font should still be legible. I think we shouldn't care so much about whether the user did something or broke something. If everything is legible, no user will need to do that. Pango got it right.
Comment 63 James Richard Tyrer 2008-07-05 13:09:53 UTC
IIRC, the largest differences are with Japanese.  Some of the Han characters in a Japanese font are the same or similar to those in a Chinese font, but some of them are completely different.  So, if you have both Chinese and Japanese in the same document, it is a real problem.

So, that is one small problem that being able to choose an override font for the languages listed in Firefox isn't going to fix.  But, that would solve most of it.  This still wouldn't mean that it would look for missing glyphs in other fonts.  I don't know if you can set that up with FontConfig or not.

The other question is: how exactly does Pango determine which CJK language is being used?
Comment 64 Cheng Chia, Tseng 2008-07-18 16:02:52 UTC
As far as I know, if you are using the zh_TW.UTF-8 locale and there is a font whose language tag is zh-tw, fontconfig will use only that font, regardless of whether some of the characters to be displayed are missing from it.

However, Pango can search for the missing characters and fill them in from other fonts. So, if your system has Traditional Chinese, Simplified Chinese, Japanese, and Korean fonts, you will probably have no problems displaying them in GTK2-based programs.

So why can't Qt do the same? Or just use the same idea that Pango has been using?
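The idea of filling in missing characters from other fonts can be sketched in a few lines of Python. This is only an illustration of per-character fallback, not Pango's actual algorithm; the font names and their coverage sets below are invented.

```python
def split_into_font_runs(text, fonts):
    """Split text into runs, each rendered with the first font covering it.

    fonts is an ordered list of (font_name, set_of_covered_characters).
    Characters that no font covers get the font name None (the "square").
    """
    runs = []
    for ch in text:
        font = next((name for name, coverage in fonts if ch in coverage), None)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


# Hypothetical fonts: the preferred Traditional Chinese font comes first,
# a Japanese font follows as fallback.
FONTS = [
    ("ExampleTC", set("角骨淚櫻")),
    ("ExampleJP", set("角骨涙桜")),
]

print(split_into_font_runs("角骨涙桜", FONTS))
```

With this ordering, the shared characters 角 and 骨 come from the Traditional Chinese font, while 涙 and 桜 fall through to the Japanese one instead of showing up as squares.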
Comment 65 Médéric Boquien 2008-07-18 16:29:06 UTC
The solution is, in a way, quite simple. OpenType fonts handle local variants according to the language. So to fix it, fonts containing Asian glyphs should have the variants for Chinese, Japanese, and Korean. According to the language set in the page or locale, the right one would be picked automatically. So basically the problem is not in Qt or KDE, which do the right thing, but with font makers who do not use the possibilities of OpenType.
Comment 66 Cheng Chia, Tseng 2008-07-19 09:00:57 UTC
However, no font includes all CJK characters! In fact, it is practically impossible to complete that kind of work.

We are not talking about the variants of Han-zi characters! We are talking about the characters missing from each font! For example, a Traditional Chinese font won't have 涙 and 桜, which Japanese fonts have. A Traditional Chinese font won't have the characters that a Japanese font should have, although there are some shared Han-zi characters with different variants, such as 角 and 骨.

If a document includes Traditional Chinese, Simplified Chinese, Japanese, and Korean characters that are not shared Han-zi characters, a good program should display all of them well. GTK2-based programs do so if you have those fonts. However, Qt-based programs display only the ones that match your locale.
Comment 67 David Faure 2008-07-23 13:54:55 UTC
On Friday 18 July 2008, Cheng Chia, Tseng wrote:
>  Pango can search for the missing characters 


Qt does that as well, AFAIK.
Comment 68 Cheng Chia, Tseng 2008-07-24 09:27:49 UTC
>Qt does that as well, AFAIK. 

So, what is the problem in Qt that keeps Qt-based programs from displaying CJK characters at the same time, although you have those fonts installed?

Thus, I don't think QT does that as well.
Comment 69 David Faure 2008-07-24 18:38:43 UTC
On Thursday 24 July 2008, Cheng Chia, Tseng wrote:
> So, what is the problem in Qt that keeps Qt-based programs from displaying CJK characters at the same time, although you have those fonts installed?


I don't know. Some bug. Needs debugging (*)
But I distinctly remember Lars Knoll saying Qt (even in Qt3) was able to grab glyphs from other fonts when needed.

(*) which means, digging into Qt or sending a testcase for Trolltech; maybe just an html document for the qtextedit example.
Comment 70 Maksim Orlovich 2008-07-24 19:05:55 UTC
Qt3 certainly did a poor job of it --- it could not pick up stuff like various math symbols, they would just come up as boxes. Qt4 does at least that fine. (Can't comment on CJK).
Comment 71 Carl Tzeng 2008-07-28 09:06:21 UTC
Is this bug (http://trolltech.com/developer/task-tracker/index_html?method=entry&id=165162) in QT4 related to this problem?

If so, I hope it won't just stay pending there!
Comment 72 Dotan Cohen 2011-07-25 15:48:25 UTC
*** Bug 14496 has been marked as a duplicate of this bug. ***
Comment 73 Dotan Cohen 2011-07-25 15:57:02 UTC
Carl, can you please update that URL to the relevant Qt issue? Thanks!
Comment 74 Christoph Feck 2011-07-25 21:22:31 UTC
As far as I know, it is possible to configure per-language fonts in fonts.conf. We just need a way to (graphically) edit that file. Moving to correct component.
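A minimal Python sketch of what such an editor might write out. The `<match>`, `<test name="lang">` and `<edit name="family">` elements are ordinary fontconfig syntax, but the helper names (`lang_rule`, `build_fonts_conf`) and the font families are made up for illustration, and a rule like this only takes effect when the application passes a language to fontconfig in the first place.

```python
import xml.etree.ElementTree as ET


def lang_rule(lang, generic, family):
    """Build one <match> rule: for text tagged `lang`, prefer `family`
    whenever the generic family `generic` is requested."""
    match = ET.Element("match", target="pattern")
    test_lang = ET.SubElement(match, "test", name="lang", compare="contains")
    ET.SubElement(test_lang, "string").text = lang
    test_family = ET.SubElement(match, "test", name="family")
    ET.SubElement(test_family, "string").text = generic
    edit = ET.SubElement(match, "edit", name="family", mode="prepend",
                         binding="strong")
    ET.SubElement(edit, "string").text = family
    return match


def build_fonts_conf(choices):
    """choices maps (lang, generic_family) -> preferred font family.

    Returns the <fontconfig> element as a string. A real fonts.conf would
    also start with an XML declaration and the fonts.dtd DOCTYPE line.
    """
    root = ET.Element("fontconfig")
    for (lang, generic), family in choices.items():
        root.append(lang_rule(lang, generic, family))
    return ET.tostring(root, encoding="unicode")


# Hypothetical user choices, as a settings dialog might collect them.
print(build_fonts_conf({
    ("ja", "sans-serif"): "Example Gothic",
    ("zh-tw", "sans-serif"): "Example Hei",
}))
```

A graphical module would only need to collect the (language, generic family, font) triples and regenerate the user's fonts.conf from them.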
Comment 75 Mohd Asif Ali Rizwaan 2011-08-13 21:39:55 UTC
Well, this is called 9 years of silent treatment by the Developers for the users.

It is a tradition in KDE, first remove a good feature from an application/system and ignore the users' request/demands/needs.

1. In KDE 1.1, kfm's location bar had a Filter feature, which was removed in KDE 2 and came back some 10-12 years later in Dolphin (but not in Konqueror, the file manager);
2. KDE 2 - Font Selection, as this report says, is gone; maybe it might come in KDE 5 or 6
3. KDE 3.x - Windows/Meta/Super key to launch the menu and assign shortcuts; still not coming... because one genius developer did not like the code for one button with multiple roles.
4. KDE 4.7 - vertical Back  button from kickoff

and unnecessarily force confusing stuff like Activities, the hated cashew, Nepomuk, and other fancily named, incompetent stuff on the users.

So, History proves that the KDE developers do not care about users' opinion, ease of use, or needs.

KDE 3 started to become THE DESKTOP with great applications, but KDE4 failed; Still buggy, incomplete, inconsiderate, and a DEMO software of QT toolkit.

Obviously, KDE is not democracy where users have any say; it is authoritarianism.

I'm not criticizing but stating the fact which I have been observing for many many years.

I see Gnome 3 series will be the answer to the desktop users. KDE applications simply do not deliver what they promise. But Qt applications which are exceptionally feature rich and competent, do deliver. vlc, smplayer, qt-this, qt-that, qbittorrent, goldendict, skype, etc.. etc...

Now, related to this bug: the Qt app MDict does support multiple fonts for different locales. How can MDict do that when KDE can't (or doesn't want to)?

http://mdic.gnufolks.org/
Comment 76 Christoph Feck 2011-08-13 22:09:27 UTC
I just downloaded MDict 0.8.1, and I fail to see where/how it supports different fonts for different locales. Can you clarify why you believe it does per-locale font selection?
Comment 77 Christoph Feck 2013-02-11 02:28:46 UTC
*** Bug 314805 has been marked as a duplicate of this bug. ***
Comment 78 Eike Hein 2015-01-18 15:14:48 UTC
I've been researching some ways to address this problem (unfortunately not much progress since the blog due to other work): https://blogs.kde.org/2014/09/11/beyond-unicode-closing-gap-support-mixed-character-set-text-kde-workspaces
Comment 79 Andrey 2020-06-19 12:24:55 UTC
The program in the text box does not display Russian letters, but only Latin and numbers.