Bug 465384 - Support a way to set a preference for CJK character variants (Han unification) in KCM or derive it automatically
Status: CONFIRMED
Alias: None
Product: systemsettings
Classification: Applications
Component: kcm_fonts
Version: 5.26.5
Platform: Other Other
Importance: NOR wishlist
Target Milestone: ---
Assignee: Plasma Bugs List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-06 18:01 UTC by wazhai
Modified: 2024-10-13 21:18 UTC
CC List: 8 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
old vs new KCM differences (121.33 KB, image/png)
2023-02-06 18:01 UTC, wazhai
secondary language Chinese (30.10 KB, image/png)
2023-02-06 18:16 UTC, wazhai
fonts.conf (921 bytes, application/xml)
2023-02-07 11:56 UTC, wazhai

Description wazhai 2023-02-06 18:01:27 UTC
Created attachment 156001 [details]
old vs new KCM differences

SUMMARY
Right now there doesn't seem to be a way to set the user's preference for CJK characters such as 今 and 置 that are displayed differently across Chinese, Korean and Japanese.

Before the language/format KCM merge, after much trouble, I discovered it was possible to change this using Region in the old KCM (a crude solution). See attachment, tested on Kubuntu 22.04 with Plasma 5.24.

However, setting the region isn't available in the new KCM. I'm unable to find a way to switch away from Japanese character variants on Fedora, even when the system's primary language is set to Chinese.

STEPS TO REPRODUCE
1. On a recent Fedora KDE, install with English as the system language (maybe not needed)
2. It displays Japanese variants by default
3. Add Chinese to the language list in an attempt to display Chinese variants and restart session
4. It doesn't work

OBSERVED RESULT
There seems to be no way to set this preference.

EXPECTED RESULT
There should be a way to set a Unicode CJK variant preference in the systemsettings GUI. On Windows and Android, for example, if Chinese or Japanese is in the language list (even as a secondary language below the primary UI language), the system displays that language's variants of the affected characters whenever possible.

Is there something KDE can do about this and apply the appropriate setting to the system according to the language list so that this is achieved?

SOFTWARE/OS VERSIONS
Operating System: Fedora Linux 37
KDE Plasma Version: 5.26.5
KDE Frameworks Version: 5.102.0
Qt Version: 5.15.8
Comment 1 wazhai 2023-02-06 18:16:41 UTC
Created attachment 156002 [details]
secondary language Chinese

Apologies, I made a mistake in the original attachment. Fedora actually shows the correct variants when Chinese is the primary language.

However, the point still stands: when Chinese is listed below English, the system shows Japanese variants. There is no way in the GUI to specify Chinese variants as the preference.
Comment 2 hanyoung 2023-02-07 04:15:32 UTC
Both the old and new KCMs only write ~/.config/plasma-localerc, so check whether that file differs between the two. Make sure you test after a reboot.
Comment 3 wazhai 2023-02-07 05:15:41 UTC
Here are the contents on my English-language Kubuntu system where setting a CJK preference through Region in the old KCM works:
KDE Language list - en_GB only; Formats > Region > ja_JP or zh_CN, etc to set CJK pref.
$ cat .config/plasma-localerc 
[Formats]
LANG=ja_JP.UTF-8
LC_MEASUREMENT=en_IE.UTF-8
LC_MONETARY=en_IE.UTF-8
LC_NUMERIC=en_IE.UTF-8
LC_TIME=Default.UTF-8
[Translations]
LANGUAGE=en_GB

Fedora system where it doesn't work by putting Chinese below English (a reboot doesn't help):
$ cat ~/.config/plasma-localerc
[Formats]
LANG=en_GB
LC_ADDRESS=en_IE.UTF-8
LC_MEASUREMENT=en_IE.UTF-8
LC_MONETARY=en_IE.UTF-8
LC_NUMERIC=en_IE.UTF-8
LC_PAGE=en_IE.UTF-8
LC_TELEPHONE=en_IE.UTF-8
LC_TIME=C
[Translations]
LANGUAGE=en_GB:C:zh_CN

This can be considered a feature request: determine how to approach this, then implement it in the new KCM in a better and clearer way than the old KCM did.
Comment 4 hanyoung 2023-02-07 05:26:22 UTC
CJK preference is not supported by POSIX nor ICU, and what you did in the old KCM is very hacky. By definition, LANG specifies the "master" language of the system, and LANGUAGE configures additional fallback languages for when there is no translation for LANG.

That is, LANGUAGE should include LANG. You've set LANG to ja_JP and LANGUAGE to en_GB. That's actually undefined behavior: an application can show either Japanese or British English. If you want to rely on this behavior, you can edit ~/.config/plasma-localerc yourself.
Comment 5 wazhai 2023-02-07 05:42:45 UTC
"CJK preference is not supported by POSIX nor ICU"

This is a serious issue with Linux localization and if there isn't a good way to do it in the GUI, without diving into command line and system configs, then it should be investigated to see what can be done on the KDE side.

I'm rather disappointed to see this slapped with NOTABUG when it's a feature on e.g. Android and Windows that works as expected using the language list.
Comment 6 hanyoung 2023-02-07 05:47:46 UTC
Changed to UPSTREAM because there is nothing KDE can do. Please read https://www.gnu.org/savannah-checkouts/gnu/gettext/manual/gettext.html#Locale-Environment-Variables
Comment 7 wazhai 2023-02-07 06:23:20 UTC
Fonts can be configured in the font KCM and some like Noto support several variants as evidenced by the old KCM Region "hack" that works fine and doesn't result in non-English text anywhere

Isn't there something KDE can do through the font config like the following? Not sure if this is distro-specific

https://zhmail.com/2019/06/17/ubuntu-18-04-prefer-chinese-fonts-to-japanese-ones/
https://askubuntu.com/questions/901486/%E9%97%A8-looks-weird-on-my-system-default-font
https://askubuntu.com/questions/1268788/how-to-correctly-display-chinese-when-lang-en-us-utf-8
Comment 8 hanyoung 2023-02-07 06:29:55 UTC
(In reply to wazhai from comment #7)
> Fonts can be configured in the font KCM and some like Noto support several
> variants as evidenced by the old KCM Region "hack" that works fine and
> doesn't result in non-English text anywhere
> 
> Isn't there something KDE can do through the font config like the following?
> Not sure if this is distro-specific
> 
> https://zhmail.com/2019/06/17/ubuntu-18-04-prefer-chinese-fonts-to-japanese-ones/
> https://askubuntu.com/questions/901486/%E9%97%A8-looks-weird-on-my-system-default-font
> https://askubuntu.com/questions/1268788/how-to-correctly-display-chinese-when-lang-en-us-utf-8

I believe we can. Moved to the font KCM. Note that per the GNU gettext documentation, LANGUAGE takes precedence over LANG, so setting en_GB in LANGUAGE overrides LANG=ja_JP; hence the undefined behavior. Setting the font family is the way to go.
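
For illustration, the precedence described above can be modelled in a few lines of Python. This is only a minimal sketch of the rules in the GNU gettext manual linked in comment 6 (LANGUAGE is a colon-separated priority list that overrides LC_ALL, LC_MESSAGES and LANG for message lookup, and is ignored when the locale is "C"). The function name message_languages is hypothetical, and real gettext has more cases (territory fallbacks, aliases) that are left out here.

```python
def message_languages(env):
    """Return the languages gettext would try for messages, in order.

    Simplified model: `env` is a dict of environment variables.
    """
    # Effective message locale: first non-empty of LC_ALL, LC_MESSAGES, LANG.
    locale = next(
        (env[key] for key in ("LC_ALL", "LC_MESSAGES", "LANG") if env.get(key)),
        "C",
    )
    # LANGUAGE is ignored when the locale is "C": messages stay untranslated.
    if locale == "C":
        return []
    # A non-empty LANGUAGE list takes priority over the locale itself.
    language = env.get("LANGUAGE", "")
    if language:
        return [part for part in language.split(":") if part]
    # Otherwise fall back to the locale, without codeset or modifier.
    return [locale.split(".")[0].split("@")[0]]


# The Kubuntu setup from comment 3: LANGUAGE wins, so messages are English.
print(message_languages({"LANG": "ja_JP.UTF-8", "LANGUAGE": "en_GB"}))
# The Fedora setup from comment 3.
print(message_languages({"LANG": "en_GB", "LANGUAGE": "en_GB:C:zh_CN"}))
```

This only covers translated messages. Which glyph variant gets drawn is decided separately by font selection, which is why the font route looks more promising.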
Comment 9 wazhai 2023-02-07 11:56:29 UTC
Created attachment 156028 [details]
fonts.conf

I have finally managed to set a per-user preference for Simplified Chinese variants on Fedora through fontconfig. I copied the exact config from the above Ubuntu instructions while retaining a "proper" locale with only "LANG=en_GB" and no LANGUAGE set.

It's the file ~/.config/fontconfig/fonts.conf, with content taken from Ubuntu's /etc/fonts/conf.avail/64-language-selector-prefer.conf, where Noto CJK SC and TC were manually moved above JP (file attached here).

The goal would be for Plasma users to be able to configure a CJK preference easily somewhere in the settings GUI.

Is it even possible to implement something generic and distro-agnostic as part of the font KCM that isn't tied to Noto?
Comment 10 Asahi Lina 2023-11-20 09:57:38 UTC
It is! I was just running into this and I found the solution.

I have the opposite problem in Fedora (I want Japanese and I'm getting Chinese), and I tracked it down to having Droid Sans Fallback installed (which uses Chinese glyphs), but then the behavior is inconsistent between Qt and other fontconfig apps. If you don't have that font, it just picks a random CJK font, which completely by chance happens to be the JP one. Fontconfig itself will also do that and ignore Droid Sans Fallback due to a language mismatch (it uses different logic to Qt). So really, the default is totally inconsistent and random-ish when there is no Han language preference.

This is the solution:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd">
<fontconfig>
  <match target="pattern">
    <edit name="lang" mode="append"><string>ja</string></edit>
  </match>
</fontconfig>

What that does is add "ja" to the list of languages being matched, after the existing list (which should be derived from the system primary language list). So that would cause the system to fall back to Japanese fonts. If the Noto Sans CJK fonts are correctly configured as fallbacks for their respective language (they are in Fedora), then that all works, and you don't need to specify specific fonts or anything like that. Of course it's up to distros to configure language-specific fallback correctly, but I think that's something they would have to do to support the normal use case of the primary language being a CJK one, so hopefully they do that properly!

I think KDE should always add all the languages configured in Regional Settings > Language to ~/.fonts.conf or similar like that (including the primary one). Then whatever language the normal mechanisms choose will still take precedence, but then after that fontconfig will try the user-configured languages in Plasma, and then the font fallback order should be correct. So it will work like in Windows/Android.
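
As a rough sketch of that idea (not an actual KCM implementation), the following Python reads the LANGUAGE entry from plasma-localerc content and produces a fontconfig snippet with one "lang" append per configured language. The function names plasma_languages and fontconfig_snippet are hypothetical, and the conversion of locale names such as zh_CN into lowercase hyphenated tags such as zh-cn is an assumption about what fontconfig expects. The sketch only builds a string and leaves the choice of file location open.

```python
import configparser
from xml.sax.saxutils import escape

# Sample input: the relevant part of the Fedora file from comment 3.
FEDORA_LOCALERC = "[Formats]\nLANG=en_GB\n[Translations]\nLANGUAGE=en_GB:C:zh_CN\n"


def plasma_languages(localerc_text):
    """Extract language tags from plasma-localerc content, in priority order."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key case (LANGUAGE, not language)
    parser.read_string(localerc_text)
    raw = parser.get("Translations", "LANGUAGE", fallback="")
    tags = []
    for entry in raw.split(":"):
        if not entry or entry == "C":
            continue  # "C" carries no language information
        # Assumed mapping: drop codeset/modifier, zh_CN -> zh-cn.
        tag = entry.split(".")[0].split("@")[0].replace("_", "-").lower()
        if tag not in tags:
            tags.append(tag)
    return tags


def fontconfig_snippet(tags):
    """Build a fonts.conf document appending each tag to the pattern's lang list."""
    edits = "\n".join(
        f'    <edit name="lang" mode="append"><string>{escape(tag)}</string></edit>'
        for tag in tags
    )
    return (
        '<?xml version="1.0"?>\n'
        '<!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd">\n'
        "<fontconfig>\n"
        '  <match target="pattern">\n'
        f"{edits}\n"
        "  </match>\n"
        "</fontconfig>\n"
    )


print(fontconfig_snippet(plasma_languages(FEDORA_LOCALERC)))
```

Because every edit uses mode="append", languages already present in the pattern keep their priority and the Plasma list only acts as a fallback, which matches the behavior described above.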
Comment 11 Asahi Lina 2023-11-20 09:58:59 UTC
Also an unrelated note for other people who end up here: Firefox uses its own logic for this and you have to edit the `font.cjk_pref_fallback_order` pref to change the CJK fallback order. It's a bit unfortunate that that's a hidden preference...
Comment 12 Kelvie Wong 2024-10-13 21:18:39 UTC
Just ran into this as well. The equivalent for Chinese is the following (change ja to zh):

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd">
<fontconfig>
  <match target="pattern">
    <edit name="lang" mode="append"><string>zh</string></edit>
  </match>
</fontconfig>

(side note, this bugzilla really needs to support better markup, though I suppose this breaks plaintext email workflows)