Bug 465897

Summary: Potential problem in passing AppData translations to appstream-data
Product: [Applications] Discover
Reporter: Tyson Tan <tysontanx>
Component: discover
Assignee: Plasma Bugs List <plasma-bugs>
Status: ASSIGNED ---
Severity: normal
CC: aleixpol, aspotashev, matthias, nate, sitter
Priority: NOR
Version: unspecified
Target Milestone: ---
Platform: Other
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:
Attachments: Krita in Discover when system locale is set to Simplified Chinese
GIMP in Discover when system locale is set to Simplified Chinese
digiKam in Gnome Software on Fedora 37 with Simplified Chinese locale
GIMP in Gnome Software on Fedora 37 with Simplified Chinese locale
Deepin File Manager in Discover on Archlinux with Simplified Chinese locale
Audacious in Discover on Archlinux with Simplified Chinese locale

Description Tyson Tan 2023-02-17 08:31:56 UTC
First of all, this isn't a case of AppData files not having been translated. They have been translated for years, but the translations never seem to have made it into places like Discover and Gnome Software.

#########

I've tested with: 
1) Latest versions of ArchLinux and Ubuntu;
2) Simplified Chinese locale;
3) Discover and Gnome Software;

For example, in this environment, applications like Krita, digiKam, and Kdenlive do not show up in Discover with Chinese descriptions. But they have been 100% translated by me since 2018, and their AppData translations are properly displayed on https://apps.kde.org/

https://apps.kde.org/zh-cn/krita/
https://apps.kde.org/zh-cn/digikam/
https://apps.kde.org/zh-cn/kdenlive/

I don't think this is likely to be a bug in Discover, or mis-packaging of appstream-data, because some KDE applications do have Chinese AppData translations, but they are very old. Most of their data came from a time before 2018. Other less active languages like Japanese also seem to have the same problem. Both ArchLinux and Ubuntu have the same problem.

It seems as if the AppData translations have stopped being passed to places like appstream-data, sometime before 2018. 

#########

I must admit that I do not really know how things work for appstream, but in this article:
https://blogs.gnome.org/uraeus/2022/06/10/how-to-get-your-application-to-show-up-in-gnome-software/

There is a paragraph:
"An optional gettext or QT translation domain which allows the AppStream generator to collect statistics on shipped application translations."

I wonder, is it possible for this thing to be mis-configured?

This issue has been plaguing us for years now, affecting the discoverability of apps in non-English languages. I have exhausted every lead at this point. Help would be greatly appreciated.
Comment 1 Harald Sitter 2023-02-17 12:51:20 UTC
Feels like a bug in libappstream to be honest. I also fail to get zh_CN using appstreamcli.

Moving bug to discover for inspection.
Comment 2 Tyson Tan 2023-02-18 04:25:16 UTC
*** Bug 437740 has been marked as a duplicate of this bug. ***
Comment 3 Matthias Klumpp 2023-02-19 14:32:05 UTC
What exactly is the issue here? Chinese *translations* of description text not showing up, or translation *statistics* not showing up?

For the latter, looking at Krita's file @ https://invent.kde.org/graphics/krita/-/blob/master/krita/org.kde.krita.appdata.xml it is simply missing a `translation` tag - that one is essential to read translation statistics, so add one as described in https://www.freedesktop.org/software/appstream/docs/chap-Metadata.html#tag-translation and localization statistics should show up.

As for Chinese translations, I do see them in https://appstream.debian.org/sid/main/metainfo/krita.html, so they are shipped to clients, and I can also dump that data with libappstream, so it should be available - I don't know however if Chinese characters are properly tokenized for fulltext search (which should not affect display though). I didn't set my locale to Chinese, but as far as I can see from some quick tests, this should be working...
Comment 4 Tyson Tan 2023-02-19 18:20:47 UTC
The translations of the description of apps (AppData entries) are not showing in Discover. The system's locale has been set to zh_CN.UTF-8 (Simplified Chinese).

I don't know what "statistics" means.

Note that every KDE application has the same problem. Even the ones that still show translations in Discover have very old translations. Only Gnome applications seem to have updated translations in Discover.

Is it possible that our appdata template became malformed at some point before 2018?
Comment 5 Tyson Tan 2023-02-19 18:24:48 UTC
Created attachment 156505 [details]
Krita in Discover when system locale is set to Simplified Chinese
Comment 6 Tyson Tan 2023-02-19 18:25:19 UTC
Created attachment 156506 [details]
GIMP in Discover when system locale is set to Simplified Chinese
Comment 7 Tyson Tan 2023-02-19 18:34:29 UTC
At least for Archlinux, I think Discover gets its AppData from this file?
https://archlinux.org/packages/extra/any/archlinux-appstream-data/

Source at:
https://github.com/archlinux/svntogit-packages/tree/packages/archlinux-appstream-data/trunk
Comment 8 Matthias Klumpp 2023-02-19 18:45:06 UTC
Ah, you are using Arch! Then this is likely an Arch Linux bug and they simply need to regenerate their data.
Looking at the data for Krita, I see:
```xml
  <description xml:lang="zh-CN">
    <p>Krita 是一款功能齐全的数字绘画工作室软件。</p>
    <p>Krita 既适合起草,也适合上色细化。您可以轻松使用 Krita 从头到尾完成一副精美的画作。</p>
    <p>Krita 是绘制概念美术、漫画、纹理和贴图接景的理想工具。它支持多种色彩空间,如 8 位、16 位整数以及 16 位、32 位浮点通道的 RGB 和 CMYK 颜色模型。</p>
    <p>Krita 具有功能强大的笔刷引擎、种类繁多的滤镜以及便于操作的交互设计。您可以在 Krita 中高效自如地发挥创意。</p>
  </description>
  <description xml:lang="zh-TW">
    <p>Krita 是全功能的數位藝術工作室。</p>
    <p>它是素描和繪畫的完美選擇,並提供了一個從零開始建立數位繪畫檔的端到端解決方案。</p>
    <p>Krita 是創造概念藝術、漫畫、彩現紋理和場景繪畫的絕佳選擇。Krita 在 8 位元和 16 位元整數色版,以及 16 位元和 32 位元浮點色板中支援 RGB 和 CMYK 等多種色彩空間。</p>
    <p>使用先進的筆刷引擎、驚人的濾鏡和許多方便的功能來開心地繪畫,讓 Krita 擁有巨大的生產力。</p>
  </description>
```

In the newest upload from 2023-01-15 of that package. Do you have that version? Translations are ignored by appstream if fewer than three paragraphs are translated; zh-* has more, so translations should show up.
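
To check a given metainfo file against that rule, the paragraphs per locale can be counted with a few lines of Python. This is only a rough sketch: the threshold of three comes from the sentence above and has not been checked against AppStream's source, and `SAMPLE`, `paragraphs_per_locale` and `sparse_locales` are made-up names for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Threshold taken from the comment above, not from AppStream's source.
MIN_PARAGRAPHS = 3

# Hypothetical, shortened metainfo fragment used only for this example.
SAMPLE = """<component>
  <description>
    <p>First.</p><p>Second.</p><p>Third.</p>
  </description>
  <description xml:lang="zh-CN">
    <p>甲</p><p>乙</p><p>丙</p><p>丁</p>
  </description>
  <description xml:lang="ja">
    <p>一</p>
  </description>
</component>"""


def paragraphs_per_locale(xml_text):
    """Return {locale: number of <p> elements}; untranslated text is keyed "C"."""
    counts = {}
    for desc in ET.fromstring(xml_text).iter("description"):
        lang = desc.get(XML_LANG, "C")
        counts[lang] = counts.get(lang, 0) + len(desc.findall("p"))
    return counts


def sparse_locales(xml_text, minimum=MIN_PARAGRAPHS):
    """List locales whose description has fewer than `minimum` paragraphs."""
    counts = paragraphs_per_locale(xml_text)
    return sorted(lang for lang, n in counts.items() if n < minimum)


print(paragraphs_per_locale(SAMPLE))
print(sparse_locales(SAMPLE))
```

With the sample above, only `ja` falls below the threshold; the `zh-CN` block has four paragraphs, like the Krita data quoted earlier.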

The statistics stuff is an AppStream feature that lets projects expose how much localization exists for any locale, so software centers can display a "translated into your language" badge and users can filter for apps that are available in their language, not just English ones.
But I'm not even sure if Discover supports that yet.
Comment 9 Tyson Tan 2023-02-20 00:46:18 UTC
No, this is not an ArchLinux-specific problem. Ubuntu has the same problem.

I wasn't able to test Fedora because it doesn't contain Krita in its default repos, and it could not finish fetching the community repo from China.
Comment 10 Tyson Tan 2023-02-20 00:51:58 UTC
And yes, I'm using that version from Comment 7 and Comment 8. So the data is there, but the translations won't show up, which means there has got to be a bug here.
Comment 11 Tyson Tan 2023-02-20 02:00:08 UTC
Created attachment 156519 [details]
digiKam in Gnome Software on Fedora 37 with Simplified Chinese locale
Comment 12 Tyson Tan 2023-02-20 02:00:35 UTC
Created attachment 156520 [details]
GIMP in Gnome Software on Fedora 37 with Simplified Chinese locale
Comment 13 Tyson Tan 2023-02-20 02:09:48 UTC
I've managed to get Gnome Software refreshed on Fedora. The situation is largely the same, with one small difference. The screenshots can be viewed in Comment 11 and Comment 12.

digiKam in this case has one line translated, which is the "content" string from the desktop entry:
digikam._desktop_.pot
#: core/app/main/org.kde.digikam.desktop:120
EN: Manage your photographs like a professional with the power of open source
CN: 自由开源的专业照片管理程序

The translation of this line is up to date; I translated it around 2021. So some of the new translations did get through (which was reflected in your Comment 8). However, the translations for the whole description are still missing.

Note that this is not just a KDE application issue. Most non-Gnome applications, like the ones made by the Deepin project, also have only this "content" line translated. I can confirm that on Ubuntu and Archlinux.
Comment 14 Tyson Tan 2023-02-20 02:12:51 UTC
Created attachment 156521 [details]
Deepin File Manager in Discover on Archlinux with Simplified Chinese locale
Comment 15 Tyson Tan 2023-02-20 02:13:33 UTC
Created attachment 156522 [details]
Audacious in Discover on Archlinux with Simplified Chinese locale
Comment 16 Tyson Tan 2023-02-20 02:20:51 UTC
Now in Comment 14 and Comment 15 you can see how Deepin File Manager and Audacious are both missing translations.

What they have in common is that they all use Qt.

Audacious used to be a GTK2 application. It was later ported to Qt5. I clearly remember it had full translations in Gnome Software back when I was using Gnome. But now it has none on Arch (!), one line on Fedora (!).

Deepin is a native Chinese project, so they have always fully translated their stuff since the very beginning, but they use a very old version of Qt, which could be the reason why they still have 1 line translated.

Could this actually be a bug/misconfiguration in how Qt handles AppData translations or how it marks them?
Comment 17 Tyson Tan 2023-02-20 02:32:18 UTC
I've tested more. 

Under the Japanese locale, the situation is similar to Comment 16.

Under the French and Spanish locales, Audacious is missing translations while digiKam has everything.

I'm pretty sure these 2 applications are popular enough to have everything always fully translated at all times. But something is preventing them from showing up.
Comment 18 Aleix Pol 2023-03-31 18:05:01 UTC
Thank you very much for such a thorough bug report!

I think I've found the problem. The xml:lang field is getting generated as "zh-CN" instead of "zh_CN". I'll see if I can find where this is generated.
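
A rough illustration of why that spelling matters, assuming the client looks descriptions up by exact locale string. This is a hypothetical lookup written for this example, not libappstream's or Discover's actual code, and `normalize_locale_tag` and `pick_description` are made-up names.

```python
def normalize_locale_tag(tag):
    """Turn a hyphenated tag such as "zh-CN" into the POSIX form "zh_CN"."""
    return tag.replace("-", "_")


def pick_description(descriptions, system_locale, normalize=False):
    """Pick a description for `system_locale` by exact key match.

    `descriptions` maps xml:lang values to text, with "C" for the
    untranslated text. Hypothetical logic, for illustration only.
    """
    # "zh_CN.UTF-8" -> "zh_CN": drop the encoding and any @modifier.
    wanted = system_locale.split(".")[0].split("@")[0]
    # Try the full locale first, then the bare language code.
    candidates = [wanted, wanted.split("_")[0]]
    if normalize:
        table = {normalize_locale_tag(k): v for k, v in descriptions.items()}
    else:
        table = dict(descriptions)
    for candidate in candidates:
        if candidate in table:
            return table[candidate]
    return table.get("C")


descriptions = {"C": "English description", "zh-CN": "中文描述"}
print(pick_description(descriptions, "zh_CN.UTF-8"))
print(pick_description(descriptions, "zh_CN.UTF-8", normalize=True))
```

Under that assumption, the first call falls back to the English text because neither "zh_CN" nor "zh" matches the key "zh-CN", while the second call finds the Chinese text once the keys are normalized. Languages without a region suffix (for example "fr") would match either way, which would fit the French and Spanish results in Comment 17.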
Comment 19 Bug Janitor Service 2023-03-31 18:19:36 UTC
A possibly relevant merge request was started @ https://invent.kde.org/sysadmin/l10n-scripty/-/merge_requests/61
Comment 20 Tyson Tan 2023-04-01 07:15:21 UTC
(In reply to Aleix Pol from comment #18)
> Thank you very much for such a thorough bug report!
> 
> I think I've found the problem. The xml:lang field is getting generated as
> "zh-CN" instead of "zh_CN". I'll see if I can find where this is generated.

Thank you Aleix! :D