Created attachment 130909
Bug demonstration

SUMMARY
Whenever a flag emoji is used with a text that is inferred to be right to left, it breaks, showing another flag or a flag with a question mark. This problem is observed in both editable text (as in a text editor for example) and non-editable text (as in the title of a Firefox window for example).

STEPS TO REPRODUCE
1. Choose any flag emoji (this one for example 🇯🇵).
2. Open any application where you can enter text (Kate for example) and paste the flag emoji.
3. Enter text using a right-to-left language (Arabic or Hebrew for example) either before or after the flag emoji.

OBSERVED RESULT
The flag emoji breaks, showing a flag with a question mark in this case.

EXPECTED RESULT
The flag emoji shouldn't change just because the text around it was inferred to be right to left.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Kubuntu 20.04
KDE Plasma Version: 5.18.5
KDE Frameworks Version: 5.68.0
Qt Version: 5.12.8

ADDITIONAL INFORMATION
1- Note that this problem only occurs if the text with the flag emoji is inferred to be right to left. This inference depends on the first character typed other than the emoji. If this character is right to left, the text is inferred to be right to left, and the emoji breaks. If the first character typed other than the emoji is left to right, the text is inferred to be left to right, and the emoji won't break even if subsequently right-to-left characters were used.
2- This problem is peculiar to flag emojis as far as I can tell. All other emojis that I've encountered exhibit no problem with right-to-left text.
3- This problem is universal to KDE as far as I can tell, but is not universal to all of the system (this problem isn't present in Firefox for example).
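To make point 1 above concrete, here is a rough Python model of the direction inference as it appears from the outside. It is only a sketch of the observed behavior, not Qt's or Kate's actual code. The function names are made up for illustration, and skipping the flag's own characters is an assumption chosen to match what I see (the direction seems to depend on the first character other than the emoji).

```python
import unicodedata


def is_regional_indicator(ch: str) -> bool:
    """True for the code points that flag emojis are built from (U+1F1E6..U+1F1FF)."""
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def inferred_direction(text: str) -> str:
    """Guess the paragraph direction from the first strongly directional character.

    ASSUMPTION: flag characters are skipped, to mirror the behavior described
    in the report. This is not a confirmed detail of any real implementation.
    """
    for ch in text:
        if is_regional_indicator(ch):
            continue
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-like or Arabic-like letter
            return "rtl"
        if bidi_class == "L":  # Latin-like letter
            return "ltr"
    return "ltr"  # assumed default when no strong character is present


japan_flag = "\U0001F1EF\U0001F1F5"
print(inferred_direction(japan_flag + " مرحبا"))        # flag breaks in this case
print(inferred_direction(japan_flag + " hello مرحبا"))  # flag stays intact in this case
```

In this model the first line is classified as right to left and the second as left to right, which matches the two cases described in point 1.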
> This problem is universal to KDE

Even in simple QLineEdit widgets? Then it's a Qt bug.

Reassigning to Kate developer because the Kate text view may have its own layout algorithms.
(In reply to Christoph Feck from comment #1)
> > This problem is universal to KDE
>
> Even in simple QLineEdit widgets? Then it's a Qt bug.
>
> Reassigning to Kate developer because the Kate text view may have it's own
> layouting algorithms.

Yes. For example, the same behavior is observed in the search box of the application launcher (see the new screenshots attached).

I figured out the problem at the conceptual level, but I honestly lack the requisite knowledge to point at the piece of code that causes it.

Flag emojis are not a single Unicode character like other emojis. For example, Egypt's flag emoji is made up of two Unicode characters (regional indicator symbols): 🇪 and 🇬. When these two characters are typed in sequence without a space, the font that handles emojis interprets them as one unit and shows the corresponding flag. So 🇪🇬 is in fact 🇪 followed by a 🇬 without a space.

The problem is with the way KDE (or Qt as you pointed out) interprets LTR text when used with RTL text. When LTR text is used with RTL text, LTR text should still be read from left to right. This is the way it's interpreted in every piece of software I've ever used (Firefox for example), and the way it's actually read in real life (for Arabic at least). For example, if I encounter the sentence "cat مرحباً", I will read it as "hello cat", not as "hello tac". Qt, however, will interpret it as "hello tac".

So, for example, if Qt tries to read "🇪🇬 مرحباً", it will interpret the emoji flag characters RTL, and will read 🇬 then 🇪, prompting the font to show Georgia's flag (🇬🇪) instead of Egypt's. Egypt is lucky to have another flag substituted for its flag. Japan, on the other hand, will have its flag substituted by a question mark flag (depending on the font), because there's no country with code PJ (JP: 🇯🇵, PJ: 🇵🇯).

This is also why single character emojis aren't affected by this problem. A single character is read the same way LTR or RTL.
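The explanation above can be sketched in a few lines of Python. This only illustrates how a flag is composed from two regional indicator characters and what reversing them yields. It does not reproduce Qt's rendering, and the helper names are made up for the example.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code: str) -> str:
    """Build a flag emoji from a two-letter country code such as "EG"."""
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A"))
        for letter in country_code.upper()
    )


def country_code(flag_emoji: str) -> str:
    """Recover the two-letter code from a flag emoji."""
    return "".join(
        chr(ord(ch) - REGIONAL_INDICATOR_A + ord("A")) for ch in flag_emoji
    )


def reversed_flag(flag_emoji: str) -> str:
    """Swap the two characters, as a reversed run would present them to the font."""
    return flag_emoji[::-1]


for code in ("EG", "JP"):
    broken = reversed_flag(flag(code))
    print(code, "->", country_code(broken), broken)
```

Reversing EG gives GE, which happens to be a valid code (Georgia), while reversing JP gives PJ, which is not assigned to any country, so the font has nothing to show for it.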
Created attachment 131395
Flag emoji working in QLineEdit widget
Created attachment 131396
Flag emoji not working in QLineEdit widget
(In reply to I3rav3 from comment #2)
> (In reply to Christoph Feck from comment #1)
> (JP: 🇯🇵, PJ:🇵🇯).
> This is also WHY single character emojis aren't affected by this problem.

Sorry. Correcting typos.
I have tested this by copying the RTL text in Comment 2 and trying to paste the Australian flag in the document. An entirely different flag appeared.
(In reply to Justin Zobel from comment #6)
> I have tested this by copying the RTL text in Comment 2 and trying to paste
> the Australian flag in the document. An entirely different flag appeared.

That must have been Ukraine's flag (AU when read RTL is UA, Ukraine's flag code).

I have since moved to GNOME (not because of this problem, but because of performance issues and better support), and have faced the exact same problem. The problem is definitely due to something upstream, and any helpful advice regarding where to report it would be really appreciated.
I've presented this issue to the Hebrew Linux community:
https://www.facebook.com/groups/linux.il/permalink/3459130597506838/

We came to the conclusion that gedit, Kate and mousepad in different versions are all affected by this bug.

The test case is:
ישראל 🇮🇱
ליכטנשטיין 🇱🇮
The first line reads Israel, followed by the Israeli flag.
The second line reads Liechtenstein, followed by the corresponding flag.

There have been some suspicions regarding Pango, but I couldn't verify that.

Pretty funny case.
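When reproducing this, it may help to check what is actually stored in the text, independently of how it is drawn. The following Python sketch lists the country codes of the flags in a string in their stored (logical) order. The helper name `flag_codes` is made up for the example, and the flags are written as escapes so that the source does not depend on how an editor renders them.

```python
import re

REGIONAL_INDICATOR_A = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A
FLAG_PAIR = re.compile("[\U0001F1E6-\U0001F1FF]{2}")  # two regional indicators


def flag_codes(text: str) -> list:
    """Return the two-letter codes of all flags in text, in stored order."""
    return [
        "".join(chr(ord(ch) - REGIONAL_INDICATOR_A + ord("A")) for ch in pair)
        for pair in FLAG_PAIR.findall(text)
    ]


israel_line = "ישראל " + "\U0001F1EE\U0001F1F1"
liechtenstein_line = "ליכטנשטיין " + "\U0001F1F1\U0001F1EE"

print(flag_codes(israel_line))
print(flag_codes(liechtenstein_line))
```

The stored codes are IL and LI, which are exact reverses of each other. That is what makes this pair a telling test: if an application swaps the two characters of each flag, each line ends up showing the other country's flag even though the underlying text is correct.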
(In reply to Yaron Shahrabani from comment #8)
> I've presented this issue to the Hebrew Linux community:
> https://www.facebook.com/groups/linux.il/permalink/3459130597506838/
>
> We came to the conclusion that gedit, Kate and mousepad in different
> versions are all affected by this bug.
>
> The test case is:
> ישראל 🇮🇱
> ליכטנשטיין 🇱🇮
> The first one reads Israel and the Israeli flag afterwards
> The second one reads Liechtenstein with the corresponding flag
>
> There has been some suspicions regarding Pango but I couldn't verify that.
>
> Pretty funny case.

What's even funnier is that when I visited your Facebook group, the first post I saw was someone apologizing for writing in English because of Facebook's atrocious RTL support. So yeah, I wouldn't hold this problem against whatever open-source library is causing it, when much larger companies with much deeper pockets still can't figure RTL out.

It's been ages since I've used Twitter, but I remember its RTL support being exceptionally good. It even manages not to mangle RTL text when the interface itself is LTR, something that Google doesn't even seem to be aware it's doing.

I would still like this problem to be fixed, and would appreciate any pointers regarding reporting it somewhere where someone can actually figure it out.

It also seems worth repeating that this problem isn't universal to Linux. It doesn't occur, for example, in Firefox, Thunderbird, PyCharm or Sublime Text, from what I've been able to test.
(In reply to I3rav3 from comment #9)
> pointers regarding reporting [it] somewhere where someone can actually figure it [out].
This needs to be reported to Qt; nothing can be done about it in Kate (unless we start doing text rendering from scratch).