The registered mark is converted by the manual page slave to a Chinese ideograph.

Reproducible: Always

Steps to Reproduce:
1. { man selinux; }
2. { kioclient cat man:/selinux; }
3.

Actual Results:
1. Type Enforcement®
2. Type Enforcement速

Expected Results:
2. Type Enforcement®

LC_CTYPE="pl_PL.UTF-8"
See bug 141340. In brief: man page files do not declare which encoding they are written in, so we try to auto-detect it, and the detection fails in this case (the guessed encoding here is "EUC-JP" ...). Not sure how we could fix this.
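To illustrate why a wrong "EUC-JP" guess produces exactly the character from the report, here is a minimal Python sketch. It is not code from kio_man; the helper name `misdecode` is made up for illustration. It encodes the text as UTF-8 (which is what the man page file contains) and then decodes the same bytes as EUC-JP.

```python
def misdecode(text: str, wrong_encoding: str = "euc_jp") -> str:
    """Encode text as UTF-8, then decode the bytes with a wrongly guessed codec.

    Hypothetical helper for illustration only; it mimics what happens when
    an encoding prober picks EUC-JP for a UTF-8 man page.
    """
    raw = text.encode("utf-8")          # '®' becomes the two bytes C2 AE
    return raw.decode(wrong_encoding)   # EUC-JP reads C2 AE as one kanji


if __name__ == "__main__":
    print("®".encode("utf-8").hex())        # raw UTF-8 bytes of the registered mark
    print(misdecode("Type Enforcement®"))   # ASCII survives, the mark does not
```

The ASCII part of the string is identical in both encodings, which is presumably why only the registered mark is visibly damaged in the output of step 2.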
*** Bug 329966 has been marked as a duplicate of this bug. ***
*** Bug 337479 has been marked as a duplicate of this bug. ***
Git commit 3208955e66c48b07281271933c2be5e49328720f by Martin Koller.
Committed on 08/01/2015 at 18:18.
Pushed by mkoller into branch 'Applications/14.12'.

Do not use KEncodingProber - it gives false results; Try dirname or UTF8

The auto-detection of the man page file content with KEncodingProber was not successful - there are some bug reports showing it does not work reliable - often giving EUC-JP or gb18030 as encoding, which is wrong.
I now try to find the encoding inside the man page file (according manconv) or from the name of the directory in which the file resides.
However, on my openSuse system, neither the definition inside nor the directory name tells me it's UTF-8, but all pages are in UTF-8.
Therefore I now use UTF-8 as default, which can be overridden with the env-var MAN_ICONV_INPUT_CHARSET

FIXED-IN: 14.12.1

M +9 -18 kioslave/man/kio_man.cpp
M +92 -20 kioslave/man/man2html.cpp
M +6 -0 kioslave/man/man2html.h

http://commits.kde.org/kde-runtime/3208955e66c48b07281271933c2be5e49328720f
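For readers who want the lookup order from the commit message spelled out, here is a rough Python sketch. It is not the actual C++ code in man2html.cpp. The function name, the regular expression, the parameter names and the exact precedence (in-file declaration, then directory name, then MAN_ICONV_INPUT_CHARSET, then UTF-8) are assumptions based only on the commit text and on the declaration format described in manconv(1), for example `'\" -*- coding: UTF-8 -*-` on the first line.

```python
import os
import re

# Assumed pattern for a manconv-style declaration on the first line,
# e.g.  '\" -*- coding: ISO-8859-2 -*-   (not taken from the kio_man source).
CODING_RE = re.compile(rb"""^['.]\\"[^\n]*coding:\s*([^\s]+)""", re.IGNORECASE)


def guess_man_encoding(first_line: bytes, dir_name: str, env=None) -> str:
    """Return an encoding name for a man page (hypothetical helper).

    first_line: raw first line of the man page source
    dir_name:   name of a man directory, e.g. "pl.ISO8859-2" (assumed form)
    env:        mapping used instead of os.environ, handy for testing
    """
    env = os.environ if env is None else env

    # 1. Encoding declared inside the file.
    match = CODING_RE.match(first_line)
    if match:
        return match.group(1).decode("ascii", "replace")

    # 2. Encoding suffix in the directory name, after the first dot.
    _, dot, suffix = dir_name.partition(".")
    if dot and suffix:
        return suffix

    # 3. Environment override, else the UTF-8 default named in the commit.
    return env.get("MAN_ICONV_INPUT_CHARSET") or "UTF-8"


if __name__ == "__main__":
    print(guess_man_encoding(b".TH SELINUX 8\n", "man8", {}))
```

With a page like selinux(8) that carries no declaration and lives in a directory without an encoding suffix, this sketch falls through to UTF-8, which matches the situation the commit message describes on openSUSE.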
Still present in kio-extras 17.04.2
I cannot reproduce this issue with kio-extras 18.04.2: when I execute, e.g., kioclient5 cat man:/usr/share/man/es/man1/ark.1.gz and open the output in a browser, all special characters are displayed correctly. Can anybody confirm that this is no longer an issue?
I can reproduce it (kio_man version 18.8.0) with man:/selinux(8)
Git commit 1c45ddbe94c3fdfedf35f801ddfeeab6d17f2cc4 by Martin Koller.
Committed on 24/08/2018 at 15:19.
Pushed by mkoller into branch 'master'.

Fwd port: Do not use KEncodingProber - it gives false results

forward port of 3208955e66c48b07281271933c2be5e49328720f from old kde-runtime repo

Original commit text:
Do not use KEncodingProber - it gives false results; Try dirname or UTF8

The auto-detection of the man page file content with KEncodingProber was not successful - there are some bug reports showing it does not work reliable - often giving EUC-JP or gb18030 as encoding, which is wrong.
I now try to find the encoding inside the man page file (according manconv) or from the name of the directory in which the file resides.
However, on my openSuse system, neither the definition inside nor the directory name tells me it's UTF-8, but all pages are in UTF-8.
Therefore I now use UTF-8 as default, which can be overridden with the env-var MAN_ICONV_INPUT_CHARSET

M +9 -18 man/kio_man.cpp
M +89 -25 man/man2html.cpp
M +6 -0 man/man2html.h
M +1 -1 man/tests/CMakeLists.txt

https://commits.kde.org/kio-extras/1c45ddbe94c3fdfedf35f801ddfeeab6d17f2cc4
Note: the encoding prober in kcodecs-5.55.0 gives me UTF-8 at 99% as expected when fed with the file hunspell.1 (unzipped). The unzipped file does not start with a BOM.