Created attachment 119136 [details]
document to reproduce the bug

SUMMARY
Document metadata in Russian is unreadable (bad encoding?)

STEPS TO REPRODUCE
1. Open the attached file in calligrawords.
2. Check the window title (see screenshot).
3. Check the document properties (see screenshot).

OBSERVED RESULT
Unreadable text in place of the document title and author.

EXPECTED RESULT
Normal text for the document title and author.

SOFTWARE/OS VERSIONS
Operating System: Fedora 29
KDE Plasma Version: 5.14.5
Qt Version: 5.11.3
KDE Frameworks Version: 5.55.0
Kernel Version: 4.20.4-200.fc29.x86_64
Architecture: 64-bit
Processors: 8 × Intel® Core™ i7-6700HQ CPU @ 2.60GHz
Memory: 15.4 GiB of RAM
Created attachment 119137 [details] screenshot
Created attachment 119138 [details] title of the same document as seen in LibreOffice
A possibly relevant merge request was started @ https://invent.kde.org/office/calligra/-/merge_requests/15
Git commit be82faae699790e8b5d4f68a2e9e2663ff40477e by Pierre Ducroquet.
Committed on 13/02/2021 at 13:56.
Pushed by ducroquet into branch 'master'.

Support more than UTF-8/16 in word metadata import

The meta-data import code was only considering two possible codepages:
UTF-8 and UTF-16. Word documents tend to use local encoding, so while
this behaviour seemed flawless with US/UK documents, completely
different encodings were broken.
Instead of considering only UTF-8 and UTF-16, use QTextCodec and try to
handle as many encodings as possible that way, warning if they are not
found.

See https://bugs.kde.org/show_bug.cgi?id=406014 for example document

M  +13   -8    filters/words/msword-odf/document.cpp

https://invent.kde.org/office/calligra/commit/be82faae699790e8b5d4f68a2e9e2663ff40477e
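
For readers who want to see the idea behind the fix in isolation, here is a rough sketch in Python rather than the actual C++/QTextCodec change. It assumes the metadata carries a numeric Windows code page alongside the raw title bytes; the function names `codec_for_codepage` and `decode_metadata` are hypothetical and do not appear in Calligra.

```python
import codecs
import warnings
from typing import Optional

# Code pages whose Python codec name is not simply "cp<number>".
_SPECIAL_CODEPAGES = {65001: "utf-8", 1200: "utf-16-le", 1201: "utf-16-be"}


def codec_for_codepage(codepage: int) -> Optional[str]:
    """Return a Python codec name for a Windows code page, or None if unknown."""
    name = _SPECIAL_CODEPAGES.get(codepage, f"cp{codepage}")
    try:
        codecs.lookup(name)  # raises LookupError for unknown encodings
    except LookupError:
        return None
    return name


def decode_metadata(raw: bytes, codepage: int) -> str:
    """Decode a metadata string, warning and falling back to UTF-8 (an
    assumption of this sketch) when no codec matches the code page."""
    name = codec_for_codepage(codepage)
    if name is None:
        warnings.warn(f"no codec for code page {codepage}; falling back to UTF-8")
        name = "utf-8"
    return raw.decode(name, errors="replace")


if __name__ == "__main__":
    # Illustrative input: a Russian title stored in Windows-1251.
    title_bytes = "Заголовок".encode("cp1251")
    print(decode_metadata(title_bytes, 1251))              # readable title
    print(title_bytes.decode("utf-8", errors="replace"))   # garbled, as in the report
```

Decoding the same bytes as UTF-8 yields replacement characters, which is roughly the symptom shown in the attached screenshot; looking up the codec from the declared code page avoids it.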
Thank you very much for your report, Alexander!
Thanks for fixing!