Bug 305440

Summary: CWOP: Wrong encoding conversion for titles in MS Office documents
Product: calligrawords
Reporter: Yuri Chornoivan <yurchor>
Component: doc
Assignee: Matus Uzak <matus.uzak>
Severity: normal
CC: matus.uzak
Priority: NOR    
Version: Git   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
URL: http://dl.dropbox.com/u/55247264/test.doc
Latest Commit:
Version Fixed In:
Attachments: Test file (created with Libreoffice Writer 3.5)

Description Yuri Chornoivan 2012-08-19 15:50:58 UTC
The conversion applied to Cyrillic titles in MS Office documents makes them unreadable (the result looks like UTF-8 bytes decoded as Windows-1252).
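
The kind of garbling described above can be reproduced outside Words. The following is a minimal sketch, assuming the title bytes are UTF-8 and are wrongly decoded as Windows-1252; the helper names `misdecode` and `repair` are hypothetical and are not taken from the Calligra sources.

```python
# Illustration only: reproduces the suspected mis-decoding, not Calligra code.

def misdecode(title: str, wrong_codec: str = "cp1252") -> str:
    """Encode the title as UTF-8, then decode those bytes with the wrong codec."""
    return title.encode("utf-8").decode(wrong_codec)


def repair(garbled: str, wrong_codec: str = "cp1252") -> str:
    """Undo the mis-decoding by reversing the two steps."""
    return garbled.encode(wrong_codec).decode("utf-8")


title = "Зразок"
garbled = misdecode(title)
print(garbled)          # unreadable Latin-looking text, two characters per Cyrillic letter
print(repair(garbled))  # the original title again
```

Each Cyrillic letter occupies two bytes in UTF-8, so the garbled title is twice as long as the original, which matches the symptom of a title full of "Ð" and "Ñ" characters.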

LibreOffice shows the titles of such documents (File->Properties...) in a readable way.

Reproducible: Always

Steps to Reproduce:
1. Open a file with a Cyrillic title.
2. Look at the Words window title (it shows something like ÐÑазок).
3. Choose File->Document information. "Title:" on the "General" page shows the wrongly encoded title of the document.

Actual Results:  
Wrongly encoded title (ÐÑазок)

Expected Results:  
Readable title (Зразок)

Comment 1 Yuri Chornoivan 2012-08-19 15:52:12 UTC
Created attachment 73300 [details]
Test file (created with Libreoffice Writer 3.5)

Comment 2 Matus Uzak 2013-03-09 15:08:22 UTC
Git commit e2a8cdd21d09e37856e3375e79b812368320c96a by Matus Uzak.
Committed on 09/03/2013 at 12:42.
Pushed by uzak into branch 'master'.

Handle UTF-8/UTF-16 encoding when processing Document Information.

M  +2    -1    filters/libmso/msoleps.h
M  +15   -1    filters/words/msword-odf/document.cpp
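
For readers unfamiliar with the issue, the idea named in the commit message can be sketched as follows. This is not the code from the commit: it assumes the summary information stream declares its code page, with 65001 meaning UTF-8 and 1200 meaning UTF-16 (little-endian), and the function `decode_property_string` and its fallback behaviour are hypothetical.

```python
# Illustration only: a hypothetical decoder keyed on the declared code page.

CP_UTF8 = 65001        # Windows code page identifier for UTF-8
CP_WINUNICODE = 1200   # Windows code page identifier for UTF-16LE


def decode_property_string(raw: bytes, codepage: int) -> str:
    """Decode a document-information string according to its declared code page."""
    if codepage == CP_UTF8:
        text = raw.decode("utf-8")
    elif codepage == CP_WINUNICODE:
        text = raw.decode("utf-16-le")
    else:
        # Assumption: other values name a legacy Windows code page such as 1251 or 1252.
        text = raw.decode(f"cp{codepage}")
    # Stored strings are typically NUL-terminated; drop the terminator.
    return text.rstrip("\x00")


print(decode_property_string("Зразок".encode("utf-8") + b"\x00", CP_UTF8))
```

Treating every string as a single-byte legacy encoding regardless of the declared code page would produce exactly the unreadable title reported in this bug.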