Created attachment 59372: a test vcard (properly folded)

Version: 4.5 (using KDE 4.5.5)
OS: Linux

If contact information is saved as a vCard 3.0 (RFC 2425 / 2426) file with a multibyte encoding (UTF-8 in my case), lines can be folded in the middle of a multibyte character: some bytes of the character are left on one line and the rest are put on the next line. This makes it difficult to read the saved vCard with other libraries (python-vobject, for example).

I believe this is a bug because RFC 2425 states the following in "5.8.1. Line delimiting and folding":

A logical line MAY be continued on the next physical line anywhere between two characters by inserting a CRLF immediately

So splitting a line in the middle of a character is not compatible with the RFC 2425 requirements.

The cause of the bug is in kabc/vcardparser/vcardparser.cpp, function createVCard, lines 291-301 (git revision 5ca796151e8fbf0e8b84574c9640a77af49c2c50). Folding is done at the byte level, after the Unicode characters have been encoded as bytes.

Reproducible: Always

Steps to Reproduce:
Set the locale to a UTF-8 based one.
Create a contact in kaddressbook with a long note (70+ characters) in a language whose characters are encoded as multiple bytes.
Export the contact as vCard 3.0.
Try to load the vCard in some other program.

Actual Results:
A vCard exported by kaddressbook (look for the broken characters):

BEGIN:VCARD
FN:test
N:test;;;;
NOTE:Длинный комментарий на русском языке\, чтобы в результате получилась строка бо�
 �ее 70 символов.
UID:MeYEG83HLw
VERSION:3.0
END:VCARD

Expected Results:
A properly folded vCard:

BEGIN:VCARD
FN:test
N:test;;;;
NOTE:Длинный комментарий на русском яз
 ыке\, чтобы в результате получилась с
 трока более 70 символов.
UID:MeYEG83HLw
VERSION:3.0
END:VCARD
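To make the failure mode easier to check, here is a minimal Python sketch of byte-level folding and of a test for broken physical lines. It is not the kabc code: the function names `fold_bytes_naive` and `broken_lines` and the width of 75 bytes are assumptions made for illustration only.

```python
def fold_bytes_naive(line, width=75):
    """Cut the UTF-8 encoding of `line` into fixed-size byte pieces.

    Hypothetical stand-in for byte-level folding; it ignores character
    boundaries, so a multibyte character may be cut in two.
    """
    data = line.encode("utf-8")
    return [data[i:i + width] for i in range(0, len(data), width)]


def broken_lines(pieces):
    """Return the indexes of pieces that are not valid UTF-8 on their own."""
    bad = []
    for index, piece in enumerate(pieces):
        try:
            piece.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(index)
    return bad


if __name__ == "__main__":
    # 50 two-byte characters give 100 bytes, so a cut at byte 75 is mid-character.
    pieces = fold_bytes_naive("я" * 50)
    print(broken_lines(pieces))
```

A reader that decodes each physical line separately fails on such pieces, while concatenating the raw bytes before decoding still recovers the text. That would explain why some programs cope with the exported file and others do not.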
*** Bug 320196 has been marked as a duplicate of this bug. ***
Git commit 63bbded8f55f2c539e0ec5942b362cd26fc77a46 by Martin Koller.
Committed on 22/03/2014 at 10:59.
Pushed by mkoller into branch 'KDE/4.13'.

avoid splitting UTF-8 encoded character in the middle of encoded bytes

The file format spec in RFC 6350 says:
http://tools.ietf.org/html/rfc6350#section-3.2
Line Delimiting and Folding
"Multi-octet characters MUST remain contiguous."

This patch avoids splitting an UTF-8 encoded character in the middle
of the encoded bytes

FIXED-IN: 4.13
REVIEW: 116933

M +2 -0 kabc/vcardparser/testroundtrip.qrc
A +14 -0 kabc/vcardparser/tests/vcard9.vcf
A +14 -0 kabc/vcardparser/tests/vcard9.vcf.ref
M +41 -3 kabc/vcardparser/vcardparser.cpp

http://commits.kde.org/kdepimlibs/63bbded8f55f2c539e0ec5942b362cd26fc77a46
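For readers who want to see the idea behind "Multi-octet characters MUST remain contiguous", the following Python sketch folds on byte length but moves each split point back to a character boundary. It is only an illustration and does not reproduce the committed C++ patch; the name `fold_utf8` and the default width of 75 bytes are assumptions.

```python
def fold_utf8(line, width=75):
    """Fold `line` so that no physical line exceeds `width` bytes and no
    UTF-8 character is split. Illustrative sketch, not the kabc code."""
    if width < 5:
        # A UTF-8 character takes up to 4 bytes, plus the leading space.
        raise ValueError("width must be at least 5")
    data = line.encode("utf-8")
    pieces = []
    start = 0
    limit = width
    while len(data) - start > limit:
        end = start + limit
        # Bytes of the form 10xxxxxx are continuation bytes; back up until
        # the split point lands on the first byte of a character.
        while (data[end] & 0xC0) == 0x80:
            end -= 1
        pieces.append(data[start:end])
        start = end
        limit = width - 1  # continuation lines start with one space
    pieces.append(data[start:])
    return b"\r\n ".join(pieces)


if __name__ == "__main__":
    folded = fold_utf8("я" * 50)
    for physical_line in folded.split(b"\r\n"):
        print(len(physical_line), physical_line.decode("utf-8"))
```

Checking the top two bits of a byte is enough here because UTF-8 marks every non-initial byte of a character with the bit pattern 10xxxxxx, so the search for a boundary never has to back up more than three bytes.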