Version: akonadi-server-1.1.1/akonadi-4.2.1 (using KDE 4.2.1)
Compiler: gcc version 4.1.2 (Gentoo 4.1.2 p1.3)
Installed from: Gentoo Packages
The (KDE-local) Akonadi MySQL tables use the Latin_1 charset.
This limits the use of KDE-PIM/Akonadi to Western European languages.
Please use UTF-8 with utf8_general_ci collation instead.
KDE is great!
Where do you see actual bugs caused by this and how do you trigger them?
We should have unit tests that cover non-latin1 strings for all fields that can contain them, and those tests pass here.
I have not found an actual bug, but I noticed this while trying to figure out the structure of the database. It is not a bug but a design error.
Adding unit tests for Latin_1 will not solve the problem. If you try to store, e.g., Asian characters in a Latin_1 database, you will get garbled characters anyway.
Latin_1 covers far fewer characters than Unicode, so you will inevitably lose information when converting Unicode text to Latin_1. Using Latin_1 in a backend makes KDE unusable for all non-Western-European users.
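The information loss described here can be illustrated with a small Python sketch. This is not Akonadi or MySQL code: the helper `to_latin1` and the sample strings are made up for illustration, and what a real MySQL server does depends on its connection and column charset settings.

```python
# Illustrative only: force text into Latin-1 and see what survives.
def to_latin1(text: str) -> bytes:
    """Encode text as Latin-1, replacing unrepresentable characters with '?'."""
    return text.encode("latin-1", errors="replace")

western = "Grüße"    # every character exists in Latin-1
japanese = "日本語"    # no character exists in Latin-1

# Round-trip both strings through Latin-1.
western_roundtrip = to_latin1(western).decode("latin-1")
japanese_roundtrip = to_latin1(japanese).decode("latin-1")

print(western_roundtrip)   # the Western European text survives unchanged
print(japanese_roundtrip)  # the Japanese text is reduced to question marks
```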
Unicode was created to solve charset problems. Windoze and Java use UTF-16, which needs two 16-bit units (a surrogate pair) for every character outside the Basic Multilingual Plane, including some rarer Asian characters. UTF-32 covers every Unicode character (even hieroglyphs, mathematical and musical notation) at a fixed width, but needs four bytes for each character. Because of that, UNIX systems are switching to UTF-8, which is a variable-length encoding of the same Unicode character set. The IETF requires all new protocols to support UTF-8 (RFC 2277).
Because of that, KDE should require the global use of UTF-8 in its programming guidelines.
See http://en.wikipedia.org/wiki/UTF-8 for further information.
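The size trade-off between the encodings mentioned above can be checked with a short Python sketch. The helper name `encoded_sizes` and the sample characters are assumptions chosen for illustration only.

```python
# Illustrative only: bytes needed per character in each Unicode encoding.
def encoded_sizes(ch: str) -> tuple:
    """Return the byte length of ch in (UTF-8, UTF-16, UTF-32), without a BOM."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),
        len(ch.encode("utf-32-le")),
    )

# ASCII letter, accented Latin letter, CJK ideograph, musical symbol (U+1D11E).
for ch in ["A", "é", "日", "𝄞"]:
    utf8, utf16, utf32 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8={utf8} UTF-16={utf16} UTF-32={utf32} bytes")
```

ASCII text costs one byte per character in UTF-8, while UTF-32 always costs four.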
I know what Unicode and UTF-8 are. KDE actually mandates the use of Unicode for user-visible strings. And if you check the database schema closely, you will see that columns containing such data (such as CollectionTable.name) do in fact use UTF-8 encoding.
The remaining columns, however, contain internal data which cannot contain non-ASCII characters (e.g. MIME types). Using the (slightly slower) UTF-8 encoding is thus not needed there; Latin1 does the job just fine.
So, unless there are real bugs, I would not want to change anything there, at the risk of doing more harm than good.
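The point about ASCII-only columns can be checked with a small Python sketch. It is not taken from the Akonadi sources: the helper `same_bytes_in_both` is hypothetical and the MIME type is just a sample value.

```python
# Illustrative only: ASCII-only data has identical bytes in Latin-1 and UTF-8.
def same_bytes_in_both(text: str) -> bool:
    """True if text encodes to the same byte sequence in Latin-1 and UTF-8."""
    try:
        return text.encode("latin-1") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains characters that Latin-1 cannot represent at all.
        return False

print(same_bytes_in_both("message/rfc822"))  # ASCII-only MIME type
print(same_bytes_in_both("Grüße"))           # Latin-1 text, encoded differently in UTF-8
print(same_bytes_in_both("日本語"))           # not representable in Latin-1
```

Only the first call returns True, so the column charset matters only where non-ASCII text can actually appear.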