278411 – Add collation sequence for unicode characters to the SQLite driver and make it the default


Status:	CLOSED FIXED

Alias:	None

Product:	KEXI
Classification:	Applications
Component:	KexiDB
Version:	2.4.x
Platform:	Compiled Sources Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Jarosław Staniek

URL:
Keywords:

Depends on:
Blocks:	289293 289395

Reported:	2011-07-24 19:41 UTC by Dimitrios T Tanis
Modified:	2012-08-11 11:45 UTC
CC List:	0 users

See Also:
Latest Commit:
Version Fixed In:	2.4 beta6 (Calligra 2.4 beta6)
Sentry Crash Report:

Description Dimitrios T Tanis 2011-07-24 19:41:49 UTC

Version:           3.0.x (Calligra 3.0.x) (using KDE 4.6.0) 
OS:                Linux

Currently there is no way to control how columns are sorted. Every column's collation defaults to BINARY, which gives unexpected results when, e.g., sorting accented vowels.
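
To illustrate the problem described above, here is a minimal sketch using Python's standard sqlite3 module rather than KexiDB itself. The table name, the helper function and the sample names are hypothetical. With no COLLATE clause the column falls back to BINARY, which compares the encoded bytes, so an accented capital sorts after every unaccented ASCII letter.

```python
import sqlite3

def sorted_names_binary(names):
    """Return names as ordered by SQLite's default BINARY collation."""
    conn = sqlite3.connect(":memory:")
    # No COLLATE clause on the column, so SQLite uses BINARY.
    conn.execute("CREATE TABLE people (name TEXT)")
    conn.executemany("INSERT INTO people VALUES (?)", [(n,) for n in names])
    rows = conn.execute("SELECT name FROM people ORDER BY name").fetchall()
    conn.close()
    return [row[0] for row in rows]

result = sorted_names_binary(["Zoe", "Éric", "Adam", "Eve"])
print(result)  # ['Adam', 'Eve', 'Zoe', 'Éric']
```

"Éric" lands last because the UTF-8 encoding of "É" starts with byte 0xC3, which is greater than the byte for "Z".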

Reproducible: Didn't try



Expected Results:  
The user should be able to assign a different collating sequence to each column, as SQLite supports, to control how columns are sorted.

Maybe later on, also support creating custom collation sequences as per: http://www.sqlite.org/c3ref/create_collation.html
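
As a rough sketch of what a custom collation sequence could look like, the example below registers a comparison function through Python's sqlite3.Connection.create_collation, which wraps the C API linked above. The collation name "simple_unicode", the helper functions and the sample data are assumptions made for illustration. The comparison strips accents and folds case; it is only a crude approximation and not a real Unicode collation such as ICU provides.

```python
import sqlite3
import unicodedata

def _sort_key(text):
    # Decompose characters, drop combining marks (accents), then fold case.
    # The original string is kept as a tie-breaker for a stable total order.
    decomposed = unicodedata.normalize("NFKD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), text)

def simple_unicode_collation(a, b):
    """Return -1, 0 or 1, as SQLite expects from a collation callback."""
    key_a, key_b = _sort_key(a), _sort_key(b)
    return (key_a > key_b) - (key_a < key_b)

conn = sqlite3.connect(":memory:")
# "simple_unicode" is a hypothetical name chosen for this sketch.
conn.create_collation("simple_unicode", simple_unicode_collation)
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [(n,) for n in ["Zoe", "Éric", "Adam", "Eve"]],
)
ordered = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM people ORDER BY name COLLATE simple_unicode"
    )
]
print(ordered)  # ['Adam', 'Éric', 'Eve', 'Zoe']
```

A collation registered this way exists only on the connection that registered it, so every connection that sorts with it has to register it again.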

Comment 1 Jarosław Staniek 2011-07-25 19:21:00 UTC

Last I checked, sorting of text columns (the most interesting case here) defaulted to Unicode collation.

From http://www.sqlite.org/lang_select.html#orderby this applies:

"3. Otherwise, if the ORDER BY expression is a column or an alias of an expression that is a column, then the default collation sequence for the column is used."

Anyway, I am accepting the wish for increased flexibility.

Comment 2 Dimitrios T Tanis 2011-07-25 19:27:18 UTC

I agree, but there is no way to define default collation sequences for columns, so each column's collation defaults to BINARY.

http://www.sqlite.org/datatype3.html
Paragraph 6.0 and 6.1 "Every column of every table has an associated collating function. If no collating function is explicitly defined, then the collating function defaults to BINARY"
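
The two rules quoted in this thread (ORDER BY uses the column's default collation, and that default is BINARY unless declared otherwise) can be checked with a small sketch using Python's standard sqlite3 module. The table and column names are hypothetical. NOCASE is one of SQLite's built-in collations; it folds only ASCII letters, so it does not solve the accented-character case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "plain" has no COLLATE clause (BINARY); "folded" declares NOCASE.
conn.execute("CREATE TABLE fruit (plain TEXT, folded TEXT COLLATE NOCASE)")
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?)",
    [(w, w) for w in ["cherry", "apple", "Banana"]],
)

# Neither ORDER BY names a collation, so each uses its column's default.
by_plain = [r[0] for r in conn.execute("SELECT plain FROM fruit ORDER BY plain")]
by_folded = [r[0] for r in conn.execute("SELECT folded FROM fruit ORDER BY folded")]

print(by_plain)   # ['Banana', 'apple', 'cherry']
print(by_folded)  # ['apple', 'Banana', 'cherry']
```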

Am I missing something here?

Comment 3 Jarosław Staniek 2011-07-25 19:36:33 UTC

Thanks. I need to test this then and, if possible, plug in a well-defined Unicode collation function.

Changed Version to 2.4.x.

Comment 4 Jarosław Staniek 2011-12-15 20:20:34 UTC

This task is one of the more laborious ones, since research was needed and packaging aspects also come into play.

For example, here's the full follow-up thread on sqlite packaging:

http://lists.kde.org/?t=132388156300005&r=1&w=1

I am planning to use ICU (http://icu-project.org) for high-quality collation via the sqlite ICU extension (it already ships with the sqlite source code). Other areas would benefit too, e.g. a case-insensitive, localized LIKE operator. These are very good features for desktop databases, which usually expose the data to the user. I think many advanced server databases do not have or do not enable this kind of feature.

Comment 5 Jarosław Staniek 2011-12-18 17:21:26 UTC

Split out the 'wish' part of this bug to https://bugs.kde.org/show_bug.cgi?id=289293 and changed the topic to 'Add collation sequence for unicode characters and make it the default'.

Comment 6 Jarosław Staniek 2011-12-19 23:14:08 UTC

Git commit 8f6f75d5b9ebf048f196ea5f7320b99c56b2d9cd by Jaroslaw Staniek.
Committed on 20/12/2011 at 00:06.
Pushed by staniek into branch 'master'.

KexiDB: Use ICU for high quality collating in columns using sqlite ext.

BUG:278411

DIGEST: (KexiDB) Use ICU (http://icu-project.org) for high quality collating in unicode text columns using sqlite extension.

M  +19   -5    CMakeLists.txt
M  +66   -12   cmake/modules/FindCalligraSqlite.cmake
A  +82   -0    cmake/modules/FindICU.cmake
M  +12   -2    kexi/kexidb/driver.h
M  +2    -1    kexi/kexidb/drivers/sqlite/CMakeLists.txt
A  +10   -0    kexi/kexidb/drivers/sqlite/icu/CMakeLists.txt
A  +170  -0    kexi/kexidb/drivers/sqlite/icu/README.txt
A  +501  -0    kexi/kexidb/drivers/sqlite/icu/icu.c     [License: Public Domain]
A  +27   -0    kexi/kexidb/drivers/sqlite/icu/sqliteicu.h     [License: Public Domain]
M  +56   -6    kexi/kexidb/drivers/sqlite/sqliteconnection.cpp
M  +4    -0    kexi/kexidb/drivers/sqlite/sqliteconnection.h
M  +7    -0    kexi/kexidb/drivers/sqlite/sqliteconnection_p.h
M  +9    -2    kexi/kexidb/drivers/sqlite/sqlitedriver.cpp
M  +10   -1    kexi/kexidb/drivers/sqlite/sqlitedriver.h
M  +9    -3    kexi/kexidb/queryschema.cpp
M  +2    -2    kexi/kexidb/queryschema.h

http://commits.kde.org/calligra/8f6f75d5b9ebf048f196ea5f7320b99c56b2d9cd

Comment 7 Jarosław Staniek 2012-06-25 22:35:20 UTC

Git commit a3331e48e5ffdf7a1c93adcc80349d47fc67e2c9 by Jaroslaw Staniek.
Committed on 28/05/2012 at 00:02.
Pushed by staniek into branch 'master'.

Add KexiDB fix: Use ICU for better collating in text columns

*Add KexiDB fix: Use ICU (http://icu-project.org) for high quality collating in unicode text columns using sqlite extension
**(2011-12-20 calligra master commit 8f6f75d5b9ebf048)

M  +22   -8    Drivers/CMakeLists.txt
M  +2    -0    Drivers/sqlite/CMakeLists.txt
M  +71   -10   Drivers/sqlite/SqliteConnection.cpp
M  +7    -1    Drivers/sqlite/SqliteConnection.h
M  +11   -5    Drivers/sqlite/SqliteConnection_p.h
M  +12   -4    Drivers/sqlite/SqliteDriver.cpp
M  +10   -1    Drivers/sqlite/SqliteDriver.h
A  +11   -0    Drivers/sqlite/icu/CMakeLists.txt
A  +170  -0    Drivers/sqlite/icu/README.txt
A  +503  -0    Drivers/sqlite/icu/icu.c     [License: Public Domain]
A  +31   -0    Drivers/sqlite/icu/sqliteicu.h     [License: Public Domain]
M  +11   -1    Predicate/Driver.h
M  +5    -9    Predicate/DriverManager.cpp
M  +11   -4    Predicate/QuerySchema.cpp
M  +1    -1    Predicate/QuerySchema.h
M  +12   -0    Predicate/Utils.cpp
M  +6    -0    Predicate/Utils.h
A  +82   -0    cmake/modules/FindICU.cmake
M  +67   -13   cmake/modules/FindSqlite.cmake

http://commits.kde.org/predicate/a3331e48e5ffdf7a1c93adcc80349d47fc67e2c9