Bug 353555 - Database backend is much slower than standard kmy format
Summary: Database backend is much slower than standard kmy format
Status: REPORTED
Alias: None
Product: kmymoney
Classification: Applications
Component: database (show other bugs)
Version: 4.7.2
Platform: Kubuntu Linux
: NOR major
Target Milestone: ---
Assignee: KMyMoney Devel Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-05 08:57 UTC by Martin Tlustos
Modified: 2021-03-28 13:15 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Tlustos 2015-10-05 08:57:34 UTC
I would need the sqlite format for interoperability with other programs (extract address info for Mailings etc.), but operation with sqlite is much slower than with the native kmy format (which in turn is not readable by other programs).

Reproducible: Always
Comment 1 Christian David 2015-10-07 10:38:42 UTC
I can confirm this. But the changes to fix this issue are too complex for the available developer time. The kmy format is a gzipped xml file. It is not hard to parse it, maybe you should give it a try. 

Btw: KMyMoney 4.7 has some improvements regarding the database backend. It should be bit faster (only a bit!).
Comment 2 flywire 2015-11-06 21:59:11 UTC
This is disappointing because readability by other programs through a database is one of the big benefits of this program; a major reason for me to leave a propriety software package. Much easier in a database than xml.

For example, customised processing of tagged transactions externally to integrate the finances to your project and a whole lot of other data such as mapped output. Now that's value add!!
Comment 3 Justin Zobel 2021-03-10 00:16:24 UTC
Thank you for the bug report.

As this report hasn't seen any changes in 5 years or more, we ask if you can please confirm that the issue still persists.

If this bug is no longer persisting or relevant please change the status to resolved.
Comment 4 Jack 2021-03-11 19:41:34 UTC
I've been using Perl's XML handling modules to produce some specific reports, and although there was a learning curve, I'm not sure the ultimate code was any harder to create than it would have been using SQL.  However, I wonder why the OP found sqlite so much slower than accessing the .kmy or .xml file.  There is the time course of having to pull data from the internal structures for every sql call, compared to parsing once into your own internal representation, but I can't imagine it's slow enough to really notice.

I'm going to set to NEEDSINFO in case the OP can confirm that the slowness issue is still a problem for him.  Otherwise, this will eventually get closed as WORKSFORME.
Comment 5 jesse 2021-03-12 17:55:27 UTC
I am a DBA by profession but have never looked at the KMY database set up. If the queries are doing push-down optimization, I would expect the database to respond quickly. Indexes should be examined to make sure they are set up for the right column per the queries. The hard part is that when you give 100 people access to query however they want, they will come up with 100 different way to query. You can't really optimize for everyone. You have to optimize for a few usage patterns. 

Perhaps if there is documentation that specifies which columns are indexed, users could tune their queries to leverage those indexes. If using a database without indexes, usually they have other objects to help optimize, like zonemaps for Netezza or synopsis tables for DB2. 

It makes sense to me that if the code parsed the xml file and loaded it into memory in its own structure, that will be much faster to access than any database. However, depending on how data sets are being joined or searched, the database call may be faster still since it is optimized to do those joins. 

Is there documentation for KMY on how the database is deploy and what indexes it creates? 

thanks, 

JV
Comment 6 Jack 2021-03-12 18:21:41 UTC
First, i don't know why, but I had interpreted the original post as 
claiming that accessing the sqlite database outside of KMM to be slow.  
Normal operations should be the same speed with any back end, possibly 
differing in the time to open or save a data file (see below.).  If the 
OP was actually seeing KMM run much slower with sqlite compared to 
xml/kmy, then that should be tracked down as to the cause, since I don't 
think it should be the case.

KMM does not really use any sql during it's normal operations, no matter 
the back end data store.  Whether xml or sql, it reads the data into 
memory, and only writes it out when you do an explicit file/save (or an 
automatic save.)   In all cases, it wipes the data and writes it all 
from scratch.  (I assume someone will correct me if that's not really 
correct.)  I also don't think KMM uses any indexes, and it uses the same 
sql code for sqlite, sqlcipher, mysql (which works for mariadb), and 
postgresql, with some specific tweaks where the differences matter.

As far as I know, the only real documentation of the database layout is 
in the code for creating the tables.

Note that there have been multiple discussions over the years of whether 
KMM should move to exclusively use an sql back end and which one, and 
whether it's use of sql should be "live" instead of only on save.  
However, I don't think any changes in those areas are likely in the near 
to mid term.

Jack
Comment 7 jesse 2021-03-13 16:19:07 UTC
I interpreted the question as attempting to use the sqlite database through external programs. 

Interesting; Is there a reason why it uses the database as a file save(recreates the tables and loads the data each time)? Was the intention to allow users to query the database when KMY is not in use?

It is likely that the database does not have any built-in indexes. https://database.guide/list-indexes-in-sqlite-database/
Can the OP or someone with a sqlite database see if that commands brings back a list of indexes? 

I guess the next step would be to figure out if the KMY code recreates the entire database tables or only truncate and reload the tables. If it recreates the tables, and there no indexes, each time it would need to recreate indexes too. If it only truncates the tables, then the users can create the indexes they need for their custom external applications. Statistics on the database may be needed to help the optimizer keep the database response times fast. 

jv
Comment 8 Bug Janitor Service 2021-03-28 04:33:33 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 9 jesse 2021-03-28 13:15:15 UTC
I still have not had time to try to figure out if the database has indexes or not. I don't use the application with the database option. 

From what Martin said, the real need is based on using external programs to query the sqlite database. Without a way to know all the use cases, it may be best to either include documentation in the manual as to how and when a user would want to add indexes for their querying, or to add a few indexes to help our 80% of the use cases. 

However, I think the decision should be made as to whether saving to the sqlite format was intended for this purpose or not. 

If not, then perhaps there are no further changes required.