Bug 438253 - Deleting a large number of tags takes much more than 24h
Summary: Deleting a large number of tags takes much more than 24h
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Tags-Manager (show other bugs)
Version: 7.2.0
Platform: Microsoft Windows Microsoft Windows
: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-06-08 14:01 UTC by kai
Modified: 2023-05-01 16:07 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In: 8.1.0


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description kai 2021-06-08 14:01:42 UTC
SUMMARY
I want to delete ca. 70,000 tags from the database (not from the images)
ca. 90,000 photos are in the database.
ca. 80,000 tags are in the database table tags.
ca. 80,000 tags are in the database table tagstree.

STEPS TO REPRODUCE
1. Start Tag Manager and select ca. 70,000 tags from the list of tags
2. right click on the selected tags and click "Delete"

OBSERVED RESULT
after 24 hours the process is still running and the application is not responding.
In the database table I see about 4 deletions per minute

EXPECTED RESULT
the deletion process should finish within a minute


SOFTWARE/OS VERSIONS
Windows: 10

ADDITIONAL INFORMATION
The database engine is MySQL
The database as well as the photos are hosted on a fast performing SSD.
The CPU has 6 cores (12 with HT)
RAM is 32 GB

My GUESS: 
The database uses a trigger on deletion in table tags, which removes the related entries from table tagtree. I am no DB expert, but this trigger seems heavy if executed for each deletion of one of the 70,000 to be deleted tags.

My Proposal:
Can you please send me a DELETE/JOIN statement that I could execute on the  directly against the database instead of using Tag Manager.
Comment 1 caulier.gilles 2021-06-08 15:29:25 UTC
I just tested a large amount of Tags deletion under Linux, and it fast as expected (few seconds)

I suspect something running in background under Windows which reduce file access to digiKam database as a virus protection.

Gilles Caulier
Comment 2 Maik Qualmann 2021-06-09 07:54:43 UTC
The delete tags trigger is really extremely slow here with my test tags table with almost 40,000 entries. Even under MySQL Command I cannot delete 1000 tags, I canceled it. I can optimize the trigger, now deleting 10,000 tags takes about 20 minutes. Not really nice either, but at least a lot better. But you also have to consider that deleting such a large number of tags rarely or never occurs with users.

Our old method of managing a TagsTree was also extremely slow, as the entire Tags Table columns "lft" and "rgt" always had to be recalculated.

It's amazing to me that MySQL is so slow here. 

Maik
Comment 3 kai 2021-06-09 18:22:03 UTC
Thanks for the extremely fast responses!
Deactivating AntiVirus did not help.
You mentioned that MySQL behaves slow here. Would you recommend to go back to SQLite?
The reason why I want to delete these tags is because digiKam takes ca. 4 minutes to start (just like ACDSee) and I assumed that the number of tags might be the reason. 
(Btw.: I have that high number of (useless) tags because GeoSetter writes the geo-position (lat and lon) into the ITPC as keywords. However, I do not need them there.)
If all fails: Is it possible you send me a SQL statement that deletes all tags "LIKE 'geo:%' from your DB that I can run without corrupting your DB/program logic?
Comment 4 Maik Qualmann 2021-06-09 20:51:05 UTC
I am working on a database update, another problem is that we did not create an index for the TagsTree.

Maik
Comment 5 Maik Qualmann 2021-06-15 18:25:42 UTC
Git commit abe6d6e3e04c916a3dbb202706a2f1f5c61a6f4b by Maik Qualmann.
Committed on 15/06/2021 at 18:24.
Pushed by mqualmann into branch 'master'.

add index to the TagsTree table for MySQL

M  +43   -1    core/data/database/dbconfig.xml.cmake.in
M  +28   -11   core/libs/database/coredb/coredbschemaupdater.cpp

https://invent.kde.org/graphics/digikam/commit/abe6d6e3e04c916a3dbb202706a2f1f5c61a6f4b
Comment 6 caulier.gilles 2023-05-01 07:24:55 UTC
@kai

digiKam 8.0.0 is out. This entry still valid with this release ?

Best regards

Gilles Caulier
Comment 7 kai 2023-05-01 11:15:12 UTC
Hello Gilles,

Sorry, I cannot test this, because
- I do not use digiKam anymore
- I removed the ITCP tags directly from the image files.

Kindly close the issue

Best regards
Kai

-----Original Message-----
From: bugzilla_noreply@kde.org <bugzilla_noreply@kde.org> 
Sent: Montag, 1. Mai 2023 09:25
To: kai.hackenberg@gmx.net
Subject: [digikam] [Bug 438253] Deleting a large number of tags takes much more than 24h

https://bugs.kde.org/show_bug.cgi?id=438253

--- Comment #6 from caulier.gilles@gmail.com --- @kai

digiKam 8.0.0 is out. This entry still valid with this release ?

Best regards

Gilles Caulier

--
You are receiving this mail because:
You reported the bug.=
Comment 8 caulier.gilles 2023-05-01 16:07:16 UTC
Not reproducible with 8.0.0. closed.