Version: 1.2.0 (using KDE 4.4.2)
OS: Linux
Hi, I think the find duplicates feature in digiKam is great, but going through my 20,000 photos, almost all of which have duplicates, and deleting the duplicates by hand is very slow. It would be great to have a batch function that deletes the duplicates, including an option to keep the largest image of each set of duplicates.
Reproducible: Didn't try
I would like to request the same feature. Just add an "auto-delete duplicates" button. For a more comfortable version, one could add criteria for which one to keep:
1) the largest one, as suggested above (highest resolution)
2) the latest one
3) the one from a preferred folder (hierarchy) list
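The three keep criteria listed above could be sketched roughly as follows. This is only an illustration, not digiKam code: the `Candidate` structure, the rule names, and the functions `folder_rank`, `pick_keeper` and `to_delete` are all hypothetical, and "latest" is assumed to mean the most recent modification time.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    """One image in a duplicate group (hypothetical structure, not a digiKam type)."""
    path: str
    width: int
    height: int
    mtime: float  # modification time in seconds since the epoch (assumed meaning of "latest")


def folder_rank(path: str, preferred: Sequence[str]) -> int:
    """Position of the first preferred folder that contains path; len(preferred) if none does."""
    parents = PurePosixPath(path).parents
    for index, folder in enumerate(preferred):
        if PurePosixPath(folder) in parents:
            return index
    return len(preferred)


def pick_keeper(group: Sequence[Candidate], rule: str = "largest",
                preferred: Sequence[str] = ()) -> Candidate:
    """Return the single image to keep from one duplicate group."""
    if rule == "largest":   # criterion 1: highest resolution
        return max(group, key=lambda c: c.width * c.height)
    if rule == "latest":    # criterion 2: most recently modified
        return max(group, key=lambda c: c.mtime)
    if rule == "folder":    # criterion 3: earliest entry in the preferred folder list
        return min(group, key=lambda c: folder_rank(c.path, preferred))
    raise ValueError(f"unknown rule: {rule!r}")


def to_delete(group: Sequence[Candidate], rule: str = "largest",
              preferred: Sequence[str] = ()) -> List[Candidate]:
    """Everything in the group except the keeper."""
    keeper = pick_keeper(group, rule, preferred)
    return [c for c in group if c is not keeper]
```

On a tie, `max` and `min` return the first matching entry, so the order of the group decides. A real implementation would probably want an explicit tie-breaker.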
I am using this bug as a container for bugs with an equivalent wish.
*** Bug 372378 has been marked as a duplicate of this bug. ***
From https://bugs.kde.org/show_bug.cgi?id=372378 :
piotergmoter@hotmail.com 2016-11-12 09:24:38 UTC
digiKam has a powerful search function which finds duplicates in albums. A nice feature to have would be to *do* something with the images after the search. The most obvious action would be to delete the duplicates, but the list could go on to different scenarios. This is a real example, which occurred after I imported 1000 photos from my mobile camera that may already have been imported years ago. They have different file names, of course, so from the filesystem's point of view they are different. I would like to clean up my albums in an automated way and wonder what is possible.
P.
----
Wolfgang Scheffner 2016-11-12 18:14:05 UTC
Seems a bit difficult to me. How can an automated process decide which one of two identical images to process (delete or whatever)? Of course you could set the threshold to 100% and then say it doesn't matter, just process one of them. But 1. your search result gets very small with 100%, and 2. the process would still need a rule to decide, and that rule will most likely not match everybody's needs.
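For the narrower case described above, where the same photos were imported twice under different file names, an external script can group files by content instead of by name. The sketch below is a stand-in technique (SHA-256 over the file bytes), not digiKam's similarity search. It only finds byte-identical copies, so images that were re-encoded or had their metadata rewritten will not match. The function names are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only digests seen more than once are duplicate groups.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Which file of each group to remove is still an open choice, which is exactly the rule question raised in the comment above.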
Similarity detection is great but completely useless for people starting out with digiKam who have *lots* of duplicates. I bet this is a large majority of people, especially if they use PhotoMove to structure their files (it preserves duplicate images). I don't understand why this is up for debate after 8 years. I have 100K images from merging multiple collections together. Thousands of them are 100% duplicates. It is impossible to delete them manually. I don't care what folder they are in, I just want them gone.
> Seems a bit difficult to me
This isn't difficult. Here are two possible solutions. Both would be trivial to implement. The last one would at least allow the issue to be addressed externally.
1. Two new similarity options:
[ ] Use largest image as reference image
[ ] Hide reference images from results
Checking these options would allow a person to select all "Ref. Images" on the left, then select all images on the right, and then press delete. This doesn't work now because the reference image is displayed on the right side and may or may not be the largest (in megapixels) image.
2. Allow the user to export the duplicate list of images to a CSV file with some meta information and the file path. This way I could write a script to delete duplicates myself. I could then choose to remove JPGs instead of CR2 files, or images with smaller megapixel sizes.
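To illustrate option 2: if such an export existed, the external script could be quite short. Everything about the CSV here is an assumption, since no export format is defined in this report. The column names `group_id`, `path`, `width` and `height` are invented, as are the function names. The sketch keeps a CR2 file over a JPG and otherwise keeps the image with the most pixels. It only returns the paths to remove and does not delete anything itself.

```python
import csv
import io
from collections import defaultdict
from pathlib import PurePosixPath

RAW_EXTENSIONS = {".cr2"}  # raw formats to prefer over JPG; extend as needed


def keep_priority(row):
    """Sort key: raw files win first, then the larger pixel count."""
    is_raw = PurePosixPath(row["path"]).suffix.lower() in RAW_EXTENSIONS
    pixels = int(row["width"]) * int(row["height"])
    return (is_raw, pixels)


def files_to_delete(csv_file):
    """Read the (hypothetical) export and return every path except one keeper per group."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["group_id"]].append(row)
    doomed = []
    for rows in groups.values():
        keeper = max(rows, key=keep_priority)
        doomed.extend(r["path"] for r in rows if r is not keeper)
    return doomed


if __name__ == "__main__":
    # Made-up sample data in the assumed column layout.
    sample = io.StringIO(
        "group_id,path,width,height\n"
        "1,/a/x.jpg,4000,3000\n"
        "1,/a/x.cr2,4000,3000\n"
        "2,/b/y.jpg,800,600\n"
        "2,/c/y.jpg,1600,1200\n"
    )
    for path in files_to_delete(sample):
        print("would delete:", path)
```

Printing the list first and deleting in a separate step leaves room to review the result before anything is removed.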
*** Bug 377523 has been marked as a duplicate of this bug. ***
Same need for me. Starting out with digiKam, I have a lot of duplicates coming from:
- too many phone picture dumps
- dumps of pictures saved by chat apps, which are duplicates (with wrong metadata)
My 2 needs are also:
- auto delete
- choose an album as the preferred, higher-priority source
I just found another bug report, https://bugs.kde.org/show_bug.cgi?id=388981 , which is not really a duplicate but is closely related to this request.
I also found an open source project that did the job for me: https://dupeguru.voltaicideas.net/
Hope this is useful.
*** Bug 430975 has been marked as a duplicate of this bug. ***
> Wolfgang Scheffner 2016-11-12 18:14:05 UTC
> Seems a bit difficult to me. How can an automated process decide which one of two identical images to process (delete or whatever)? Of course you could set the threshold to 100% and then say it doesn't matter, just process one of them. But 1. your search result gets very small with 100% and 2. the process would still need a rule to decide and that will most likely not match everybody's needs.
Hi Wolfgang, the dupeguru UI handles this problem by letting the user select which folder is the "master" one.