Bug 479181

Summary: Wish list: exact-duplicate scan based on 'uniqueHash'
Product: [Applications] digikam Reporter: Otto Hirr <ottohirr>
Component: Database-Similarity Assignee: Digikam Developers <digikam-bugs-null>
Status: REPORTED ---    
Severity: wishlist CC: caulier.gilles
Priority: NOR Keywords: efficiency-and-performance, usability
Version First Reported In: 8.2.0   
Target Milestone: ---   
Platform: Other   
OS: Other   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description Otto Hirr 2023-12-30 04:15:38 UTC
SUMMARY
***
The duplicate scanner should have an option to find 'exact' duplicates, which could be really fast because it would not need fingerprints.
***

The doc, https://docs.digikam.org/en/main_window/similarity_view.html#find-duplicates, states the following:

"... but it will take a long time too as it has to compare every image with any other image."

The 'Images' table has a column, 'uniqueHash', which is the hash of the bits in the file.

Files containing the same bits *should* have the same hashes.

A simple query to find duplicate values in this column would yield a list of uniqueHash values that occur more than once, which can then be used with other table(s) to retrieve the containing directory and file names.

This query would be fast compared to the image comparison of similarities.
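To illustrate the kind of query meant here, below is a minimal sketch in Python using sqlite3. It is not digiKam code. The schema is a simplified stand-in that I made up for the example: I am assuming that 'Images' has 'album', 'name' and 'uniqueHash' columns and that an 'Albums' table provides a 'relativePath'; the real tables have more columns and may differ. The function name and the demo data are hypothetical.

```python
import sqlite3

# Simplified stand-in schema (an assumption, not the real digiKam schema).
SCHEMA = """
CREATE TABLE Albums (id INTEGER PRIMARY KEY, relativePath TEXT);
CREATE TABLE Images (id INTEGER PRIMARY KEY, album INTEGER,
                     name TEXT, uniqueHash TEXT);
"""

# Inner query: hashes that occur more than once.
# Outer query: resolve each such hash to its directory and file name.
DUPLICATES_QUERY = """
SELECT i.uniqueHash, a.relativePath, i.name
FROM Images AS i
JOIN Albums AS a ON a.id = i.album
WHERE i.uniqueHash IN (
    SELECT uniqueHash FROM Images
    WHERE uniqueHash IS NOT NULL AND uniqueHash <> ''
    GROUP BY uniqueHash
    HAVING COUNT(*) > 1
)
ORDER BY i.uniqueHash, a.relativePath, i.name
"""


def find_exact_duplicates(conn):
    """Return {uniqueHash: [(relativePath, name), ...]} for repeated hashes."""
    groups = {}
    for file_hash, path, name in conn.execute(DUPLICATES_QUERY):
        groups.setdefault(file_hash, []).append((path, name))
    return groups


def demo_connection():
    """Build an in-memory database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Albums VALUES (?, ?)",
                     [(1, "/2021"), (2, "/backup")])
    conn.executemany("INSERT INTO Images VALUES (?, ?, ?, ?)", [
        (1, 1, "a.jpg", "h1"),
        (2, 2, "a_copy.jpg", "h1"),   # same hash as a.jpg
        (3, 1, "b.jpg", "h2"),        # unique hash, not reported
        (4, 2, "c.jpg", None),        # no hash stored, ignored
    ])
    return conn


if __name__ == "__main__":
    for file_hash, files in find_exact_duplicates(demo_connection()).items():
        print(file_hash, files)
```

No image data is read at all; the work is a single GROUP BY over one column, so an index on 'uniqueHash' (if there is one) would help further. Rows with an empty or missing hash are skipped so that unhashed files are not grouped together by mistake.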

Selecting this option would then disable the other similarity options on the page.

I suspect that a large number of users simply want to find exact duplicates when loading files that have been stored in various places, maybe under various prior methods the user had implemented for managing their files.

The exact duplicates could be presented in the same manner as in the existing duplicate finder.

It's simply a modification to how the similar or exact-duplicate files are found.

Best regards,

.. Otto
Otto Hirr