Bug 465101 - Similarity search range not limited by selected parameters
Summary: Similarity search range not limited by selected parameters
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Searches-Similarity
Version First Reported In: 7.9.0
Platform: Mint (Ubuntu based) Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-01 03:49 UTC by richardames
Modified: 2023-02-03 20:14 UTC

See Also:
Latest Commit:
Version Fixed In: 7.10.0
Sentry Crash Report:


Attachments
Screenshot showing range entered and results not matching. (205.77 KB, image/png)
2023-02-01 03:49 UTC, richardames

Description richardames 2023-02-01 03:49:04 UTC
Created attachment 155838
Screenshot showing range entered and results not matching.

SUMMARY
***
When you select a range for the Similarity search using the boxes, the results are not limited to this range. See attached screenshot.
***


STEPS TO REPRODUCE
1. On a database with known similarities of varying match quality, select a range of 99-100% or even 100-100%
2. Click Find Duplicates

OBSERVED RESULT
See screenshot where selection was 99-100% but results show 0.96 upwards. In fact if you sort the Avg Similarity column my database has values as low as 0.67. 

EXPECTED RESULT
Expected to only see 0.99 or 1.00 results

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
KDE Flatpak runtime 5.15-21.08 (x86_64)
KDE Plasma Version: Can't find this???
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.8 (built against 5.15.8)

ADDITIONAL INFORMATION
Let me know if any other info required.
Comment 1 caulier.gilles 2023-02-01 06:30:12 UTC
If you switch on the debug traces and run digiKam from a console, do you see anything unusual when the similarity results are updated?

See this page for details about debug mode : https://www.digikam.org/contribute/

Gilles Caulier
Comment 2 caulier.gilles 2023-02-01 06:30:36 UTC
Also, which kind of database do you use?
Comment 3 Maik Qualmann 2023-02-01 07:21:11 UTC
The column header is labeled "average" similarity. The value is computed by adding up the similarities of the items relative to the reference image and then dividing by the number of items in the list. So it is not the similarity range value set for the search.

https://invent.kde.org/graphics/digikam/-/blob/master/core/utilities/fuzzysearch/findduplicatesalbumitem.cpp#L148
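
The calculation described above can be sketched roughly as follows. This is a minimal illustration, not the code behind the link: the function name averageSimilarity, the 0.0 to 1.0 value scale, and the handling of an empty list are assumptions made for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not the digiKam function): returns the mean of the
// similarity values of the items in one duplicates group, each measured
// against the group's reference image and assumed to lie in [0.0, 1.0].
double averageSimilarity(const std::vector<double>& similarities)
{
    if (similarities.empty())
    {
        return 0.0; // assumption: an empty group is reported as 0
    }

    double sum = 0.0;

    for (const double s : similarities)
    {
        assert((s >= 0.0) && (s <= 1.0)); // values outside the scale indicate a bug upstream
        sum += s;
    }

    return sum / static_cast<double>(similarities.size());
}
```

With this definition, a group whose items score 0.99 and 1.00 shows an average of about 0.995, and the average of a group can never be lower than its lowest item.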

Maik
Comment 4 richardames 2023-02-01 07:59:08 UTC
(In reply to caulier.gilles from comment #1)
> If you switch on the debug traces and run digiKam from a console, do you
> see anything unusual when the similarity results are updated?
> 
> See this page for details about debug mode :
> https://www.digikam.org/contribute/
> 
> Gilles Caulier

Hi Gilles
I'm a bit of a newbie to Linux and don't normally use the console, so I'm trying to figure this out from the instructions. I'm assuming you mean this bit:-
------
Linux host
Just run digiKam from the terminal command line to capture the text traces generated by the application. Note that you first need to turn on all debug traces from digiKam with the QT_LOGGING_RULES environment variable.
    export QT_LOGGING_RULES="digikam*=true"
    digikam
----------------
The first command runs OK, but the second comes up with this error:-
richard@Linux-Box:~$ export QT_LOGGING_RULES="digikam*=true"
richard@Linux-Box:~$ digikam
Command 'digikam' not found, but can be installed with:
sudo apt install digikam
richard@Linux-Box:~$ 

As digiKam is very definitely installed, I assume I'm doing something wrong.

A follow-up question for when I get digiKam running: where does it put the text? Is it in a file (and if so, where), or is it just in the console window, so that I need to copy and paste from there?

Thanks
Comment 5 richardames 2023-02-01 07:59:36 UTC
(In reply to caulier.gilles from comment #2)
> Also, which kind of database do you use?

Database is SQLite
Comment 6 richardames 2023-02-01 08:18:56 UTC
(In reply to Maik Qualmann from comment #3)
> The header is labeled "average" similarity. It adds the similarities
> relative to the reference image and then divides by the number of items in
> the list. So it's not the adjusted similarity range value.
> 
> https://invent.kde.org/graphics/digikam/-/blob/master/core/utilities/
> fuzzysearch/findduplicatesalbumitem.cpp#L148
> 
> Maik

Ooookay, I had a look at that link, but I haven't done any programming for years other than writing a bit of SQL, so it would take me a while to find my way round that code.

I'm not sure what you are saying, but I am fairly sure that in previous versions (I used digiKam on Windows for a few years) the results displayed were within the settings entered in the Range criteria.

Also, although my screenshot only shows 2 items for each result, that's because I initially sorted by the Items column descending and then by the Avg Similarity column, also descending, and had already dealt with most of the results with a large number of items. There were results with 10 items where the Avg Similarity was 1.00, whereas you seem to imply that the more items there are, the lower the number.

In case you are wondering why I have so many duplicates, I stuffed up and imported a whole load of images from some different sources that I didn't realise were actually the same. Now I'm cleaning up the mess :-)

I would think that, from an end user's point of view, if I enter criteria in a search I expect the results to be within those criteria.

Cheers, Richard
Comment 7 caulier.gilles 2023-02-01 10:10:16 UTC
Hi all,

Please review my last changes in the online documentation section for the similarity search tools:

https://docs.digikam.org/en/main_window/similarity_view.html

Thanks in advance

Gilles
Comment 8 Maik Qualmann 2023-02-01 12:38:21 UTC
I'll debug it more closely. I think I can see incorrect bracketing in the calculation in the Haar interface class, because the average can't be below the lowest value either.
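
To illustrate why a value outside the range of its inputs points to a calculation error, here is a small sketch. It is purely hypothetical and not taken from haariface.cpp: the function names and the specific bracketing slip are invented for the example, and the real expression may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Intended calculation: the mean of two similarity values.
double meanOfTwo(double a, double b)
{
    return (a + b) / 2.0;
}

// Hypothetical bracketing slip: without the parentheses only b is divided.
double meanOfTwoMisbracketed(double a, double b)
{
    return a + b / 2.0;
}

// Sanity check: a true mean never falls outside [min, max] of its inputs.
bool meanIsPlausible(double mean, const std::vector<double>& values)
{
    assert(!values.empty()); // precondition: there is something to compare against

    const auto range = std::minmax_element(values.begin(), values.end());

    return (mean >= *range.first) && (mean <= *range.second);
}
```

For the inputs 0.99 and 1.00 the correct mean is about 0.995 and passes the check, while the misbracketed version yields 1.49 and fails it. Whether a slip pushes the result above the maximum or below the minimum depends on the actual expression.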

Maik
Comment 9 Maik Qualmann 2023-02-02 21:05:53 UTC
Git commit 8f9d2bbd83c3a739b77dd169e790069a1784d361 by Maik Qualmann.
Committed on 02/02/2023 at 21:04.
Pushed by mqualmann into branch 'master'.

fix similarity search results

M  +13   -4    core/libs/database/haar/haariface.cpp
M  +2    -1    core/libs/database/haar/haariface_p.cpp
M  +4    -0    core/libs/database/haar/haariface_p.h
M  +7    -0    core/libs/database/similaritydb/similaritydb.cpp
M  +6    -0    core/libs/database/similaritydb/similaritydb.h

https://invent.kde.org/graphics/digikam/commit/8f9d2bbd83c3a739b77dd169e790069a1784d361
Comment 10 Maik Qualmann 2023-02-02 21:06:27 UTC
Due to the error, the results were heavily distorted: depending on the "old" results in the database, images were displayed that did not match the set search range, or similar images were not displayed at all. I want to backport the fix to digiKam-7.10.0.
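
The expected behaviour, where only results inside the selected range are shown, can be pictured with the sketch below. It does not reproduce the commits in this report: the SimilarityRange type, the function names, and the assumption that the range is entered in percent while similarities are stored as fractions are all made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type: the range as entered in the search boxes, in percent.
struct SimilarityRange
{
    int minPercent; // e.g. 99
    int maxPercent; // e.g. 100
};

// True if a similarity value (fraction in [0.0, 1.0]) lies inside the range.
bool inRange(double similarity, const SimilarityRange& range)
{
    assert(range.minPercent <= range.maxPercent); // precondition on the range itself

    return (similarity >= range.minPercent / 100.0) &&
           (similarity <= range.maxPercent / 100.0);
}

// Keeps only the values that match the selected range, so that previously
// stored values outside the range are not displayed.
std::vector<double> filterByRange(const std::vector<double>& similarities,
                                  const SimilarityRange& range)
{
    std::vector<double> kept;

    for (const double s : similarities)
    {
        if (inRange(s, range))
        {
            kept.push_back(s);
        }
    }

    return kept;
}
```

With a range of 99-100%, stored values of 0.67 and 0.96 would be dropped and values of 0.995 and 1.00 would be kept, which matches the expected result in the description.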

Maik
Comment 11 Maik Qualmann 2023-02-02 21:12:27 UTC
Git commit 0be664f67f0269725189316482087a303d0957f6 by Maik Qualmann.
Committed on 02/02/2023 at 21:11.
Pushed by mqualmann into branch 'qt5-maintenance'.

backport fix to similarity search results
FIXED-IN: 7.10.0

M  +2    -1    NEWS
M  +13   -4    core/libs/database/haar/haariface.cpp
M  +2    -1    core/libs/database/haar/haariface_p.cpp
M  +4    -0    core/libs/database/haar/haariface_p.h
M  +7    -0    core/libs/database/similaritydb/similaritydb.cpp
M  +6    -0    core/libs/database/similaritydb/similaritydb.h

https://invent.kde.org/graphics/digikam/commit/0be664f67f0269725189316482087a303d0957f6
Comment 12 Maik Qualmann 2023-02-03 07:19:27 UTC
Git commit 5fa7b0e402745d48c5a694fe34f4f6f274d74b80 by Maik Qualmann.
Committed on 03/02/2023 at 07:18.
Pushed by mqualmann into branch 'master'.

Revert "fix similarity search results"

M  +4    -13   core/libs/database/haar/haariface.cpp
M  +1    -2    core/libs/database/haar/haariface_p.cpp
M  +0    -4    core/libs/database/haar/haariface_p.h
M  +0    -7    core/libs/database/similaritydb/similaritydb.cpp
M  +0    -6    core/libs/database/similaritydb/similaritydb.h

https://invent.kde.org/graphics/digikam/commit/5fa7b0e402745d48c5a694fe34f4f6f274d74b80
Comment 13 Maik Qualmann 2023-02-03 11:41:39 UTC
Git commit 1de09b8e35692495b54c8a1633bf06165800929c by Maik Qualmann.
Committed on 03/02/2023 at 11:41.
Pushed by mqualmann into branch 'master'.

fix similarity search results #2

M  +20   -6    core/libs/database/similaritydb/similaritydb.cpp

https://invent.kde.org/graphics/digikam/commit/1de09b8e35692495b54c8a1633bf06165800929c
Comment 14 Maik Qualmann 2023-02-03 20:02:49 UTC
Git commit 4e7d28e1098d1373b002e9167814056175d40f71 by Maik Qualmann.
Committed on 03/02/2023 at 20:02.
Pushed by mqualmann into branch 'master'.

fix similarity search results #3

M  +2    -2    core/libs/database/haar/haariface.cpp
M  +7    -2    core/libs/database/similaritydb/similaritydb.cpp
M  +6    -0    core/libs/database/similaritydb/similaritydb.h
M  +3    -3    core/utilities/fuzzysearch/findduplicatesview.cpp

https://invent.kde.org/graphics/digikam/commit/4e7d28e1098d1373b002e9167814056175d40f71
Comment 15 Maik Qualmann 2023-02-03 20:14:16 UTC
Git commit d366bbdc5f7f4bc0b05fb6a2f3948554a1798a43 by Maik Qualmann.
Committed on 03/02/2023 at 20:13.
Pushed by mqualmann into branch 'qt5-maintenance'.

backport fix to similarity search results #3
FIXED-IN: 7.10.0

M  +25   -6    core/libs/database/similaritydb/similaritydb.cpp
M  +6    -0    core/libs/database/similaritydb/similaritydb.h
M  +3    -3    core/utilities/fuzzysearch/findduplicatesview.cpp

https://invent.kde.org/graphics/digikam/commit/d366bbdc5f7f4bc0b05fb6a2f3948554a1798a43