Summary: | The results of the duplicates search are currently sorted by image id and can be sorted by name and by count of similar images. They should also be sortable by the similarity of the duplicates. [patch] | ||
---|---|---|---|
Product: | [Applications] digikam | Reporter: | Mario Frank <mario.frank> |
Component: | Searches-Similarity | Assignee: | Digikam Developers <digikam-bugs-null> |
Status: | RESOLVED FIXED | ||
Severity: | wishlist | CC: | caulier.gilles, mario.frank |
Priority: | NOR | ||
Version: | 5.3.0 | ||
Target Milestone: | --- | ||
Platform: | Compiled Sources | ||
OS: | Linux | ||
Latest Commit: | http://commits.kde.org/digikam/04c4024d4c7d3f03d91ed892286c2de78abeeb37 | Version Fixed In: | 5.4.0 |
Sentry Crash Report: | |||
Attachments: |
The first patch.
The second patch. |
Created attachment 102121 [details]
The second patch.
Mario,

After applying your patch from bug #369051, I cannot apply this patch file to the source code:

[gilles@localhost core]$ patch -p1 < DIGIKAM_DuplicatesSearch_ResultSet_ArithmeticOrder.patch
patching file utilities/fuzzysearch/findduplicatesalbumitem.cpp
Hunk #1 FAILED at 64.
Hunk #2 FAILED at 79.
Hunk #3 succeeded at 102 (offset -10 lines).
Hunk #4 succeeded at 120 (offset -10 lines).
2 out of 4 hunks FAILED -- saving rejects to file utilities/fuzzysearch/findduplicatesalbumitem.cpp.rej
patching file utilities/fuzzysearch/findduplicatesalbumitem.h

Gilles Caulier

OK, forget my previous comment. I forgot to apply patch 1 before patch 2...

[gilles@localhost core]$ git reset --hard
HEAD is now at a503172 update
[gilles@localhost core]$ patch -p1 < DIGIKAM_DuplicatesSearch_ResultSet_AverageSimilarity.patch
patching file libs/database/haar/haariface.cpp
patching file libs/database/haar/haariface.h
patching file libs/database/item/imagelister.cpp
patching file libs/database/item/imagequerybuilder.cpp
patching file utilities/fuzzysearch/findduplicatesalbum.cpp
patching file utilities/fuzzysearch/findduplicatesalbumitem.cpp
[gilles@localhost core]$ patch -p1 < DIGIKAM_DuplicatesSearch_ResultSet_ArithmeticOrder.patch
patching file utilities/fuzzysearch/findduplicatesalbumitem.cpp
patching file utilities/fuzzysearch/findduplicatesalbumitem.h
[gilles@localhost core]$

Git commit 04c4024d4c7d3f03d91ed892286c2de78abeeb37 by Gilles Caulier.
Committed on 10/11/2016 at 05:33.
Pushed by cgilles into branch 'master'.

Apply patches #102120 and #102121 from Mario Frank

102120: Extended the duplicates search list view. The average similarity of the found duplicates (excluding the original image) is now shown as a table column, so the result set can be sorted by average similarity.
To implement this feature, the haariface had to be modified. It now returns a map from average similarity to a map from image id to the set of similar images, instead of a map from image id to the set of similar images.
Communicating the average similarity to the search list view was not possible via signals and slots, as this would have led to sending a map of image ids to average similarities and then distributing the appropriate average similarity to the correct FindDuplicateAlbumItem. Instead, the average similarity is communicated via the SearchXml query as a field of the group. This way, the correct item gets the correct similarity automatically.
The evaluation of the new field by an SQL query is suppressed by the introduction of noEffect fields, which must have the prefix "noeffect_". This keeps the log free of unnecessary debug information.

102121: The items in the FindDuplicatesAlbum were sorted in lexicographic order, which does not make sense for the average similarity column (e.g. 100.00 is not sorted correctly). The less-than operator was therefore adapted so that arithmetic order is used for the average similarity column.
To make the code more robust against regressions caused by reordering the columns, an enum was introduced.

FIXED-IN: 5.4.0
CCMAIL: frank@uni-potsdam.de

M +60 -30 libs/database/haar/haariface.cpp
M +5 -5 libs/database/haar/haariface.h
M +1 -1 libs/database/item/imagelister.cpp
M +5 -1 libs/database/item/imagequerybuilder.cpp
M +3 -2 utilities/fuzzysearch/findduplicatesalbum.cpp
M +27 -4 utilities/fuzzysearch/findduplicatesalbumitem.cpp
M +9 -0 utilities/fuzzysearch/findduplicatesalbumitem.h
M +4 -4 utilities/fuzzysearch/findduplicatesview.cpp
M +9 -6 utilities/fuzzysearch/fuzzysearchview.cpp

http://commits.kde.org/digikam/04c4024d4c7d3f03d91ed892286c2de78abeeb37 |
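The ordering problem addressed by patch 102121 can be illustrated with a small standalone sketch. This is not the digiKam code: the `Row` type, the enum members and the function names are hypothetical, and plain standard C++ is used instead of the Qt item classes. It only shows why comparing the column text fails for values such as 100.00 and how a per-column numeric comparison avoids that.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical column indices. The commit mentions an enum for the columns
// but the report does not show its members.
enum Column { REFERENCE_IMAGE = 0, RESULT_COUNT = 1, AVG_SIMILARITY = 2 };

// Hypothetical stand-in for one row of the result list: one text per column.
struct Row {
    std::vector<std::string> text;
};

// Compares two rows on one column. The average similarity column is compared
// as a number; in this sketch every other column keeps the text comparison.
bool lessThan(const Row& a, const Row& b, Column column)
{
    const std::size_t index = static_cast<std::size_t>(column);
    assert(index < a.text.size() && index < b.text.size());

    if (column == AVG_SIMILARITY) {
        return std::stod(a.text[index]) < std::stod(b.text[index]);
    }
    return a.text[index] < b.text[index];
}

// Sorts the rows in ascending order on the given column.
void sortRows(std::vector<Row>& rows, Column column)
{
    std::sort(rows.begin(), rows.end(),
              [column](const Row& a, const Row& b) { return lessThan(a, b, column); });
}
```

With text comparison, "100.00" sorts before "95.50" because the character '1' is smaller than '9'. With the numeric branch, 95.50 correctly comes first in ascending order. Referring to the column through the enum rather than a literal index means the comparison keeps working if the columns are reordered.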
Created attachment 102120 [details]
The first patch.

When searching for duplicates, the result set is a table with the thumbnail and the count of similar pictures (including the original one). It is possible to sort the rows by either the reference picture (either name or id, I'm not sure here) or the count of entries in this virtual album. Sadly, it is not possible to sort the result by similarity.

This patch introduces this functionality. For each reference image, the average similarity (in percent) is calculated for the potential duplicates, excluding the reference image. This way, it is possible to sort the virtual albums by the average similarity of the duplicates in both ascending and descending order.

There is still one glitch in the sorting. Since the items are sorted in lexicographic order, the ordering was not correct when the average similarity strings differ in length. This problem was fixed with another patch that explicitly introduces arithmetic ordering for this column. The second patch will be submitted as a comment.

The complete commit messages:

"
[PATCH] Extended the duplicates search list view. The average similarity of the found duplicates (excluding the original image) is now shown as a table column, so the result set can be sorted by average similarity.
To implement this feature, the haariface had to be modified. It now returns a map from average similarity to a map from image id to the set of similar images, instead of a map from image id to the set of similar images.
Communicating the average similarity to the search list view was not possible via signals and slots, as this would have led to sending a map of image ids to average similarities and then distributing the appropriate average similarity to the correct FindDuplicateAlbumItem. Instead, the average similarity is communicated via the SearchXml query as a field of the group. This way, the correct item gets the correct similarity automatically.
The evaluation of the new field by an SQL query is suppressed by the introduction of noEffect fields, which must have the prefix "noeffect_". This keeps the log free of unnecessary debug information.
"

and

"
[PATCH] The items in the FindDuplicatesAlbum were sorted in lexicographic order, which does not make sense for the average similarity column (e.g. 100.00 is not sorted correctly). The less-than operator was therefore adapted so that arithmetic order is used for the average similarity column.
To make the code more robust against regressions caused by reordering the columns, an enum was introduced.
"
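The average similarity described in this report, computed over the potential duplicates while excluding the reference image, can be sketched as follows. This is an illustration only and not the haariface implementation: the `ImageId` alias, the function name and the input layout are hypothetical, and it assumes that each similarity is stored as a fraction between 0 and 1 that is converted to percent at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical id type for images.
using ImageId = long long;

// Returns the average similarity in percent of all images in `similarities`
// except the reference image itself. Assumes each value is a fraction in
// [0, 1] and that at least one image other than the reference is present.
double averageSimilarityPercent(ImageId reference,
                                const std::map<ImageId, double>& similarities)
{
    double sum = 0.0;
    std::size_t count = 0;

    for (const auto& entry : similarities) {
        if (entry.first == reference) {
            continue; // the reference image would always contribute 100 %
        }
        sum += entry.second;
        ++count;
    }

    assert(count > 0);
    return 100.0 * sum / static_cast<double>(count);
}
```

For a reference image with two duplicates at similarities 0.9 and 0.8, the result is about 85 percent. Including the reference image itself would pull every average towards 100 percent, which is presumably why the report excludes it.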