416630 – Use N nearest neighbor search

Status:	RESOLVED FIXED

Alias:	None

Product:	digikam
Classification:	Applications
Component:	Faces-Recognition
Version:	7.0.0
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Digikam Developers

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-01-23 02:51 UTC by Vitalii Tymchyshyn
Modified:	2020-09-03 21:01 UTC
CC List:	5 users

See Also:
Latest Commit:
Version Fixed In:	7.2.0
Sentry Crash Report:

Description Vitalii Tymchyshyn 2020-01-23 02:51:32 UTC

Currently, a recognition match is computed as the average over all matches to a given person. This approach does not work well for people with a large number of examples taken at different ages, hair colors, or angles. What makes things worse is that adding more examples usually degrades matching, since the distant examples always outnumber the near ones.

I tried a single nearest neighbor and it's pretty noisy. What works best for me is to take the average of the N nearest examples (I tried 5 and 10). It eliminates the noise while still finding great matches.
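For illustration, the "average of the N nearest examples" idea could be sketched as below. This is a minimal sketch in plain Python, not the digiKam code: the function names, the `(label, embedding)` data layout, the use of cosine similarity as the score, and the default threshold are all assumptions made for the example.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_people(query, examples, n_nearest=5):
    """Score each person by the mean similarity of their N nearest examples.

    `examples` is a list of (person_label, embedding) pairs (hypothetical layout).
    """
    sims = defaultdict(list)
    for label, embedding in examples:
        sims[label].append(cosine_similarity(query, embedding))

    scores = {}
    for label, values in sims.items():
        # Keep only the N most similar examples of this person, so distant
        # examples (other age, hair color, angle) cannot drag the score down.
        top = sorted(values, reverse=True)[:n_nearest]
        scores[label] = sum(top) / len(top)
    return scores


def best_match(query, examples, n_nearest=5, threshold=0.7):
    """Return the best-scoring person, or None if the score is below threshold."""
    scores = score_people(query, examples, n_nearest)
    if not scores:
        return None
    label = max(scores, key=scores.get)
    return label if scores[label] >= threshold else None
```

With a very large `n_nearest`, this degenerates to the current behavior (average over all examples). With `n_nearest=1`, it is the plain nearest neighbor search.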

Another change that gave me good results is using adjusted cosine distance instead of the regular one: each feature is normalized by subtracting its mean across the whole example database. E.g. if a feature's mean is high (say 0.6), it does not have much effect on the cosine, as almost all vectors point in the "positive" direction, while with the adjusted value (n - 0.6) the vectors point in different directions and provide meaningful input. It reduces overall similarity (I have to use a threshold of 0.7 instead of 0.8-0.9 to find examples), but general quality seems better.
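A minimal sketch of the adjusted cosine idea, assuming the adjustment is a per-feature mean subtraction computed over the whole example database. The function names and the tiny two-dimensional database are made up for the example.

```python
import math


def feature_means(database):
    """Mean of each feature across all vectors in the example database."""
    count = len(database)
    dims = len(database[0])
    return [sum(vec[i] for vec in database) / count for i in range(dims)]


def cosine(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjusted_cosine(a, b, means):
    """Cosine similarity after subtracting the database-wide feature means."""
    centered_a = [x - m for x, m in zip(a, means)]
    centered_b = [y - m for y, m in zip(b, means)]
    return cosine(centered_a, centered_b)


# Hypothetical database where the first feature has a high mean (0.6).
database = [[0.7, 0.1], [0.5, 0.9], [0.6, 0.5]]
means = feature_means(database)

plain = cosine(database[0], database[1])
adjusted = adjusted_cosine(database[0], database[1], means)
```

In this toy database the two vectors look fairly similar under the plain cosine because both are positive in every feature. After centering they point in opposite directions, so the adjusted value is much lower. This is consistent with needing a lower threshold once the adjustment is applied.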

Note that I have not done a formal accuracy check for either change yet. I tried to do a quick one, but the libraries report an "accuracy" that is low and rarely applicable, as it does not take the match threshold into account (drop everything with a match score below T).
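One possible way to make such a check threshold-aware is to report accuracy only over the matches that survive the threshold T, together with the fraction of queries that were kept. This is only a sketch of that idea. The `(true_label, predicted_label, score)` result layout and the function name are assumptions, not an existing library API.

```python
def thresholded_accuracy(results, threshold):
    """Accuracy over matches scoring >= threshold, plus the kept fraction.

    `results` is a list of (true_label, predicted_label, score) tuples.
    Returns (accuracy, coverage); accuracy is None when nothing is kept.
    """
    if not results:
        return None, 0.0

    # Drop everything with a match score below T before judging correctness.
    kept = [(true, pred) for true, pred, score in results if score >= threshold]
    coverage = len(kept) / len(results)
    if not kept:
        return None, coverage

    correct = sum(1 for true, pred in kept if true == pred)
    return correct / len(kept), coverage
```

Sweeping `threshold` over a range of values then shows the trade-off between how many faces get a suggestion and how often that suggestion is right.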

Comment 1 Maik Qualmann 2020-01-23 06:39:10 UTC

Can you make your changes available as a patch?

Maik

Comment 2 Vitalii Tymchyshyn 2020-01-23 17:04:06 UTC

It would take time for me to get approvals, but I'll try.

Comment 3 caulier.gilles 2020-08-01 07:59:46 UTC

Vitalii,

We use GitLab for the patch workflow now. Just fork the project, patch the code, and open a Pull Request (PR) against the main repository so the changes can be reviewed.

https://invent.kde.org/graphics/digikam

Thanks in advance for contributing.

Best Regards

Gilles Caulier

Comment 4 Minh Nghia Duong 2020-08-01 08:09:08 UTC

(In reply to caulier.gilles from comment #3)
Hi,

The K-Nearest search has been implemented and tested. It gives the best result so far.

Nghia

Comment 5 caulier.gilles 2020-08-01 09:16:03 UTC

Hi Nghia,

Of course, sorry for the noise. I read the comments on this entry too fast. So this file will be closed later, after GSoC 2020...

Gilles