Bug 435863 - digikam "looses" manually tagged faces
Summary: digikam "looses" manually tagged faces
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Faces-Workflow
Version: 7.2.0
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
Reported: 2021-04-18 04:57 UTC by hpagend
Modified: 2022-12-25 18:27 UTC

Version Fixed In: 8.0.0


Description hpagend 2021-04-18 04:57:07 UTC
SUMMARY
I own a large collection of images managed with digikam 6. It contains faces which I had tagged manually and assigned names to. After some small-scale tests I (unfortunately) ran digikam 7.2 face detection on it. Now some of the persons are no longer tagged in some of the images. If a person did not survive the conversion and is not present in other parts of the collection that have not yet been processed by face detection, this person's name is even missing from the list of person tags.

STEPS TO REPRODUCE
I could verify this using a pre-digikam7 copy of my collection (this is basically my backup computer, with the whole collection rsynced before I started to try out digikam 7). Where digikam6 on the backup still shows a face mark in the image, the face mark is gone after digikam7 ran its face detection. The phenomenon exists for a large part of the images.

I have not yet tried to set up a small-scale demonstrator for this. I am busy restoring the broken part of my collection.
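
A minimal sketch of how such a before/after comparison could be automated. It assumes the face tags of both copies have already been exported into plain mappings from image path to the set of tagged person names; how that export is done is not covered here, and the function names are hypothetical, not part of digikam.

```python
# Hypothetical input format: {image path: set of person names tagged as faces}.
# "before" comes from the untouched backup, "after" from the processed copy.

def find_lost_face_tags(before, after):
    """Return {image: names that were tagged before but are missing afterwards}."""
    lost = {}
    for image, names_before in before.items():
        missing = set(names_before) - set(after.get(image, ()))
        if missing:
            lost[image] = missing
    return lost


def find_lost_people(before, after):
    """Return names that no longer appear on any image at all."""
    people_before = set().union(*before.values())
    people_after = set().union(*after.values())
    return people_before - people_after
```

The first function lists per-image losses; the second reports persons who vanished from the whole collection, which corresponds to names missing from the person tag list.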

OBSERVED RESULT

Previously marked face tags disappear when switching from digikam6 (Ubuntu repository version) to the digikam 7.2 AppImage.

Also suspected: previously marked faces don't contribute to face matching. I conclude this simply from the extremely poor results of face matching where faces are not freshly entered (YOLO v3 method used).

Also observed: when displaying one person, the images are no longer separated by the directory in which they occur.


EXPECTED RESULT

Existing face marks and person tags should never disappear without user interaction.
Suggestion: when converting old person tags to new-style face tags, create a tag hierarchy person-in-image to conserve the information previously added to an image. Thus at least the name of the person and the fact that he/she is present in the image survive. Users will still be able to remove these tags from the keywords list later, once face detection has been completed satisfactorily.

Persons should be listed separated by their occurrence in directories, as they were in digikam 6. This makes it much easier to locate images "near" the images showing the known person.

I hope I got the wording understandable, since my digikam7 displays a funny mixture of German and English locale.

ADDITIONAL INFORMATION
I am running Ubuntu 20.04 (GNOME) with the digikam 7.2 AppImage.
Comment 1 Maik Qualmann 2021-04-18 05:40:53 UTC
I don't think digiKam is losing faces. There is no conversion of the face rectangles between older versions and the current version. The manually created faces are also not deleted during face detection. Such a problem would have been reported long ago.
If you can't find a person in the people view, look in the normal tags tree. Mark the missing person as a face tag with the context menu of the right mouse button. The marking as a face tag is required in current digiKam versions and was not carried out automatically in older versions.

In order to get better results with face recognition, the training database in the maintenance tool has to be rebuilt. A special album selection is not necessary, all available confirmed faces are used. It is known that face recognition works worse with manually created faces.

Sorting by face names instead of directories in People View was a desire of many users.

The problem with the German translation is known; it will only improve in the next versions. Adding missing context information for the translators has partly invalidated the existing translation.

Maik
Comment 2 hpagend 2021-04-18 07:50:33 UTC
Thank you, Maik, for this immediate response. It may indeed be that the people info survived in the "normal" tag tree, even though I seem to remember having looked there. However, my memory is vague on that point. Unfortunately, I cannot check it right now, as I have restored the previous state of database and images from backup. I will try to build a test case soon and come back to this in case I can reproduce the behaviour I described. For now, I assume you know better than I do.

I guess I should come up with a strategy to get the most out of the face recognition. It would be extremely valuable. I guess simply throwing it at 3 TB of images was naive.

As for the sorting issue: I probably did not explain it precisely enough. If a single person is selected in the people view, all images of that person used to be shown, but with separators like header bars containing the directory name (in my case the directory name is the date of the day the image was taken). This made it easy to open that day and browse the neighboring images. This header bar is gone in the new version. It is obviously a minor issue, and if users voted it away I can easily live with it. I simply found the old behaviour helpful.

The language mix isn't a problem for me. Good to know it has already been noticed.

Again, thanks a lot.

Schorsch
Comment 3 Maik Qualmann 2021-04-18 08:51:53 UTC
After calling up the People view, you can switch the item separation to album, then the faces are again grouped by album. However, this setting is not saved. I think we should implement a separate setting for the item separation between the Album / Tag / Date / Searches view and the People view.

Maik
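
To illustrate what separating the items of one person by album means, here is a minimal sketch. It assumes each image is identified by a relative path of the form album/file name; the function name is hypothetical and this is not digikam's implementation.

```python
import posixpath


def separate_by_album(image_paths):
    """Group image file names under their album (directory), albums in sorted order."""
    groups = {}
    for path in sorted(image_paths):
        album = posixpath.dirname(path)
        groups.setdefault(album, []).append(posixpath.basename(path))
    return groups
```

With albums named after the shooting date, each key of the result plays the role of the header bar, and the list below it holds that day's images of the selected person.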
Comment 4 hpagend 2021-04-18 09:59:45 UTC
Thanks again, Maik, for the reply. Even though I have been working with digikam for ages, my knowledge of it is somewhat limited to the few things I always use. From your answers I have already discovered two things that were new to me: how to activate the separators, and that there is a maintenance tool, which I obviously never needed to use. It seems digikam simply worked like a charm for me in the past. It will probably continue to do so in the future, if only the bug weren't sitting in front of the keyboard :-)
Comment 5 Maik Qualmann 2021-04-18 10:30:15 UTC
Git commit 7992f4a0e62470f020e1a71cc5046c74a5ff80f3 by Maik Qualmann.
Committed on 18/04/2021 at 10:28.
Pushed by mqualmann into branch 'master'.

update the current item separation mode in the menu action

M  +6    -2    core/app/items/views/digikamitemview.cpp
M  +1    -0    core/app/items/views/digikamitemview.h
M  +9    -0    core/app/main/digikamapp_setup.cpp
M  +5    -1    core/app/views/stack/itemiconview.cpp
M  +1    -0    core/app/views/stack/itemiconview.h

https://invent.kde.org/graphics/digikam/commit/7992f4a0e62470f020e1a71cc5046c74a5ff80f3
Comment 6 caulier.gilles 2022-01-09 15:24:38 UTC
Hi and happy new year,

Please give us fresh feedback using the current digiKam 7.5.0 pre-release bundle available here:

https://files.kde.org/digikam/

It will include the latest changes from Maik listed in this file.

Thanks in advance

Gilles Caulier
Comment 7 hpagend 2022-01-10 17:31:45 UTC
Dear Gilles, dear all,

Meanwhile I have successfully migrated my 6 TB image archive from digikam 6 to version 7. I am not on 7.5 yet; I am currently running the 7.3.0 AppImage under Ubuntu. It may still be useful to some if I provide some feedback here, especially as I hit a pothole on my first attempt, and I can also mention a few frustrating experiences from this second, successful try.
But first of all I want to make clear that I am very happy with what I have running now. Digikam is the workhorse I use all day. Once the transition of the vast number of faces to the new face recognition is done, everything is just fine. The transition is a one-time effort, and I probably did not have the optimal strategy for it. But I succeeded.
Now, where did I encounter the traps? To play it safe, I tested on a second computer rather than risk interrupting my daily work. The tests with a subset of the data worked nicely, but I learned that face detection needs a huge amount of computing power. As I was close to upgrading my system anyway, I decided to buy a new computer first. What I finally use is an 8-core i7 with 32 GB RAM, a 500 GB SSD and a 12 TB hard disk. Images reside on the hard disk, the digikam database on the SSD. It took me more than a week to re-scan all the faces. As many of my images contained pre-tagged faces originating from Picasa or manually tagged within digikam 6, I wanted to start with a clean database and expected digikam 7 to learn from the tags within the image files.
I would load chunks of images to keep the computer busy detecting faces overnight, and the next morning I would assign names where needed, and confirm or correct digikam's suggestions so that the AI could learn the faces.
For the first two or three days that worked fine. However, digikam is simply "too good" at finding face areas. This leads to an enormous number of blurry faces in the "unknown" folder. These need to be removed by marking them "ignore", because with a large amount of data the user has no chance of getting an overview. It would be very useful to have a parameter that sets the sensitivity in terms of the minimum number of pixels a face should have and the acceptable blurriness, simply to reduce the number of people found somewhere in the background.
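
A minimal sketch of what such a filter could look like. This is not an existing digikam option; the parameter names and default thresholds are assumptions chosen for illustration, and the sharpness measure used here is the variance of the Laplacian, a common blur heuristic.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over a 2D list of grey values.

    Low values suggest a blurry patch. Border pixels are skipped.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def keep_face(gray, min_side=40, min_sharpness=100.0):
    """Accept a detected face patch only if it is large and sharp enough.

    min_side and min_sharpness are hypothetical tuning parameters.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    if min(height, width) < min_side:
        return False
    return laplacian_variance(gray) >= min_sharpness
```

Small background faces fail the size check, and out-of-focus ones fail the sharpness check, so neither would end up in the "unknown" folder.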
The second trap is somehow related to this. After a while, the face recognition started to produce more and more bogus results, and it took me a long time to come up with an idea of why this happens. I found a valuable hint somewhere else in the forum. According to that post, digikam only uses a limited number of samples of a person's face to compare with an unknown face. In my case, with many pre-tagged faces, this leads to digikam unlearning faces. I would like to illustrate this with an example: every month I shoot a jazz band performing. The person in the foreground was usually pre-tagged. So when the image is scanned, the dominant high-resolution foreground faces are immediately added to the database. However, many images will have the drummer in the background, somewhat small and out of focus. Digikam will surely find all these faces and present them for confirmation. For the photographer it is easy to recognize the drummer, so you click "confirm" or assign his name if he shows up under a wrong name. However, this means that you load all the blurry instances of the given person, and after you add enough of them, digikam will no longer use the sharp ones for identification. Boom! It was very frustrating until I changed my strategy to simply ignore all blurry images, even when I could easily say who it is. I hope this explanation is somewhat correct and helps others avoid repeating my mistake.
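
The effect described above can be illustrated with a small sketch. How digikam actually selects its samples is not confirmed in this thread; the "newest samples" policy below is purely an assumption that reproduces the described behaviour, and the "sharpest samples" policy is a hypothetical alternative.

```python
from collections import namedtuple

# Hypothetical record: one confirmed face of a person with a sharpness score.
FaceSample = namedtuple("FaceSample", ["image", "sharpness"])


def newest_samples(samples, limit):
    """Assumed policy: keep only the most recently confirmed samples."""
    return samples[max(0, len(samples) - limit):]


def sharpest_samples(samples, limit):
    """Alternative policy: keep the samples with the highest sharpness."""
    return sorted(samples, key=lambda s: s.sharpness, reverse=True)[:limit]
```

If three sharp foreground faces are confirmed first and five blurry background faces afterwards, a limit of three leaves only blurry samples under the first policy, whereas the second policy keeps the sharp ones.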
What I am missing is a mechanism that would sort or group unknown faces by similarity. I believe that was something Picasa did well. With such a grouping mechanism it would be possible to keep a large number of faces in the unknown folder rather than having to mark them "not-a-face". This way the software would tell you that you came across the same face year after year but never bothered to put a name to the person. A nice incentive to do a little research to find out who this person is. As I shoot a lot of open-air events, this turned out to be very valuable in the past.
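
A minimal sketch of such a grouping, assuming each unknown face is already represented by a numeric feature vector (an embedding). The greedy strategy, the cosine similarity measure and the threshold value are illustrative assumptions, not how digikam or Picasa work internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_by_similarity(faces, threshold=0.9):
    """Greedily group (face_id, embedding) pairs.

    Each face joins the first group whose first member is similar enough;
    otherwise it starts a new group.
    """
    groups = []  # list of (representative embedding, [face ids])
    for face_id, embedding in faces:
        for representative, members in groups:
            if cosine_similarity(representative, embedding) >= threshold:
                members.append(face_id)
                break
        else:
            groups.append((embedding, [face_id]))
    return [members for _, members in groups]
```

Large groups in the result would point to a person who shows up again and again without ever having been named.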
I understand that the use case I illustrated here is not what digikam was designed for. And I repeat: once you have worked your way through this initial transition, everything works just fine, since you will typically add new faces with your new images.
I would like to apologize if I missed features that are actually already there but that I was too ignorant to find. I am just a naive user who tries to get things done without too much knowledge about the software. As I said before: in my case, the bug probably sits on the chair in front of the keyboard.