Bug 416988 - Objects / Forms / Monuments / Context detection and recognition using Deep Learning
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Tags-AutoAssignement
Version: 7.0.0
Platform: unspecified All
Importance: NOR wishlist
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-01-31 15:44 UTC by Daniel
Modified: 2023-12-01 04:28 UTC

See Also:
Latest Commit:
Version Fixed In: 8.3.0
Sentry Crash Report:


Description Daniel 2020-01-31 15:44:07 UTC
SUMMARY
Deep learning has become quite good at classifying images and detecting objects in them. Computers have also become more powerful in recent years, which makes such applications more practical.

Just as an example: http://imageai.org/#features

This could enhance Digital Asset Management and would make it easier to find photos in very large collections.

I suppose such a system should be implemented as a plugin.
Comment 1 caulier.gilles 2020-01-31 15:57:07 UTC
Hi,

Seriously, you didn't find the "People" tab in the right sidebar?

See also this blog post for the 7.0.0-beta1 release:

https://www.digikam.org/news/2019-12-22-7.0.0-beta1_release_announcement/

In other words, it's already implemented and ready to test:

https://files.kde.org/digikam/

Best

Gilles Caulier
Comment 2 caulier.gilles 2020-01-31 15:59:56 UTC
OK, the title of this report is not explicit enough... It's about detection and recognition of objects and forms, not only faces...

Sorry for the noise

Gilles Caulier
Comment 3 Daniel 2020-01-31 16:04:37 UTC
(In reply to caulier.gilles from comment #1)
> Hi,
> 
> Seriously, you didn't find the "People" tab in the right sidebar?
> 
> See also this blog post for the 7.0.0-beta1 release:
> 
> https://www.digikam.org/news/2019-12-22-7.0.0-beta1_release_announcement/
> 
> In other words, it's already implemented and ready to test :
> 
> https://files.kde.org/digikam/
> 
> Best
> 
> Gilles Caulier

Hey Gilles,

Yes, I already know that (I reported bug #415782), but this is not what I meant here. What I meant is the following:

Assigning tags, descriptions or metadata based on what is recognized in the image (a car, a tree, a table, a kite, whatever), without having to train it on your own images.
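
To make the request concrete, here is a minimal sketch of the last step: turning the output of a pre-trained detector into tags. The `Detection` type, the `tags_from_detections` helper and the 0.5 threshold are hypothetical names and values chosen for illustration only. They are not part of digiKam or of any specific model.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Detection:
    """One object reported by a (hypothetical) pre-trained detector."""
    label: str         # class name from the model's fixed label set, e.g. "car"
    confidence: float  # score in [0, 1]


def tags_from_detections(detections: Iterable[Detection],
                         min_confidence: float = 0.5) -> List[str]:
    """Return unique labels above the threshold, most confident first."""
    best: Dict[str, float] = {}
    for det in detections:
        if det.confidence >= min_confidence:
            # Keep only the highest score seen for each label.
            best[det.label] = max(det.confidence, best.get(det.label, 0.0))
    # Sort by descending confidence, then alphabetically for stable output.
    return sorted(best, key=lambda label: (-best[label], label))
```

The threshold trades tag coverage against false positives, so in practice it would probably need to be user-configurable.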

-- Daniel
Comment 4 caulier.gilles 2020-05-26 14:24:58 UTC
Hi all,

I found this project, which has already been ported as a Darktable plugin:

https://github.com/scheckmedia/photils-dt

The photils tool analyzes your image locally with a neural network and generates a data vector that is sent to a remote web service. Your image itself is not sent over the web.

The web service returns a list of tags as strings, which can be used to populate the database...

I can create a photils plugin version for digiKam as a third-party tool. I'm in contact with the Darktable plugin author.
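
The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the photils implementation: `embed_image` is a toy stand-in for the local neural network, the JSON field names `feature` and `tags` are invented for the example, and `fake_service` replaces the real remote service so the sketch runs offline.

```python
import base64
import json
import struct
from typing import Callable, List, Sequence


def embed_image(pixels: Sequence[int], dims: int = 8) -> List[float]:
    """Toy stand-in for the local network: fold pixels into a unit vector."""
    vec = [0.0] * dims
    for i, p in enumerate(pixels):
        vec[i % dims] += p / 255.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]


def build_payload(vector: Sequence[float]) -> str:
    """Serialize only the feature vector (float32, base64); no pixels."""
    raw = struct.pack(f"<{len(vector)}f", *vector)
    return json.dumps({"feature": base64.b64encode(raw).decode("ascii")})


def decode_payload(payload: str) -> List[float]:
    """Inverse of build_payload, as a service would apply it."""
    raw = base64.b64decode(json.loads(payload)["feature"])
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def suggest_tags(pixels: Sequence[int],
                 send: Callable[[str], str]) -> List[str]:
    """Embed locally, send the vector via `send`, return the tag strings."""
    response = send(build_payload(embed_image(pixels)))
    return list(json.loads(response)["tags"])


def fake_service(payload: str) -> str:
    """Offline stand-in for the remote service; returns fixed tags."""
    decode_payload(payload)  # a real service would look up tags here
    return json.dumps({"tags": ["tree", "sky"]})
```

Passing the transport in as the `send` callable keeps the network part replaceable, which also makes it easy to see that only the vector, and never the image, leaves the machine.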

Any comments are welcome.

Gilles Caulier
Comment 5 caulier.gilles 2020-05-26 15:01:34 UTC
Nghia,

Just look at my previous comment #4...

Gilles
Comment 6 Minh Nghia Duong 2020-05-26 15:13:57 UTC
(In reply to caulier.gilles from comment #5)
> Nghia,
> 
> Just look at my previous comment #4...
> 
> Gilles

Hello Gilles,

It's a very interesting feature. What is the name of the model used in the data vector generator?

Context extraction from photos might be feasible, but first we need either to scale the YOLO detection implementation to speed up processing, or to switch to another version of SSD-MobileNet, because the current version of SSD-MobileNet used in digiKam is only for face detection and doesn't work very well.

Furthermore, context extraction might involve a recurrent neural network. If such a pre-trained model is available for download, we can do it.

Nghia
Comment 7 caulier.gilles 2020-05-26 15:19:43 UTC
Nghia,

I don't know yet which model is used. You can ask the developer directly on LinkedIn:

https://www.linkedin.com/in/tobiasscheck/

I'm in contact with him this way...

Gilles
Comment 8 Maik Qualmann 2020-08-30 19:41:30 UTC
*** Bug 426003 has been marked as a duplicate of this bug. ***
Comment 9 caulier.gilles 2023-02-23 22:13:28 UTC
Here is another project, written in Python, that scans a collection for object detection and tags the images in the database accordingly:

https://github.com/oliveox/digikam-object-detection-plugin/tree/master/src

It's based on the YOLOv2 model.
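
As a rough illustration of the "tag the images in the database" step, here is a minimal sketch using an in-memory SQLite database. The three-table schema and the helper names are assumptions made for this example. They mirror neither the real digiKam schema nor the linked project's code.

```python
import sqlite3
from typing import Iterable, List

# Hypothetical, simplified schema: images, tags, and a link table.
SCHEMA = """
CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE image_tags (
    image_id INTEGER NOT NULL REFERENCES images(id),
    tag_id INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (image_id, tag_id)
);
"""


def open_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with the demo schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def assign_tags(conn: sqlite3.Connection, path: str,
                labels: Iterable[str]) -> None:
    """Attach detected labels to an image; safe to call repeatedly."""
    with conn:  # one transaction per image
        conn.execute("INSERT OR IGNORE INTO images (path) VALUES (?)", (path,))
        image_id = conn.execute(
            "SELECT id FROM images WHERE path = ?", (path,)).fetchone()[0]
        for label in labels:
            conn.execute(
                "INSERT OR IGNORE INTO tags (name) VALUES (?)", (label,))
            tag_id = conn.execute(
                "SELECT id FROM tags WHERE name = ?", (label,)).fetchone()[0]
            conn.execute(
                "INSERT OR IGNORE INTO image_tags (image_id, tag_id) "
                "VALUES (?, ?)", (image_id, tag_id))


def tags_for(conn: sqlite3.Connection, path: str) -> List[str]:
    """Return the tag names assigned to one image, sorted by name."""
    rows = conn.execute(
        "SELECT t.name FROM tags t "
        "JOIN image_tags it ON it.tag_id = t.id "
        "JOIN images i ON i.id = it.image_id "
        "WHERE i.path = ? ORDER BY t.name", (path,))
    return [row[0] for row in rows]
```

Using `INSERT OR IGNORE` makes a re-scan of the collection idempotent, so running the detector twice does not create duplicate tags or links.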

Gilles Caulier
Comment 10 caulier.gilles 2023-10-26 05:34:17 UTC
See the progress of the student project on AI-based auto-tags (mostly completed):

https://community.kde.org/GSoc/2023/StatusReports/QuocHungTran#

Gilles Caulier
Comment 11 caulier.gilles 2023-12-01 04:28:50 UTC
Hi,

With the next digiKam 8.3.0 release, the auto-tags assignment feature has been
implemented without using a cloud service. The processing is done in the
core application with dedicated neural network models stored on the computer.

For more details about the auto-tags assignment feature, see the student's work
report:

https://community.kde.org/GSoc/2023/StatusReports/QuocHungTran#Add_Automatic_Tags_Assignment_Tools_and_Improve_Face_Recognition_Engine_for_digiKam

Best regards
Gilles Caulier