416988 – Objects / Forms / Monuments / Context detection and recognition using Deep Learning

Status:	RESOLVED FIXED

Alias:	None

Product:	digikam
Classification:	Applications
Component:	Tags-AutoAssignement (other bugs)
Version First Reported In:	7.0.0
Platform:	unspecified All

Importance:	NOR wishlist
Target Milestone:	---
Assignee:	Digikam Developers

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-01-31 15:44 UTC by Daniel
Modified:	2023-12-01 04:28 UTC
CC List:	7 users

See Also:
Latest Commit:
Version Fixed/Implemented In:	8.3.0
Sentry Crash Report:

Description Daniel 2020-01-31 15:44:07 UTC

SUMMARY
Deep learning has become quite good at classifying images and detecting objects in them. Computers have also become more powerful in recent years, which makes such applications more practical.

Just as an example: http://imageai.org/#features

This could enhance Digital Asset Management and would make it easier to find photos in very large collections.

I suppose such a system should be implemented as a plugin.

Comment 1 caulier.gilles 2020-01-31 15:57:07 UTC

Hi,

Seriously, you didn't find the "People" tab in the right sidebar?

See also this blog post for release 7.0.0-beta1:

https://www.digikam.org/news/2019-12-22-7.0.0-beta1_release_announcement/

In other words, it's already implemented and ready to test:

https://files.kde.org/digikam/

Best

Gilles Caulier

Comment 2 caulier.gilles 2020-01-31 15:59:56 UTC

OK, the report title is not explicit enough. It's about detecting and recognizing objects and forms, not only faces.

Sorry for the noise

Gilles Caulier

Comment 3 Daniel 2020-01-31 16:04:37 UTC

(In reply to caulier.gilles from comment #1)

Hey Gilles,

Yes, yes, yes, I already know that (I reported bug #415782), but that is not what I meant here. What I meant is the following:

Assigning tags, descriptions, or metadata based on what is recognized in the image (a car, a tree, a table, a kite, whatever), without having to train it on your own images.

-- Daniel
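
The request in comment 3 boils down to turning detector output into tags. A minimal sketch in Python, assuming a generic pre-trained detector that reports a label and a confidence score per detected object. The `Detection` type, the `detections_to_tags` helper, and the 0.5 threshold are hypothetical names and values chosen for illustration; they are not part of digiKam.

```python
from typing import Dict, Iterable, List, NamedTuple


class Detection(NamedTuple):
    """One object reported by a (hypothetical) pre-trained detector."""
    label: str         # class name, e.g. "car"
    confidence: float  # detector score in the range [0, 1]


def detections_to_tags(detections: Iterable[Detection],
                       min_confidence: float = 0.5) -> List[str]:
    """Return unique labels whose best score reaches min_confidence.

    Labels are ordered by descending best score, then alphabetically.
    """
    best: Dict[str, float] = {}
    for det in detections:
        if det.confidence >= min_confidence:
            # Keep only the highest score seen for each label.
            best[det.label] = max(best.get(det.label, 0.0), det.confidence)
    return sorted(best, key=lambda label: (-best[label], label))
```

The threshold trades recall for precision: a lower value yields more tags per image but also more wrong ones.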

Comment 4 caulier.gilles 2020-05-26 14:24:58 UTC

Hi all,

I found this project, which has already been ported as a Darktable plugin:

https://github.com/scheckmedia/photils-dt

The Photils tool analyzes your image locally with a neural network and generates a data vector that is sent to a remote web service. Your image itself is not sent over the web.

The web service returns a list of tags as strings, which can be used to populate the database.

I can create a Photils plugin version for digiKam as a third-party tool. I'm in contact with the Darktable plugin author.

Any comments are welcome.

Gilles Caulier
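
The flow described in comment 4 (local feature extraction, only the vector sent to a web service, tags returned as strings) can be sketched as follows. This is an illustration under stated assumptions, not the Photils API: the JSON field names `features` and `tags`, the `Transport` callable, and all function names are hypothetical. The real network call is deliberately left to the caller.

```python
import json
from typing import Callable, List, Sequence

# Hypothetical transport: takes a JSON request body, returns a JSON response
# body. A real client would wrap an HTTP POST here.
Transport = Callable[[str], str]


def build_request(feature_vector: Sequence[float]) -> str:
    """Serialize only the feature vector; image pixels are never included."""
    return json.dumps({"features": [float(x) for x in feature_vector]})


def parse_tags(response_body: str) -> List[str]:
    """Extract the list of tag strings from the (assumed) JSON response."""
    data = json.loads(response_body)
    return [str(tag) for tag in data.get("tags", [])]


def suggest_tags(feature_vector: Sequence[float],
                 transport: Transport) -> List[str]:
    """Send the locally computed vector and return the suggested tags."""
    return parse_tags(transport(build_request(feature_vector)))
```

Passing the transport as a parameter keeps the privacy property easy to check: whatever reaches the network is exactly the string produced by `build_request`.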

Comment 5 caulier.gilles 2020-05-26 15:01:34 UTC

Nghia,

Just look at my previous comment #4.

Gilles

Comment 6 Minh Nghia Duong 2020-05-26 15:13:57 UTC

(In reply to caulier.gilles from comment #5)

Hello Gilles,

It's a very interesting feature. What is the name of the model used in the data vector generator?

Context extraction from photos might be feasible, but first we need either to scale the YOLO detection implementation to speed up processing, or to switch to another version of SSD-MobileNet, because the current version of SSD-MobileNet used in digiKam is only for face detection and doesn't work really well.

Furthermore, context extraction might call for a recurrent neural network. If such a pre-trained model is available for download, we can do it.

Nghia

Comment 7 caulier.gilles 2020-05-26 15:19:43 UTC

Nghia,

I don't know yet which model is used. You can ask the developer directly on LinkedIn:

https://www.linkedin.com/in/tobiasscheck/

That is how I'm in touch with him.

Gilles

Comment 8 Maik Qualmann 2020-08-30 19:41:30 UTC

*** Bug 426003 has been marked as a duplicate of this bug. ***

Comment 9 caulier.gilles 2023-02-23 22:13:28 UTC

Here is another project, written in Python, that scans a collection for objects and tags the images in the database accordingly.

https://github.com/oliveox/digikam-object-detection-plugin/tree/master/src

It's based on the Yolo2 model.

Gilles Caulier
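
The approach mentioned in comment 9 (scan a collection, detect objects, write tags to a database) can be sketched as below. This is not the linked plugin's code and does not use the digiKam database schema: the `image_tags` table, the extension list, and the `detect` callable (any function mapping an image path to a list of tag strings) are assumptions made for illustration.

```python
import os
import sqlite3
from typing import Callable, Iterable, List

# Assumed set of image file extensions to consider.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def find_images(root: str) -> List[str]:
    """Recursively collect image files below root, sorted by path."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def tag_collection(conn: sqlite3.Connection,
                   image_paths: Iterable[str],
                   detect: Callable[[str], List[str]]) -> int:
    """Store detected tags per image; return the number of new rows."""
    # Hypothetical schema: one row per (image path, tag) pair.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_tags ("
        "path TEXT NOT NULL, tag TEXT NOT NULL, UNIQUE(path, tag))"
    )
    before = conn.total_changes
    for path in image_paths:
        for tag in detect(path):
            # OR IGNORE makes re-running the scan idempotent.
            conn.execute(
                "INSERT OR IGNORE INTO image_tags (path, tag) VALUES (?, ?)",
                (path, tag),
            )
    conn.commit()
    return conn.total_changes - before
```

Because of the `UNIQUE(path, tag)` constraint, running the scan again over the same collection adds no duplicate rows.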

Comment 10 caulier.gilles 2023-10-26 05:34:17 UTC

See the progress of the student project on AI-based auto-tags (mostly completed):

https://community.kde.org/GSoc/2023/StatusReports/QuocHungTran#

Gilles Caulier

Comment 11 caulier.gilles 2023-12-01 04:28:50 UTC

Hi,

With the next digiKam 8.3.0 release, the auto-tags assignment feature has been
implemented without using a cloud service. The processing is done in the
core application with dedicated neural network models stored on the computer.

For more details about the auto-tags assignment feature, see the student's work
report:

https://community.kde.org/GSoc/2023/StatusReports/QuocHungTran#Add_Automatic_Tags_Assignment_Tools_and_Improve_Face_Recognition_Engine_for_digiKam

Best regards
Gilles Caulier