Hi, As of August 2020, face detection has improved a lot. The initiative KDE took is really great! I would like to know if there is a plan to also add detection of common objects (not faces), for instance water, tree, bike, car, mug, plate, food, etc. Here is an example from OpenCV showing YOLO object detection: https://docs.opencv.org/master/da/d9d/tutorial_dnn_yolo.html It is available in C++. The idea would be that a trained model scans each photo in the library and then proposes one or more objects for it.
*** This bug has been marked as a duplicate of bug 416988 ***
Nghia, Out of curiosity, have you already taken a look at the OpenCV link given in the description of this report? Gilles
(In reply to caulier.gilles from comment #2) Yes, I did. Actually, I tried it and it works wonderfully with the existing SSD and YOLO face detection in the faces engine. All we need to do is download the corresponding model files and add a little code to differentiate the pre-defined classes of the model. If you want, I can implement it after the GSoC merge.
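To illustrate what "a little code to differentiate the pre-defined classes" could look like, here is a minimal sketch in Python (not digiKam's C++ code). It assumes the model ships a class list with one label per line and that each detection carries a class index and a confidence score. The names `Detection`, `parse_class_names` and `label_detections`, and the 0.5 threshold, are hypothetical.

```python
from __future__ import annotations

from typing import NamedTuple


class Detection(NamedTuple):
    class_id: int      # index into the model's class list
    confidence: float  # score in [0, 1]


def parse_class_names(names_text: str) -> list[str]:
    """Parse a class list with one label per line (blank lines ignored)."""
    return [line.strip() for line in names_text.splitlines() if line.strip()]


def label_detections(detections, class_names, min_confidence=0.5):
    """Return the sorted, de-duplicated labels found above the threshold."""
    labels = set()
    for det in detections:
        if det.confidence < min_confidence:
            continue  # too uncertain to propose
        if 0 <= det.class_id < len(class_names):
            labels.add(class_names[det.class_id])
    return sorted(labels)


# Hypothetical excerpt of a class list; a real model ships its own file.
CLASS_NAMES = parse_class_names("person\nbicycle\ncar\n")
DETECTIONS = [Detection(2, 0.91), Detection(1, 0.30), Detection(0, 0.77)]
LABELS = label_detections(DETECTIONS, CLASS_NAMES)
```

With this sample input, the low-confidence "bicycle" detection is dropped and the remaining class indices are resolved to their names.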
Maik, Thanh, what is your viewpoint on Nghia's proposal from comment #3? Best Gilles
Implementing DNN detection is simple, but we also need to define the use cases and workflow for object detection. We could just take an image and return it with bounding boxes and object names, like the example in the link above, but that is not really practical for digiKam, is it?
Hi Nghia, Very happy to hear that you are working on this topic. It sounds great, thanks for your hard work. As a digiKam user and an iPhone user, I would like to have the following use case. 1. Each picture is run through the YOLO model and gets assigned zero, one or several objects. 2. If an object is wrong, the user can delete or update it (by update I mean choosing one of the many existing objects of the YOLO model). But it would be very annoying to have to verify whether each predicted object is correct or not! On my iPhone the use case is the following. I take a picture of something, let's say sushi. Several days later I want to see all the pictures of sushi I took, so I go to the search bar and type 'sushi', and I see all the pictures of sushi. It would be great to have this feature. In addition, it would be great to have tags for each assigned object (like for people and manual tags), so there would be a category 'objects' with subcategories 'tree', 'sushi', etc. Then I could simply click on 'sushi' to see all the pictures of sushi.
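The workflow described in this comment (assign labels per picture, correct wrong ones, then search or click a tag) can be sketched as a small in-memory index. This is only an illustration in Python, not digiKam's tag database. The parent tag name `Objects`, the class `TagIndex` and the file names are hypothetical.

```python
from collections import defaultdict

OBJECTS_ROOT = "Objects"  # hypothetical parent tag for detected objects


def tag_path(label):
    """Build a hierarchical tag path such as 'Objects/sushi'."""
    return f"{OBJECTS_ROOT}/{label.strip().lower()}"


class TagIndex:
    """Maps tag paths to the set of images carrying that tag."""

    def __init__(self):
        self._images_by_tag = defaultdict(set)

    def assign(self, image, labels):
        """Attach zero, one or several detected labels to an image."""
        for label in labels:
            self._images_by_tag[tag_path(label)].add(image)

    def remove(self, image, label):
        """Let the user drop a wrong label from an image."""
        self._images_by_tag[tag_path(label)].discard(image)

    def search(self, query):
        """Return all images tagged with the queried object, sorted."""
        return sorted(self._images_by_tag.get(tag_path(query), set()))


index = TagIndex()
index.assign("IMG_001.jpg", ["sushi", "plate"])
index.assign("IMG_002.jpg", ["tree"])
index.assign("IMG_003.jpg", ["Sushi"])
sushi_before = index.search("sushi")
index.remove("IMG_003.jpg", "sushi")  # user corrects a wrong detection
sushi_after = index.search("sushi")
```

Normalizing labels to lower case in `tag_path` is a design choice made here so that typing 'Sushi' or 'sushi' in the search bar finds the same pictures.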
Created attachment 131311 attachment-25818-0.html Hi, As @markd said, this may be useful for users who want to search for images relating to 'sushi' or other specific objects, but in my opinion the scope of this project needs to be reviewed carefully. Since YOLO is designed for object detection in general, there will be plenty of results for trivial objects such as table, spoon, banana, etc. Moreover, I've seen many cases where objects detected by YOLO are in the corners or not clearly visible, so an image may be tagged with sushi even though the sushi is far away in the view. Also, for specific objects (like sushi, plants, monuments, etc.) I suppose we need a YOLO version trained on specific datasets for those objects (or users may train the network themselves). Therefore, we really need to define clearly the objects that we aim to include in digiKam for object detection. The project is really interesting, but I would propose to create a poll of digiKam users to get an idea of which objects we want to support. Otherwise, a more extensible way, though one requiring some work from users, is to design code templates for object detection (extending from facesengine). Then users only need to train and provide the weights for the network to run the detection on their own. Best, Trung
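One possible way to reduce the noise from trivial, tiny or corner detections mentioned above is to keep only detections that are confident and occupy a reasonable part of the frame. The following Python sketch shows the idea. The `Box` type, the function `is_prominent` and all thresholds are assumptions for illustration, not values tuned on real data.

```python
from typing import NamedTuple


class Box(NamedTuple):
    label: str
    confidence: float  # score in [0, 1]
    x: float  # left edge, as a fraction of image width
    y: float  # top edge, as a fraction of image height
    w: float  # width, as a fraction of image width
    h: float  # height, as a fraction of image height


def is_prominent(box, min_confidence=0.6, min_area=0.05, margin=0.02):
    """Keep confident detections that cover enough of the image.

    Boxes that touch the frame border are kept only when they are large,
    since a small box at the edge is likely a cut-off background object.
    """
    area = box.w * box.h
    if box.confidence < min_confidence or area < min_area:
        return False
    touches_border = (
        box.x < margin
        or box.y < margin
        or box.x + box.w > 1.0 - margin
        or box.y + box.h > 1.0 - margin
    )
    return not (touches_border and area < 4 * min_area)


candidates = [
    Box("sushi", 0.90, 0.30, 0.30, 0.40, 0.40),   # centered and large: kept
    Box("spoon", 0.90, 0.00, 0.00, 0.10, 0.10),   # tiny corner object: dropped
    Box("banana", 0.40, 0.30, 0.30, 0.40, 0.40),  # low confidence: dropped
    Box("table", 0.90, 0.00, 0.20, 1.00, 0.60),   # spans the frame: kept
]
kept = [box.label for box in candidates if is_prominent(box)]
```

Such a filter does not replace a model trained on the objects of interest; it only limits how many marginal tags would be proposed.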
Hi Markd, As Trung said, for now both the pre-trained YOLO and SSD models can recognize basic objects. To apply recognition to a specific set of objects, users might need to find a suitable pre-trained model or train a model that works for them. However, we can prepare a module for general object recognition that is compatible with both YOLO and SSD. Users can then install their own pre-trained model, with some basic configuration adapted to their usage. Nghia
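A general module compatible with both model families mainly has to convert each model's raw output into one common form. The Python sketch below shows this under stated assumptions: a YOLO row is taken to be `[cx, cy, w, h, objectness, score_0, score_1, ...]` and an SSD row `[image_id, class_id, confidence, x1, y1, x2, y2]`, all in normalized coordinates. Real models and exports may differ, for example in whether class scores already include objectness. `ModelConfig`, `PARSERS` and `detect_labels` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """User-supplied settings for an installed model (hypothetical)."""
    kind: str                    # "yolo" or "ssd"
    class_names: tuple           # labels, indexed by class id
    min_confidence: float = 0.5  # detections below this are ignored


def from_yolo_row(row):
    """Convert an assumed YOLO row to (class_id, confidence, (x, y, w, h))."""
    cx, cy, w, h = row[:4]
    scores = row[5:]  # row[4] is objectness, unused in this sketch
    class_id = max(range(len(scores)), key=scores.__getitem__)
    return class_id, scores[class_id], (cx - w / 2, cy - h / 2, w, h)


def from_ssd_row(row):
    """Convert an assumed SSD row to (class_id, confidence, (x, y, w, h))."""
    _, class_id, confidence, x1, y1, x2, y2 = row
    return int(class_id), confidence, (x1, y1, x2 - x1, y2 - y1)


PARSERS = {"yolo": from_yolo_row, "ssd": from_ssd_row}


def detect_labels(config, rows):
    """Return (label, confidence, box) for rows passing the threshold."""
    parse = PARSERS[config.kind]
    found = []
    for row in rows:
        class_id, confidence, box = parse(row)
        if confidence < config.min_confidence:
            continue
        if 0 <= class_id < len(config.class_names):
            found.append((config.class_names[class_id], confidence, box))
    return found


names = ("person", "car")
yolo_hits = detect_labels(
    ModelConfig("yolo", names),
    [[0.5, 0.5, 0.2, 0.4, 0.9, 0.1, 0.8], [0.2, 0.2, 0.1, 0.1, 0.3, 0.2, 0.1]],
)
ssd_hits = detect_labels(
    ModelConfig("ssd", names, min_confidence=0.6),
    [[0, 1, 0.7, 0.1, 0.2, 0.5, 0.6], [0, 0, 0.4, 0.0, 0.0, 0.2, 0.2]],
)
```

Keeping the model-specific parsing behind a lookup table means that supporting another model family would only require one more parser function and a new `kind` value in the user's configuration.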
See the progress of the student project on AI-based auto-tags (mostly completed): https://community.kde.org/GSoc/2023/StatusReports/QuocHungTran# Gilles Caulier
Hi, With the next digiKam 8.3.0 release, the auto-tags assignment feature has been implemented without using a cloud service. The processing is done in the core application and delegated to neural network models stored on the computer. For more details about the auto-tags assignment feature, see the student work report: https://community.kde.org/GSoc/2023/StatusReports/QuocHungTran#Add_Automatic_Tags_Assignment_Tools_and_Improve_Face_Recognition_Engine_for_digiKam Best regards Gilles Caulier