Bug 365202 - External MySQL database faces update segfault [patch]
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Database-Faces
Version: 5.0.0
Platform: Other Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-07-07 15:21 UTC by Evert Vorster
Modified: 2020-08-26 03:46 UTC

See Also:
Latest Commit:
Version Fixed In: 5.1.0
Sentry Crash Report:


Attachments
Backtrace of the fault (15.99 KB, text/plain)
2016-07-07 15:23 UTC, Evert Vorster
A different segfault on updating the faces database (33.65 KB, text/plain)
2016-07-07 15:26 UTC, Evert Vorster
Segfault with full debugging symbols (18.69 KB, text/plain)
2016-07-08 01:02 UTC, Evert Vorster
full debug, only faces detect (25.19 KB, text/plain)
2016-07-08 01:07 UTC, Evert Vorster
segfault just updating the faces database (32.60 KB, text/plain)
2016-07-08 01:13 UTC, Evert Vorster
Valgrind log of a segfault (191.59 KB, text/plain)
2016-07-08 15:35 UTC, Evert Vorster
Konsole output around the error. (3.65 KB, text/plain)
2016-07-08 15:36 UTC, Evert Vorster
add mutex to prevent non re-entrency in OpenCV API (980 bytes, patch)
2016-07-09 20:16 UTC, caulier.gilles
Segfault, with the patch applied on detect faces only (14.48 KB, text/plain)
2016-07-10 05:18 UTC, Evert Vorster
valgrind with mysql gone away error (213.02 KB, text/plain)
2016-07-10 06:28 UTC, Evert Vorster
valgrind log (202.17 KB, text/plain)
2016-07-10 09:02 UTC, Evert Vorster
Console output of crash after removing movie files. (10.00 KB, text/plain)
2016-07-10 09:54 UTC, Evert Vorster
facedetector.patch (469 bytes, patch)
2016-07-10 10:53 UTC, Maik Qualmann
gdb crash log with facedetect patch loaded. (28.79 KB, text/plain)
2016-07-10 13:13 UTC, Evert Vorster
Segfault with facedetect patch applied, running on SQlite db (17.76 KB, text/plain)
2016-07-10 15:30 UTC, Evert Vorster
The list of files from OpenCV (24.03 KB, text/plain)
2016-07-10 17:22 UTC, Evert Vorster
patch to use face module from opencv_contrib instead from digiKam core (3.52 KB, patch)
2016-07-10 19:07 UTC, caulier.gilles
face31.patch (55.87 KB, patch)
2016-07-10 20:28 UTC, Maik Qualmann
Compile error Gilles patch to use external face module (3.68 KB, text/plain)
2016-07-11 09:56 UTC, Evert Vorster
currentFace31.patch (27.05 KB, patch)
2016-07-12 16:30 UTC, Maik Qualmann

Description Evert Vorster 2016-07-07 15:21:10 UTC
This looks very similar to bug 262596, but that bug is from 2011 and is marked as fixed, and I am getting segfaults with a build compiled from today's git sources.

I was trying out digikam's faces detect on the git master from a couple days back, and noticed that the application is crash-happy when detecting/tagging faces. So, I pulled from git today, and retried the operation. 

So, to test, I just opened digikam, pointed it at a directory of photos and let the face detect do its thing.


Reproducible: Always

Steps to Reproduce:
1. Open digikam
2. Select "faces" icon
3. Select scan database for face. 
4. Select a subset of the photo database
5. Set it to detect faces only, not try to recognize them. 
6. Let it run... 


Actual Results:  
The program segfaults after running for a few minutes. 

Expected Results:  
All the faces to be detected without a segfault. 

I am using three MariaDB databases for DigiKam, rather than the internal sqlite.
Comment 1 Evert Vorster 2016-07-07 15:23:46 UTC
Created attachment 99925 [details]
Backtrace of the fault
Comment 2 Evert Vorster 2016-07-07 15:26:23 UTC
Created attachment 99926 [details]
A different segfault on updating the faces database

I have a feeling that the faces detect, and updating the faces database may be related. 
With this log I was just sorting through the unknown faces, with no face detect running.
Comment 3 caulier.gilles 2016-07-07 15:35:12 UTC
Which OpenCV version do you use? Look in the Components Info dialog for details.

Did you use the multicore CPU option to detect faces?

Gilles Caulier
Comment 4 caulier.gilles 2016-07-07 15:39:22 UTC
The backtrace does not include debug symbols. Please recompile the whole digiKam source code with debug symbols, using the cmake option "-DCMAKE_BUILD_TYPE=debug".

Gilles Caulier
Comment 5 Evert Vorster 2016-07-07 15:41:29 UTC
Rebuilding takes a while.
In the meantime, I found the option for using multiple CPU, and on my machine it is turned off. 

I'll go recompile with debug options.
Comment 6 Evert Vorster 2016-07-07 18:15:34 UTC
rebuilding the debug version is unfortunately taking longer than expected. It ate up all my space on a 4gb /tmp... and it took two tries to figure that out. 

It's building now, on a much bigger drive. 

However, I can reliably crash digikam. It "feels" like it's a race condition. Paradoxically, when my system is really busy and slow to respond, it's much more difficult to crash digikam.

I'll have a proper report in the morning. ;)
Comment 7 Evert Vorster 2016-07-08 01:02:59 UTC
Created attachment 99938 [details]
Segfault with full debugging symbols

Finally managed to get this full debug build.
I'll add a few more crash logs...
Comment 8 Evert Vorster 2016-07-08 01:07:14 UTC
Created attachment 99939 [details]
full debug, only faces detect
Comment 9 Evert Vorster 2016-07-08 01:13:38 UTC
Created attachment 99940 [details]
segfault just updating the faces database
Comment 10 caulier.gilles 2016-07-08 04:41:51 UTC
So you use MySQL as the database, right?

And which OpenCV library version do you use? 2.x or 3.x?

Gilles Caulier
Comment 11 Evert Vorster 2016-07-08 05:18:26 UTC
The database is MariaDB, which is the same as MySQL, I believe. 

I use OpenCV 3.x
Comment 12 caulier.gilles 2016-07-08 08:37:18 UTC
If the crash is a race condition between the FaceManagement code and the database interface, perhaps valgrind can help identify when memory is corrupted. Typically it is in the face histogram computation thread.

Note: face detection does not crash using the SQLite database here. I processed 20K images in 10 minutes on my main Linux computer.

Gilles Caulier
Comment 13 Evert Vorster 2016-07-08 15:35:16 UTC
Created attachment 99946 [details]
Valgrind log of a segfault

Finally, a Valgrind log of the error
Comment 14 Evert Vorster 2016-07-08 15:36:35 UTC
Created attachment 99947 [details]
Konsole output around the error.

Noticed this error about halfway down the attached file. For some reason MySQL was "not available"
Comment 15 caulier.gilles 2016-07-08 15:38:41 UTC
This sounds like the source of the problem:

==16753== Conditional jump or move depends on uninitialised value(s)
==16753==    at 0x973DFAA: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x9744026: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0xC22C56C: cv::parallel_for_(cv::Range const&, cv::ParallelLoopBody const&, double) (in /usr/lib/libopencv_core.so.3.1.0)
==16753==    by 0x9745AE0: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x9750104: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x973AE2D: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x974EF74: cv::CascadeClassifier::detectMultiScale(cv::_InputArray const&, std::vector<cv::Rect_<int>, std::allocator<cv::Rect_<int> > >&, double, int, int, cv::Size_<int>, cv::Size_<int>) (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x5000F2D: FacesEngine::OpenCVFaceDetector::cascadeResult(cv::Mat const&, FacesEngine::Cascade&, FacesEngine::DetectObjectParameters const&) const (opencvfacedetector.cpp:469)
==16753==    by 0x5001F5A: FacesEngine::OpenCVFaceDetector::verifyFace(cv::Mat const&, QRect const&) const (opencvfacedetector.cpp:536)
==16753==    by 0x500360C: FacesEngine::OpenCVFaceDetector::detectFaces(cv::Mat const&, cv::Size_<int> const&) (opencvfacedetector.cpp:767)
==16753==    by 0x5019A31: FacesEngine::FaceDetector::detectFaces(QImage const&, QSize const&) (facedetector.cpp:160)
==16753==    by 0x5324A5A: Digikam::DetectionWorker::process(QExplicitlySharedDataPointer<Digikam::FacePipelineExtendedPackage>) (facepipeline.cpp:483)
==16753==  Uninitialised value was created by a heap allocation
==16753==    at 0x4C2ABD0: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16753==    by 0xC079B21: cv::fastMalloc(unsigned long) (in /usr/lib/libopencv_core.so.3.1.0)
==16753==    by 0xC1CD3D0: cv::Mat::create(int, int const*, int) (in /usr/lib/libopencv_core.so.3.1.0)
==16753==    by 0x974BB2E: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x9745609: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x9750104: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x973AE2D: ??? (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x974EF74: cv::CascadeClassifier::detectMultiScale(cv::_InputArray const&, std::vector<cv::Rect_<int>, std::allocator<cv::Rect_<int> > >&, double, int, int, cv::Size_<int>, cv::Size_<int>) (in /usr/lib/libopencv_objdetect.so.3.1.0)
==16753==    by 0x5000F2D: FacesEngine::OpenCVFaceDetector::cascadeResult(cv::Mat const&, FacesEngine::Cascade&, FacesEngine::DetectObjectParameters const&) const (opencvfacedetector.cpp:469)
==16753==    by 0x5001F5A: FacesEngine::OpenCVFaceDetector::verifyFace(cv::Mat const&, QRect const&) const (opencvfacedetector.cpp:536)
==16753==    by 0x500360C: FacesEngine::OpenCVFaceDetector::detectFaces(cv::Mat const&, cv::Size_<int> const&) (opencvfacedetector.cpp:767)
==16753==    by 0x5019A31: FacesEngine::FaceDetector::detectFaces(QImage const&, QSize const&) (facedetector.cpp:160)
Comment 16 Evert Vorster 2016-07-08 15:43:06 UTC
I'm sorry, I can't read that log as clearly as you do. I know how to produce them, and then hopefully someone smarter than me can pick at it and hopefully update the git with a fix. 

Can you put it in layman's terms, please? What happens next?
Comment 17 caulier.gilles 2016-07-09 20:16:06 UTC
Created attachment 99970 [details]
add mutex to prevent non re-entrency in OpenCV API

Try my little patch to see if it fixes the problem.
Gilles Caulier
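
For readers following the thread, the idea behind a patch like this can be sketched as follows. This is a minimal illustration in plain C++, not the contents of attachment 99970: ScratchDetector and GuardedDetector are hypothetical names, and the scratch buffer only stands in for whatever internal state makes a call unsafe to enter from two threads at once.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical detector that reuses an internal scratch buffer, which makes
// it unsafe to enter from two threads at the same time.
class ScratchDetector
{
public:
    // Returns how many values are strictly above the threshold.
    int countAbove(const std::vector<int>& pixels, int threshold)
    {
        m_scratch.assign(pixels.begin(), pixels.end());   // shared state
        // This invariant could break if another thread rewrote the buffer
        // in the middle of the call.
        assert(m_scratch.size() == pixels.size());

        int hits = 0;
        for (int v : m_scratch)
        {
            if (v > threshold)
                ++hits;
        }
        return hits;
    }

private:
    std::vector<int> m_scratch;
};

// Wrapper that serializes every call with a mutex, so only one thread at a
// time can be inside the non-reentrant code.
class GuardedDetector
{
public:
    int countAbove(const std::vector<int>& pixels, int threshold)
    {
        std::lock_guard<std::mutex> lock(m_mutex);   // released on return
        return m_detector.countAbove(pixels, threshold);
    }

private:
    std::mutex      m_mutex;
    ScratchDetector m_detector;
};
```

Worker threads would share one GuardedDetector instance, so only one of them is inside countAbove() at a time. The cost is that the guarded call is serialized, which trades throughput for safety.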
Comment 18 Evert Vorster 2016-07-10 03:47:06 UTC
Hey there... 
Unfortunately, that patch did not stop the crashing. 
With just scanning for faces, it got about 6% through my collection of 30,000 photos ( 160Gb ) before crashing again. This time it's not a segfault, but the program halts with this weird error:
digikam.dbengine: Prepare failed!
digikam.dbengine: Failure executing query:
 "SELECT orientation FROM ImagaInformation WHERE imageid=?;"
Error messages: "QMYSQL3: Unable to prepare statement" "MySQL server has gone away" 2006 2
Bound values: ()

One thing worth noticing though, is that I also tried to scan and recognize faces, and it zipped through that entire collection in seconds, and did not find a single face. 

Just recognizing faces on their own seems to work fine, however. 

I'll start building digikam with debugging enabled, and see if I can track this one down.
Comment 19 Evert Vorster 2016-07-10 05:18:23 UTC
Created attachment 99977 [details]
Segfault, with the patch applied on detect faces only

Once I rebuilt digikam with debugging enabled, I ran it through gdb with detecting faces only. 
It failed fairly quickly with a segfault, and I have attached the log here. 

Right now I am running the same through Valgrind, but it might be a while before it crashes. 

-Evert-
Comment 20 Evert Vorster 2016-07-10 06:28:37 UTC
Created attachment 99978 [details]
valgrind with mysql gone away error

There seem to be two distinct modes of failure here. One is the "mysql gone away" error, and the other is a segfault.
I just attached a valgrind for the mysql gone away error, I'll re-run and see if I am "lucky" enough to catch the program segfaulting in valgrind.
Comment 21 Maik Qualmann 2016-07-10 06:34:52 UTC
> digikam.dbengine: Failure executing query:
>  "SELECT orientation FROM ImagaInformation WHERE imageid=?;"

Did you copy this string, or might it contain a typo?

ImagaInformation => ImageInformation
The query string is correct in the digiKam source code.

Maik
Comment 22 Evert Vorster 2016-07-10 06:37:15 UTC
Re: Maik Qualmann, 
I did indeed make a typo. 

Thanks for pointing that out. 
-Evert-
Comment 23 Evert Vorster 2016-07-10 06:39:10 UTC
Ah, Maik, the same error is repeated in the valgrind log that I just attached, and since I did not type that one, it definitely does not contain typos. 

Thanks for looking into this!
-Evert-
Comment 24 Maik Qualmann 2016-07-10 08:29:42 UTC
I see in the valgrind log that QuickTime video is included. I think it's possible that Exiv2 crashes here. Can you remove these video files for a test of the face recognition?

Maik
Comment 25 Evert Vorster 2016-07-10 09:02:05 UTC
Created attachment 99981 [details]
valgrind log

OK, Thanks for looking into this. 
I have moved all the .mov and .mp4 from that directory. 

There are still some cr2, png, and tiff mixed in, but exiv should not have a problem with those. 
Right now digikam is running through my collection, and let's see if it crashes.
Comment 26 Evert Vorster 2016-07-10 09:54:58 UTC
Created attachment 99984 [details]
Console output of crash after removing movie files.

So, I have removed all the movie files from the directory, and let the face scanner do its thing.
Now I'll try again through gdb, and also remove anything that is not a .jpg, just to be sure.
Comment 27 Maik Qualmann 2016-07-10 10:26:36 UTC
Can you check if these images are broken?

/data/DigiKam/Photos/Home/2015/02-28 Oval Track/Evert Pictures/IMG_7377.JPG
/data/DigiKam/Photos/Home/2015/02-28 Oval Track/Evert Pictures/IMG_7378.JPG
/data/DigiKam/Photos/Home/2015/02-28 Oval Track/Evert Pictures/IMG_7381.JPG
/data/DigiKam/Photos/Home/2015/02-28 Oval Track/Evert Pictures/IMG_7383.JPG

Maik
Comment 28 caulier.gilles 2016-07-10 10:32:53 UTC
Maik,

Even if images are broken, or if video/raw files crash Exiv2, all Exiv2 API calls are wrapped in exception handlers, and the failure must be handled by the high-level implementation in digiKam.

Typically, if a file breaks Exiv2, we must be able to take the right path so that the workflow in the face detection threads is not broken.

It is known that Exiv2 versions before 0.25 are very sensitive to video files, for example, but it is better now with the latest 0.25 stable release.

Evert,

Just to be sure, which Exiv2 version do you use exactly?

Gilles Caulier
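
The pattern described above, catching a metadata failure at a low level and letting the caller carry on with the next file, might look roughly like this. It is a sketch under assumptions: Metadata, MetadataReader, readMetadataSafely and countReadable are hypothetical names, and the reader callback stands in for a real Exiv2 call, which is not shown.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result of reading metadata from one file.
struct Metadata
{
    int orientation = 0;
};

// Hypothetical callback type standing in for a metadata library call that
// may throw on broken or unsupported files.
using MetadataReader = std::function<Metadata(const std::string&)>;

// Calls the reader and turns any exception into a "false" return value, so
// one bad file cannot take down the calling worker.
bool readMetadataSafely(const std::string& path,
                        const MetadataReader& reader,
                        Metadata& out)
{
    assert(reader && "reader must be callable");

    try
    {
        out = reader(path);
        return true;
    }
    catch (const std::exception&)
    {
        return false;   // known failure: skip this file
    }
    catch (...)
    {
        return false;   // unknown failure: skip this file as well
    }
}

// High-level loop: keeps going after a failure and reports how many files
// could be read.
int countReadable(const std::vector<std::string>& paths,
                  const MetadataReader& reader)
{
    int ok = 0;
    for (const std::string& path : paths)
    {
        Metadata meta;
        if (readMetadataSafely(path, reader, meta))
            ++ok;
    }
    return ok;
}
```

The point of the split is that the low-level wrapper only reports success or failure, while the decision about what to do with a failed file stays in the high-level loop.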
Comment 29 Evert Vorster 2016-07-10 10:34:54 UTC
Maik, thanks again for looking into this. 

I am currently doing a valgrind log hoping to catch another segfault. It's just so terribly slow!
On every try I get about 3 - 6% through the database before digikam quits. Like I mentioned before, there seem to be two distinct issues, one where the database goes away, and one where digikam segfaults.

I have applied the small patch from Gilles,  but it does not seem to have made a difference to either error. 

I have to re-iterate that this is scanning through the database doing faces detect only. If I do faces detect and recognize, the process completes very quickly, and no faces are detected. 

The four pictures listed above open fine with Gwenview.
Comment 30 Evert Vorster 2016-07-10 10:38:41 UTC
Gilles:
extra/exiv2 0.25-3 
and libkexiv2 from git, r782.6c196e4-1
Comment 31 caulier.gilles 2016-07-10 10:41:18 UTC
My patch is probably in the right direction, but certainly not at the right level in the source code.

Note that we touch uninitialized data during the OpenCV API call. It sounds like a non-reentrancy problem somewhere.

Remember that the face fingerprints are computed in a separate thread during detection. It is more complex when multiple cores are used. Perhaps the patch must be applied at some top-level call in the facedetector class.

Gilles Caulier
Comment 32 caulier.gilles 2016-07-10 10:41:57 UTC
Exiv2 is fine.

We don't use libkexiv2 anymore. The implementation is now in digiKam core, to reduce the puzzle.

Gilles Caulier
Comment 33 caulier.gilles 2016-07-10 10:43:55 UTC
There are some tests to do, if possible.

1/ Use OpenCV2 instead of OpenCV3. There is a flag to turn off in the digiKam cmake configuration script before compiling.
2/ Use the SQLite database instead of MySQL, to see if the crash is reproducible.
3/ In all cases, single core and multicore must both be tested to validate.

Gilles Caulier
Comment 34 Maik Qualmann 2016-07-10 10:53:25 UTC
Created attachment 99986 [details]
facedetector.patch

Please test also this patch.

Maik
Comment 35 Evert Vorster 2016-07-10 11:07:15 UTC
I am currently running digikam through valgrind. Once that crashes, I will install Maik's patch only, and see if I can reproduce the issue. 

I will notify the maintainer of digikam-git that libexiv2 is no longer a requirement for digikam. 

I can turn off the external database for testing purposes, but having the database external is very much a desired feature for me. 

Unfortunately, downgrading to opencv 2 is not an option on this machine, there is too much of my other software that depend on it.
Comment 36 caulier.gilles 2016-07-10 12:29:07 UTC
> I will notify the maintainer of digikam-git that libexiv2 is no longer a requirement for digikam.

Not libexiv2, but libkexiv2.

For dependency details, look here:

https://quickgit.kde.org/?p=digikam-software-compilation.git&a=blob&f=DEPENDENCIES

Gilles Caulier
Comment 37 caulier.gilles 2016-07-10 12:31:06 UTC
Maik,

Your facedetector.patch must be applied to git/master in all cases...

Gilles Caulier
Comment 38 Evert Vorster 2016-07-10 13:13:58 UTC
Created attachment 99987 [details]
gdb crash log with facedetect patch loaded.

Unfortunately, it seems that the facedetect patch did not stop the segfaults. 

I ran it through gdb, as that is quite a bit faster than valgrind. This crash was reproduced by just scanning for faces, no recognition, and not surfing around in the detected faces tags. I found that just changing the tag associated to a picture will segfault digikam, in fact, that happens quite a lot. 

I will now go disable the external mysql, and see if I can make it crash.
Comment 39 caulier.gilles 2016-07-10 13:16:38 UTC
>I found that just changing the tag associated to a picture will segfault digikam

Do you mean a simple tag on an image, or a face tag? The two differ in the way data is processed in the background.

Gilles Caulier
Comment 40 Evert Vorster 2016-07-10 13:20:12 UTC
I was just updating some of the face tags that the recognizer got wrong, without scanning for new faces, and this rather quickly segfaults digikam. 

I am now starting up digikam with the sqlite db, and of course it has to scan through all my photos first, which takes a while.
Comment 41 Evert Vorster 2016-07-10 15:30:17 UTC
Created attachment 99988 [details]
Segfault with facedetect patch applied, running on SQlite db

Looks like it's not the database backend, then, as it segfaults in exactly the same way when running internal sqlite instead of MySQL. 

I suppose the only thing left to do now would be to get it to segfault while running valgrind, right?
Comment 42 caulier.gilles 2016-07-10 15:55:19 UTC
Here with SQLite, the crash is not reproducible. I compiled OpenCV3 myself and uninstalled OpenCV2 beforehand, to avoid mixing binaries of the library.

Using valgrind reduces execution speed, as you have seen. The program is executed in a kind of "VM" which checks all memory allocation and use. If the crash is due to a race condition, as I suspect, you will not be able to reproduce it under valgrind. This is why I wanted to add a mutex in the critical section of the OpenCV call code, to be sure that the method is not called more than once at a time by separate threads.

But as you say, the crash is also reproducible just when tagging faces. Perhaps the problem is in OpenCV as well... I don't know...

This is why a check with OpenCV2 would be interesting to do.

Gilles Caulier
Comment 43 caulier.gilles 2016-07-10 16:04:42 UTC
Do you have opencv3-contrib installed? This one includes the face detection algorithm.

Typically, for 5.0.0 I included this opencv-face module directly in digiKam core, as the switch from OpenCV 2 to 3 is still in transition.

If the code backported a few months ago has a bug, I tried to update it as well, without success.

So the idea is to require opencv contrib at configuration time through the digiKam cmake script, remove the opencv-face module from digiKam core, and use the system-based module instead.

Gilles Caulier
Comment 44 Evert Vorster 2016-07-10 17:22:49 UTC
Created attachment 99990 [details]
The list of files from OpenCV

I am on Arch linux, we don't have contrib. The closest thing is self compile from git. :)
I have attached the file list of the OpenCV I have installed on my system. It's version 3.1.0, and looks like it includes face detection. 

What is weird about this bug is that it does do quite a few face detections before it fails, but not always at the same point.
Comment 45 caulier.gilles 2016-07-10 19:07:39 UTC
Created attachment 99991 [details]
patch to use face module from opencv_contrib instead from digiKam core

Evert,

New patch to drop the face module from digiKam core and use the latest one from opencv_contrib.
OpenCV3 needs to be compiled with the OpenCV_Contrib module, of course...

Maik,

I have serious doubts about the right way to implement this new pure virtual method:

void predict(InputArray src, Ptr<PredictCollector> collector) const = 0;

... defined in face.hpp and to be coded in facerec_borrowed.cpp

Gilles Caulier
Comment 46 caulier.gilles 2016-07-10 19:17:14 UTC
With my patch, detection processes my whole collection of 2271 images using the SQLite database.

Yes, it's not very large, but it's a test collection on my virtual machine.

So far, everything works fine. No crash, faces are detected (15%)

Gilles Caulier
Comment 47 Maik Qualmann 2016-07-10 20:00:45 UTC
Gilles,

I already created a patch for the new face recognition a few months ago. But I see a slight change in the ABI compared to the current code from GitHub. The minimum OpenCV version would be 3.1. I will upload the patch here for testing.

Maik
Comment 48 Maik Qualmann 2016-07-10 20:28:45 UTC
Created attachment 99992 [details]
face31.patch

New face module from OpenCV 3.1, but not the latest version.

Maik
Comment 49 caulier.gilles 2016-07-10 20:33:09 UTC
Evert,

The face scan with my patch has finished with 244 faces found and no crash...

Now I will test with a MySQL internal database.

Gilles Caulier
Comment 50 caulier.gilles 2016-07-10 21:00:33 UTC
Evert,

With my patch and a remote MySQL server, the face scan has started over the same collection and no crash has appeared so far... Wait and see...

Gilles Caulier
Comment 51 caulier.gilles 2016-07-10 21:02:59 UTC
Screenshot of face scan with remote mysql in action :

https://www.flickr.com/photos/digikam/28121515122/in/dateposted-public/

Gilles Caulier
Comment 52 Maik Qualmann 2016-07-10 21:05:52 UTC
My OpenCV packages do not use TBB (libtbb2). Yours do.
It could be related to this:

http://code.opencv.org/issues/4489

Maik
Comment 53 caulier.gilles 2016-07-10 21:24:01 UTC
Evert,

The face scan with the MySQL database server completed without a crash. I used a single core to compute face detection.

I will now try to use all CPU cores.

Gilles Caulier
Comment 54 Evert Vorster 2016-07-11 04:37:18 UTC
Gilles,
 I found the contrib version of OpenCV, but the network link I am behind is terribly slow, and I have been unable to download and compile it. I tried your patch anyways, but digikam absolutely requires the contrib package of opencv. 

Right now I am trying Maik's patch, as that allows me to compile against the version of OpenCV that is included in standard Arch. 

In a couple of weeks I will be home, with faster and more reliable internet, and then I will try your patch again.
Comment 55 Maik Qualmann 2016-07-11 04:44:20 UTC
The cause is the libtbb2 support compiled into OpenCV. Gilles, look at Evert's crash logs; our distributions have libtbb2 disabled in OpenCV. See my Comment 52.

Maik
Comment 56 Evert Vorster 2016-07-11 07:57:06 UTC
Now we are getting somewhere. 

I finally managed to compile a version of opencv that has tbb disabled.

With Maik's patch installed I was able to scan through 25% of my library before I hit the bug of "MySQL server has gone away", so definitely have one issue nailed down to opencv's use of tbb. 
I can try again with internal SQlite, and see if I can scan through my collection completely. 

Since this version of opencv I have now is the contrib version, I will try again to build with Gilles' patch. 

Looking back over the patches, which, if any, are recommended for me?
Do I file a bug report at OpenCV for the crashes in TBB? (I did a bit of reading about it, and it appears that the memory errors valgrind reports occur because TBB uses its own scheduler, which can confuse valgrind.)
Comment 57 Evert Vorster 2016-07-11 09:56:09 UTC
Created attachment 100002 [details]
Compile error Gilles patch to use external face module
Comment 58 Evert Vorster 2016-07-11 09:59:40 UTC
OK, so now DigiKam crashes with both internal and external SQL, at approximately 30% through my collection. I'll start a new bug for that. 
I get a compile error with Gilles' patch, and Maik's seems to head off the error with TBB if it is disabled in opencv.
I'll compile digikam without any patch, and run it against my collection with both databases to confirm that Maik's patch is needed in conjunction with a TBB free opencv.
Comment 59 caulier.gilles 2016-07-11 10:16:30 UTC
Note: with my patch, multi-core enabled, and a MySQL server, no crash. Everything works fine.

I didn't see the TBB dependency in OpenCV. I will check whether mine has this module enabled.

Gilles Caulier
Comment 60 Evert Vorster 2016-07-11 12:32:56 UTC
Hi there.. 

OK, so, I recompiled digikam with no patch, and ran the face detection against my picture library. 
The face detect would usually fail anywhere from 1 to 4% through the library. 

Ever since I recompiled opencv to not use TBB, digikam scans much further into the library, regardless of whether a patch is applied or not. So I think we are at the root of this issue.

I will open a bug report with opencv, as the bug is in their handling of TBB.

I will open another bug for the failures I am seeing now, just to keep the troubleshooting that was done separate and not confuse the issue. Thank you very much for your help!
Comment 61 Evert Vorster 2016-07-11 13:37:12 UTC
Small correction, it seems my source tree was not cleaned as thoroughly as I thought it was, and so I appear to have built digikam with all the patches applied when I thought it was not. 
My apologies. I am now rebuilding vanilla digikam, to see if it still fails with TBB disabled in OpenCV.
Comment 62 Evert Vorster 2016-07-11 16:49:21 UTC
Just tested with the vanilla code, and digikam does indeed crash earlier than with the patch. I am now installing just Maik's patch, and re-running it.
Comment 63 Evert Vorster 2016-07-12 05:49:15 UTC
Update on the status. 
I am now compiling digikam with all the patches except for Gilles' patch to use the face module from opencv_contrib instead of the one from digiKam core. That patch causes a compile error. I think it's because the ABI version of the opencv that I have is slightly different.

With just Maik's face31.patch installed I was able to scan through my collection with about 4 tries. I ended up with about 4000 faces identified out of 32000 photos. 
I am now going to try again with all the patches installed and see if they address different issues.
Comment 64 caulier.gilles 2016-07-12 05:59:50 UTC
Evert,

My patch uses the OpenCV 3.1 tarball. OpenCV-contrib comes from GitHub as well (current implementation). As far as I know, there is no released OpenCV-contrib tarball. This is very problematic, as the API/ABI versions can be hell to follow. This is why I included the face module code in digiKam core instead of using OpenCV-contrib. It sounds like my first choice was not too bad after all.

As I already said somewhere, sometime, OpenCV is another big puzzle...

I can update the face module code in digiKam core, as I now know what I need to change for it to compile fine with OpenCV 3.1.

Gilles Caulier
Comment 65 caulier.gilles 2016-07-12 06:06:03 UTC
As I said previously, on my VM I processed the whole face scan of my collection with a MySQL server + multicore CPU support. No crash.

About TBB support in the OpenCV 3.1 that I compiled: I don't set anything to configure OpenCV, except passing the path to the OpenCV-Contrib modules source code to include the Face module as well. That's all.

As far as I can see, TBB is disabled by default, so I suspect there is no TBB here.

Gilles Caulier
Comment 66 Maik Qualmann 2016-07-12 16:30:47 UTC
Created attachment 100038 [details]
currentFace31.patch

New patch, face module updated to the latest version of OpenCV 3.1.
Please test with this patch.

Maik
Comment 67 Evert Vorster 2016-07-12 16:59:43 UTC
Thanks for the updated patch. It's building now. 

My latest test was with all the patches installed, and it scanned all the way through my collection.  There are a few smaller bugs where the program does not work as expected that could very well be due to the ABI change, so I'll test all of them again.
Comment 68 Evert Vorster 2016-07-13 16:10:11 UTC
Maik, your patch works beautifully. 
With your and Gilles' patches installed, I am unable to make digikam segfault.
This is running with external MySQL db, and updating the unknown pictures as they come up.. 

Pretty solid. Unfortunately the actual face recognition is pretty poor, with only a 10% successful recognition. ie: recognizes a face that I have provided hundreds of examples for. 

Will this become part of the standard digikam?
Comment 69 caulier.gilles 2016-07-13 16:23:05 UTC
Evert,

Please do not close this report; we must apply the patches to git/master before marking it as resolved.

Q: Is your opencv still compiled with Intel TBB support?

The recognition algorithm needs to be improved, that's true. We currently have a student working on automatic eye detection and correction, which will come as an extension to the face engine in digiKam. If that project is completed, we will assign the recognition improvement next summer (not before, as it will be part of GSoC).

Maik,

What exactly is the summary for this report regarding which patches to apply to git/master?

Gilles Caulier
Comment 70 Maik Qualmann 2016-07-13 16:33:05 UTC
We must set the minimum version of OpenCV3 to 3.1. OpenCV2 should compile, but it was last tested a few months ago.

Maik
Comment 71 caulier.gilles 2016-07-13 16:36:12 UTC
Maik, well, apply the patches.

MXE is still on OpenCV 2.4.x.
MacPorts is already on OpenCV 3.1.x.

So I can check OpenCV 2 once the patches are applied.

Gilles
Comment 72 Maik Qualmann 2016-07-13 17:19:31 UTC
Git commit 8cdfcc52f402b44378fe8e6a9b7961585e17340e by Maik Qualmann.
Committed on 13/07/2016 at 17:17.
Pushed by mqualmann into branch 'master'.

apply patch #99970 to add mutex to prevent non re-entrency in OpenCV API

M  +6    -0    libs/facesengine/detection/opencvfacedetector.cpp

http://commits.kde.org/digikam/8cdfcc52f402b44378fe8e6a9b7961585e17340e
Comment 73 Maik Qualmann 2016-07-13 17:24:14 UTC
Git commit 3e31dad1c6c10bb6da6a269a007eee5cc209e412 by Maik Qualmann.
Committed on 13/07/2016 at 17:22.
Pushed by mqualmann into branch 'master'.

apply patch #99986 to check for a valid QImage

M  +5    -0    libs/facesengine/facedetector.cpp

http://commits.kde.org/digikam/3e31dad1c6c10bb6da6a269a007eee5cc209e412
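
A guard of the kind this commit message describes could look like the sketch below. It does not reproduce patch #99986: Image and Rect are hypothetical stand-ins for QImage and QRect, and detectFaces here is a placeholder that returns one whole-image rectangle instead of running a real detector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal image type, one byte per pixel (grayscale).
struct Image
{
    int width  = 0;
    int height = 0;
    std::vector<unsigned char> pixels;

    // True when there is nothing usable to scan.
    bool isNull() const
    {
        return width <= 0 || height <= 0 || pixels.empty();
    }
};

// Hypothetical rectangle type for detection results.
struct Rect
{
    int x;
    int y;
    int w;
    int h;
};

// Placeholder detector: returns no result for an invalid image, otherwise
// one rectangle covering the whole image.
std::vector<Rect> detectFaces(const Image& image)
{
    if (image.isNull())
        return {};   // bail out before any pixel data is touched

    // Past the guard, the buffer is expected to match the dimensions.
    assert(image.pixels.size() ==
           static_cast<std::size_t>(image.width) * image.height);

    return { Rect{0, 0, image.width, image.height} };
}
```

Returning an empty result for an invalid image lets the caller treat "failed to load" the same way as "no faces found", so the worker can simply move on to the next file.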
Comment 74 Maik Qualmann 2016-07-13 17:30:20 UTC
Git commit 88123604ccac3cdda4557273f1b280d6772adc31 by Maik Qualmann.
Committed on 13/07/2016 at 17:27.
Pushed by mqualmann into branch 'master'.

apply patch #100038 to update openCV3 face modul to the current version

M  +1    -0    libs/facesengine/CMakeLists.txt
M  +8    -24   libs/facesengine/opencv3-face/eigen_faces.cpp
M  +17   -5    libs/facesengine/opencv3-face/face.hpp
M  +0    -9    libs/facesengine/opencv3-face/face_basic.hpp
M  +13   -0    libs/facesengine/opencv3-face/facerec.cpp
M  +2    -5    libs/facesengine/opencv3-face/facerec.hpp
M  +8    -25   libs/facesengine/opencv3-face/fisher_faces.cpp
M  +8    -25   libs/facesengine/opencv3-face/lbph_faces.cpp
M  +5    -2    libs/facesengine/opencv3-face/precomp.hpp
A  +114  -0    libs/facesengine/opencv3-face/predict_collector.cpp     [License: Unknown license]  *
A  +127  -0    libs/facesengine/opencv3-face/predict_collector.hpp     [License: Unknown license]  *
M  +38   -0    libs/facesengine/recognition-opencv-lbph/facerec_borrowed.cpp
M  +9    -0    libs/facesengine/recognition-opencv-lbph/facerec_borrowed.h

The files marked with a * at the end have a non valid license. Please read: http://techbase.kde.org/Policies/Licensing_Policy and use the headers which are listed at that page.


http://commits.kde.org/digikam/88123604ccac3cdda4557273f1b280d6772adc31
Comment 75 Maik Qualmann 2016-07-13 17:38:33 UTC
Git commit 78f21055e816ae1dc219185b7497f3c1299629e3 by Maik Qualmann.
Committed on 13/07/2016 at 17:37.
Pushed by mqualmann into branch 'master'.

set minimum openCV3 version to 3.1.0

M  +3    -3    CMakeLists.txt

http://commits.kde.org/digikam/78f21055e816ae1dc219185b7497f3c1299629e3
Comment 76 caulier.gilles 2016-07-13 20:18:36 UTC
No problem compiling with OpenCV2.

I am closing this report now.

Gilles Caulier