Bug 430996 - Yolo v3 causes application to crash with 32 bits version
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Faces-Recognition
Version: 7.2.0
Platform: Microsoft Windows
Importance: NOR crash
Target Milestone: ---
Assignee: Digikam Developers
Reported: 2020-12-31 04:05 UTC by Robert Golden
Modified: 2021-01-06 04:16 UTC
Version Fixed In: 7.2.0


Attachments
Log file (225.92 KB, text/plain)
2020-12-31 18:58 UTC, Robert Golden
digicam debug crash (90.67 KB, text/plain)
2020-12-31 19:11 UTC, Robert Golden
Task Manager (2.55 MB, image/png)
2020-12-31 19:13 UTC, Robert Golden
Working set (3.57 MB, image/png)
2021-01-02 03:02 UTC, Robert Golden

Description Robert Golden 2020-12-31 04:05:34 UTC
SUMMARY

I just installed 7.2.0 beta 2 on Windows 10. It seems that running the Yolo v3 algorithm causes the application to crash. I have "Disable OpenCL" turned on and have tried with and without "Work on all processor cores".

STEPS TO REPRODUCE
1. 
2. 
3. 

OBSERVED RESULT


EXPECTED RESULT


SOFTWARE/OS VERSIONS
Windows: 10 Pro 19041.685
KDE Frameworks 5.77.0
Qt 5.15.2 (built against 5.15.2)


ADDITIONAL INFORMATION


I need to figure out how to get some logs or a stack trace. Once I do that I'll update the bug.
Comment 1 Maik Qualmann 2020-12-31 05:51:28 UTC
We need the debug output from the DebugView program, as described here:

https://www.digikam.org/contribute/

Maik
Comment 2 Maik Qualmann 2020-12-31 06:07:35 UTC
Another question, if you installed digiKam-7.2.0, did you restart digiKam after downloading the large binary files?

Maik
Comment 3 Robert Golden 2020-12-31 18:58:40 UTC
Created attachment 134422 [details]
Log file

Looks like this is the problem:
digikam.dimg: Failed to allocate chunk of memory of size 8992000
Comment 4 Robert Golden 2020-12-31 19:11:10 UTC
Created attachment 134423 [details]
digicam debug crash

So I uninstalled the non-debug version and installed the debug version, and now it won't start.

Originally it showed:
[10828] digikam.facedb: "C:/Users/family/AppData/Roaming/digikam/facesengine/openface_nn4.small2.v1.t7"
[10828] QWaitCondition: Destroyed while threads are still waiting
[10828] QEventDispatcherWin32::wakeUp: Failed to post a message (Invalid window handle.)

and crashed

So I removed the C:/Users/family/AppData/Roaming/digikam/facesengine directory. I was then prompted to download again, but it crashed with: 
[15620] QThread::start: Failed to create thread (Not enough memory resources are available to process this command.)

My system shows 8GB available.
Comment 5 Robert Golden 2020-12-31 19:13:01 UTC
Created attachment 134424 [details]
Task Manager
Comment 6 Robert Golden 2020-12-31 19:25:17 UTC
Re #2: I did not originally. I have since due to the crashes.

I tried just scanning "F:/Pictures/2014-03-24/" and it crashed again but with a different image.


[18032] digikam.dimg: "F:/Pictures/2014-03-24/IMG_2663.JPG" : "JPEG" file identified
[18032] digikam.dimg: Failed to allocate chunk of memory of size 8992000

So I don't think it's related to the image.
Comment 7 Maik Qualmann 2021-01-01 09:43:33 UTC
Both digiKam and Qt get memory allocation errors, although there is probably still enough memory available. I googled a little and found a German Microsoft article from December 2020. It describes a current problem in Windows 10 where applications that request a lot of memory quickly sometimes fail. The article is about compiling program code. The current workaround is to set the swap file on all drives to a fixed size of about 1.5 x the RAM.

German article link:
https://docs.microsoft.com/de-de/troubleshoot/windows-client/performance/slow-page-file-growth-memory-allocation-errors

Maik
Comment 8 caulier.gilles 2021-01-01 10:26:13 UTC
Hi Maik, and happy new year of course (:-)))

In my office we use some Windows 10 machines to run fast acquisition measurement systems, as you know (mostly Linux systems of course).

Huge memory allocations are performed. They work most of the time, but if an allocation fails, we retry up to 3 times in a simple loop to work around the faulty Windows memory management in this case. Note that the systems have at least 32 GB of RAM, of which 8 GB is allocated for measurements.
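
The retry loop described above could look roughly like the following. This is only a sketch in Python, not the code used in the measurement systems or in digiKam; the function name `allocate_with_retry`, the short pause `delay_s` between attempts, and the injectable `allocator` parameter are assumptions made for illustration.

```python
import time


def allocate_with_retry(size_bytes, attempts=3, delay_s=0.1, allocator=bytearray):
    """Try an allocation up to `attempts` times before giving up.

    `allocator` is any callable that returns a buffer of `size_bytes`
    or raises MemoryError (hypothetical hook, handy for testing).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return allocator(size_bytes)
        except MemoryError as err:
            last_error = err
            # Pause briefly before the next attempt, but not after the last one.
            if attempt < attempts:
                time.sleep(delay_s)
    raise MemoryError(
        f"allocation of {size_bytes} bytes failed after {attempts} attempts"
    ) from last_error


# Example: request a buffer the size of the failing digikam.dimg chunk.
buffer = allocate_with_retry(8992000)
print(len(buffer))
```

A loop like this only helps for allocations made by the application's own code; allocations made inside a library are not retried unless the library does so itself.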

This kind of system dysfunction appears only with a few Windows 10 revisions (updates). It never happens under Linux.

Gilles
Comment 9 Maik Qualmann 2021-01-01 10:47:20 UTC
Hi Gilles,

Happy new year to you too. This is interesting. We could also request memory 3 times in a loop, but functions within Qt such as QThread will not do this. So maybe we should try the solution from the article:

The English article:
https://docs.microsoft.com/en-us/troubleshoot/windows-client/performance/slow-page-file-growth-memory-allocation-errors

I don't need to explain what I think about Windows and its memory management? ((:-))

Maik
Comment 10 Robert Golden 2021-01-01 18:29:03 UTC
I set my swap to 24 GB - 32 GB:

Without OpenCV:
[6128] QLayout: Attempting to add QLayout "" to QWidget "", which already has a layout
[6128] digikam.dimg: Failed to allocate chunk of memory of size 8992000

Enabling OpenCV:

[1244] digikam.general: parent is null
[1244] digikam.facesengine: cv::Exception: OpenCV(4.4.0) /mnt/data/GIT/7.x/project/bundles/mxe/temp.build/ext_opencv/ext_opencv-prefix/src/ext_opencv/modules/core/src/alloc.cpp:73: error: (-4:Insufficient memory) Failed to allocate 18874368 bytes in function 'OutOfMemoryError'
[1244] 
[1244] digikam.facesengine: cv::Exception: OpenCV(4.4.0) /mnt/data/GIT/7.x/project/bundles/mxe/temp.build/ext_opencv/ext_opencv-prefix/src/ext_opencv/modules/core/src/alloc.cpp:73: error: (-4:Insufficient memory) Failed to allocate 18874368 bytes in function 'OutOfMemoryError'
[1244] 
[1244] digikam.facesengine: cv::Exception: OpenCV(4.4.0) /mnt/data/GIT/7.x/project/bundles/mxe/temp.build/ext_opencv/ext_opencv-prefix/src/ext_opencv/modules/core/src/alloc.cpp:73: error: (-4:Insufficient memory) Failed to allocate 26976000 bytes in function 'OutOfMemoryError'
[1244] 
[1244] digikam.dimg: Failed to allocate chunk of memory of size 8992000


None of my other programs are acting up. Could it be that the Qt allocator is not handling some allocation return code correctly? Is it using some kind of custom memory allocator?
Comment 11 Maik Qualmann 2021-01-01 20:29:23 UTC
Sorry, something is wrong with your system. OpenCV, Qt and digiKam all have problems allocating memory. The memory sizes are between 8 and 25 MB, so very small. I have no idea what causes this. Is some strange anti-virus program blocking digiKam?
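
To check the sizes mentioned above, the failed allocations can be pulled out of the debug output and converted to MiB. The following is a small sketch, assuming the two message shapes that appear in this report; the helper names `failed_allocation_sizes` and `to_mib` are made up for illustration, and the OpenCV lines in `sample` are shortened copies of the ones in comment 10.

```python
import re

# Matches both message shapes seen in this report:
#   "Failed to allocate chunk of memory of size 8992000"  (digikam.dimg)
#   "Failed to allocate 18874368 bytes"                   (OpenCV)
ALLOC_FAILURE = re.compile(
    r"Failed to allocate (?:chunk of memory of size )?(\d+)"
)


def failed_allocation_sizes(log_text):
    """Return the size in bytes of every failed allocation in the log."""
    return [int(m.group(1)) for m in ALLOC_FAILURE.finditer(log_text)]


def to_mib(size_bytes):
    """Convert bytes to MiB (1 MiB = 1024 * 1024 bytes)."""
    return size_bytes / (1024 * 1024)


# Shortened log lines taken from comment 10.
sample = """\
[1244] digikam.facesengine: cv::Exception: ... Failed to allocate 18874368 bytes in function 'OutOfMemoryError'
[1244] digikam.facesengine: cv::Exception: ... Failed to allocate 26976000 bytes in function 'OutOfMemoryError'
[1244] digikam.dimg: Failed to allocate chunk of memory of size 8992000
"""

for size in failed_allocation_sizes(sample):
    print(f"{size} bytes = {to_mib(size):.2f} MiB")
```

None of the requests exceeds 26 MiB, which is consistent with the point that the individual allocations are small.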

Maik
Comment 12 Robert Golden 2021-01-02 03:01:53 UTC
I'm not convinced it's an OS issue. It's a fresh install of Windows as of 3 months ago. No anti-virus or any of that kind of software.

digiKam is a 32-bit application. So that means it gets 2 GiB of RAM by default. Is digiKam configured with /LARGEADDRESSAWARE?
https://docs.microsoft.com/en-us/cpp/build/reference/largeaddressaware-handle-large-addresses?view=msvc-160&viewFallbackFrom=vs-2019

When idle digiKam has a ~424 MiB working set. When I start face recognition it jumps to 1.4 GiB. I'm not sure if it's actually even started running the algorithm though. The 1.4 limit feels pretty close to the 2 GiB limit of 32-bit processes.
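
The arithmetic behind this suspicion can be sketched as follows. The 2 GiB figure is the default user-mode address space of a 32-bit Windows process, and 4 GiB is what a 32-bit process linked with /LARGEADDRESSAWARE can reach on 64-bit Windows. Note that the working set measures resident memory, not reserved address space, so treating it as address-space usage is a simplifying assumption and the real headroom is usually smaller. The helper name `headroom_bytes` is hypothetical.

```python
GIB = 1024 ** 3

# Default user-mode address space of a 32-bit process on Windows.
DEFAULT_32BIT_LIMIT = 2 * GIB
# Limit for a 32-bit process built with /LARGEADDRESSAWARE on 64-bit Windows.
LARGE_ADDRESS_AWARE_LIMIT = 4 * GIB


def headroom_bytes(working_set_gib, limit_bytes):
    """Remaining address space if the working set were the only usage."""
    return limit_bytes - int(working_set_gib * GIB)


# 1.4 GiB is the working set reported once face recognition starts.
default_headroom = headroom_bytes(1.4, DEFAULT_32BIT_LIMIT)
laa_headroom = headroom_bytes(1.4, LARGE_ADDRESS_AWARE_LIMIT)

print(f"default limit:       {default_headroom / GIB:.2f} GiB left")
print(f"large address aware: {laa_headroom / GIB:.2f} GiB left")
```

Even with some address space left on paper, an allocation needs one contiguous free range, so a fragmented 32-bit address space may refuse a request of a few MiB well before the limit is reached.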

My database has about 62K files. Not sure if that has anything to do with it.
Comment 13 Robert Golden 2021-01-02 03:02:36 UTC
Created attachment 134449 [details]
Working set
Comment 14 caulier.gilles 2021-01-02 06:33:17 UTC
No, digiKam is not a 32 bits application.

We provide 64 bits and 32 bits versions. The second one must be used ONLY on old computers and is deprecated. The 64 bits version must now be used everywhere else.

Your failure to load large images is normal with 32 bits, as memory access is limited in this version. Use the 64 bits version instead, and all should be fine.

https://files.kde.org/digikam/digiKam-7.2.0-rc-20210101T142426-Win64.exe.mirrorlist

Gilles Caulier
Comment 15 caulier.gilles 2021-01-02 06:35:57 UTC
Maik, 

We seriously need to review the 32 bits versions of the bundles, which must be removed in the future to prevent this kind of dysfunction.

There is no 32 bits operating system anymore. This kind of target has become a niche.

Gilles
Comment 16 Maik Qualmann 2021-01-02 06:43:45 UTC
Hi Gilles,

Yes, definitely. I think the first step would be no more 32 bit snapshots, also AppImage. Then we'll see if anyone asks about it.
I will add a debug message at the start to show which version is currently running.

Maik
Comment 17 Maik Qualmann 2021-01-02 10:16:31 UTC
Git commit 352a22d5ed97288a3a45ac067e50b868ecbad56b by Maik Qualmann.
Committed on 02/01/2021 at 10:15.
Pushed by mqualmann into branch 'master'.

recognize wrong program and windows cpu version and show a message box

M  +14   -0    core/app/main/main.cpp
M  +15   -0    core/showfoto/main/main.cpp

https://invent.kde.org/graphics/digikam/commit/352a22d5ed97288a3a45ac067e50b868ecbad56b
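
The idea behind this commit, detecting a 32-bit program running on a 64-bit system and telling the user, can be illustrated as follows. This is not the code from the commit, which lives in the C++ `main.cpp` files listed above; it is a Python sketch in which the function names, the list of 64-bit machine names and the warning text are all assumptions. `platform.machine()` is used here only as an approximation of the operating system architecture.

```python
import platform
import struct


def process_bits():
    """Bitness of the running process, derived from its pointer size."""
    return struct.calcsize("P") * 8


def os_is_64bit(machine):
    """Guess whether a machine name denotes a 64-bit system (assumed list)."""
    return machine.lower() in {"amd64", "x86_64", "arm64", "aarch64"}


def wrong_build_warning(bits, machine):
    """Return a warning when a 32-bit build runs on a 64-bit system."""
    if bits == 32 and os_is_64bit(machine):
        return (
            "You are running the 32-bit version on a 64-bit system. "
            "Please install the 64-bit version instead."
        )
    return None


print(f"running as a {process_bits()}-bit process on {platform.machine()}")
warning = wrong_build_warning(process_bits(), platform.machine())
if warning:
    print(warning)
```

Keeping the decision in `wrong_build_warning` separate from the detection makes the rule easy to test without needing an actual 32-bit process.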
Comment 18 Robert Golden 2021-01-02 15:17:55 UTC
Ah great. When I looked in https://download.kde.org/unstable/digikam/ I downloaded digiKam-7.2.0-beta2-Win32.exe. I didn't notice the Win64 version because the filename pattern has a date in it: digiKam-7.2.0-beta2-20201227T235133-Win64.exe. I'll install the 64-bit version, try that out, and report back. Thanks!
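
Since the two installers follow slightly different naming patterns, a pattern that accepts an optional timestamp makes the 64-bit build easier to spot in a listing. The sketch below is based only on the two filenames quoted above; the regular expression and the helper `pick_64bit` are assumptions, not an official naming scheme.

```python
import re

# Accepts names with or without a build timestamp before the architecture:
#   digiKam-7.2.0-beta2-Win32.exe
#   digiKam-7.2.0-beta2-20201227T235133-Win64.exe
INSTALLER = re.compile(
    r"^digiKam-(?P<version>\d+\.\d+\.\d+)-(?P<stage>[A-Za-z0-9]+)"
    r"(?:-(?P<stamp>\d{8}T\d{6}))?-(?P<arch>Win32|Win64)\.exe$"
)


def pick_64bit(filenames):
    """Return the names that parse as Win64 installers."""
    picked = []
    for name in filenames:
        match = INSTALLER.match(name)
        if match and match.group("arch") == "Win64":
            picked.append(name)
    return picked


listing = [
    "digiKam-7.2.0-beta2-Win32.exe",
    "digiKam-7.2.0-beta2-20201227T235133-Win64.exe",
]
print(pick_64bit(listing))
```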
Comment 19 Robert Golden 2021-01-03 03:32:19 UTC
64-bit build fixed it! When initially starting face recognition the RAM usage jumps up to 2.4 GiB whereas the 32-bit version started with 1.4 GiB. I've been running face recognition most of the day and the peak working set has hit 5.4 GiB.

Does Yolo v3 support hardware acceleration? I turned off the "Disable hardware acceleration OpenCL" option, but I don't see my GPU being used (according to the Task Manager). I have an Nvidia GTX 1080. Right now I'm getting ~7 images/second on an i7-6700K @ GHz with "Work on all processor cores" enabled.

Thank you all for helping me track down the problem! :)
Comment 20 caulier.gilles 2021-01-03 09:46:35 UTC
Yolo is a data model used to detect and recognize faces. It is not an algorithm. The algorithms are in the OpenCV libraries and do not depend on the data model.

The OpenCL options are passed to OpenCV for special use cases. They also depend on the video card hardware and on its drivers (proprietary or not). From my point of view as a long-time developer, this is hell to manage. The combinations of hardware and software versions can make a mess. This also depends on the operating system, and I'm sure that under Linux the quality of these combinations is poor.

This is why the OpenCL option exists. We pass the boolean to the OpenCV libraries, which turn the acceleration code on or off. We don't have any other control.

This is why I hate hardware acceleration in general: it makes the target application unstable and very hard to stabilize and optimize.

I also use a recent NVIDIA video card on my Linux desktop, and I don't see any speed improvement while scanning a huge collection.

Unlike you, though, I have never reproduced the huge memory allocation while scanning faces. We have already received a few reports about this problem, and it's a long story to try to track it down.

Best regards and happy new year

Gilles Caulier
Comment 21 caulier.gilles 2021-01-03 09:52:46 UTC
Maik, 

It is time to decide whether we continue to support 32 bits bundles of digiKam. If we drop 32 bits, we need to plan the last version that supports it (for example, 7.2.0 could be the last one).

Compiling the Windows 32 and 64 bits versions takes 3h30 now, and the Linux 32 bits build is done on a specific VM. It takes 1h30...

I plan to set up a phabricator computer in my office to build the bundles with a crontab. For the moment everything is started manually, as the builds run on different computers.

About the macOS version, I made a VM for this task using the famous one-click project from GitHub to install macOS in VirtualBox. We also need to plan a Silicon version for the future, as Intel computers from Apple will progressively migrate to ARM. But that is for later, as not all dependencies build properly with MacPorts for the moment.

Gilles
Comment 22 caulier.gilles 2021-01-04 09:21:58 UTC
Git commit 29f929e0a750671c36b45c51768869f6eff145ff by Gilles Caulier.
Committed on 04/01/2021 at 09:19.
Pushed by cgilles into branch 'master'.

Only process Windows 64 bits installer.
Drop Windows 32 bits version.

M  +1    -1    project/bundles/mxe/01-build-mxe.sh
M  +1    -1    project/bundles/mxe/02-build-extralibs.sh
M  +1    -1    project/bundles/mxe/03-build-digikam.sh
M  +1    -1    project/bundles/mxe/04-build-installer.sh
M  +7    -7    project/bundles/mxe/README
M  +1    -1    project/bundles/mxe/common.sh
M  +2    -2    project/bundles/mxe/config.sh
M  +1    -1    project/bundles/mxe/fixqtwebkitincludes.sh
M  +1    -1    project/bundles/mxe/icon-rcc/CMakeLists.txt
M  +1    -1    project/bundles/mxe/installer/digikam.nsi
M  +1    -1    project/bundles/mxe/installer/events_functions.nsh
M  +1    -1    project/bundles/mxe/installer/process_running.nsh
M  +1    -1    project/bundles/mxe/installer/readme_page.nsh
M  +1    -1    project/bundles/mxe/installer/reboot_required.nsh
M  +2    -23   project/bundles/mxe/makeall.sh
M  +1    -1    project/bundles/mxe/rll.py
M  +1    -19   project/bundles/mxe/update.sh

https://invent.kde.org/graphics/digikam/commit/29f929e0a750671c36b45c51768869f6eff145ff
Comment 23 caulier.gilles 2021-01-04 10:07:38 UTC
Git commit fd85e5002c0a0dc24a4583c67c3f157b55d35d29 by Gilles Caulier.
Committed on 04/01/2021 at 10:07.
Pushed by cgilles into branch 'master'.

Drop 32 bits AppImage build

M  +1    -1    project/bundles/appimage/01-build-host.sh
M  +1    -1    project/bundles/appimage/02-build-extralibs.sh
M  +1    -1    project/bundles/appimage/03-build-digikam.sh
M  +1    -1    project/bundles/appimage/04-build-appimage.sh
M  +3    -3    project/bundles/appimage/README
M  +1    -1    project/bundles/appimage/common.sh
M  +1    -1    project/bundles/appimage/config.sh
M  +1    -1    project/bundles/appimage/icon-rcc/CMakeLists.txt
M  +1    -1    project/bundles/appimage/makeall.sh
M  +1    -1    project/bundles/appimage/update.sh

https://invent.kde.org/graphics/digikam/commit/fd85e5002c0a0dc24a4583c67c3f157b55d35d29
Comment 24 caulier.gilles 2021-01-06 04:16:35 UTC
Git commit 15a7b677fc283df70a6e2534ea4523a08d7cbc87 by Gilles Caulier.
Committed on 06/01/2021 at 04:14.
Pushed by cgilles into branch 'dev'.

Drop 32 bits versions from download page

M  +1    -4    data/release.yml
M  +2    -5    themes/hugo-theme-digikam/layouts/partials/downloads.html

https://invent.kde.org/websites/digikam-org/commit/15a7b677fc283df70a6e2534ea4523a08d7cbc87