Bug 426086 - Tiff file reading extremely slow
Summary: Tiff file reading extremely slow
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Plugin-DImg-TIFF (other bugs)
Version First Reported In: 7.6.0
Platform: Microsoft Windows Microsoft Windows
: NOR minor
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-02 02:38 UTC by kdeb
Modified: 2022-02-20 14:36 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed/Implemented In: 7.6.0
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description kdeb 2020-09-02 02:38:04 UTC
SUMMARY
Using current Win64 weekly. Observed while rebuilding face thumbnails in background. Assume it is a general tiff reading problem. Tiff files were saved from photoshop cc 21.2. 


STEPS TO REPRODUCE
1. Maintenance clean thumbnails which wipes face thumbnails.
2. View faces to trigger thumbnail rebuild. 
3. Observe disk usage

OBSERVED RESULT
Tiff files are read at about 4MB/s with almost 100% disk active time. Panasonic RAW files (for example) on the same drive are read at about 60MB/s with about 12% disk active time. 

Using sysinternals diskmon I noted that the tiff files were being read in 64k blocks backwards, that is the read sector numbers got lower. RAW files were read in 256k blocks with increasing sector numbers. I assume the terrible performance is due to lack of OS and/or drive read ahead caching because digikam is reading behind - lol. The tiff files are about 450MB. Don't know if digiKam needs to read the whole file if so it would take nearly 2 minutes each.

EXPECTED RESULT
Tiff reading not to be so slow.

SOFTWARE/OS VERSIONS
Windows: 10

ADDITIONAL INFORMATION
Think this might be some specific tiff file format? The files were probably initially created with DxO Photolab then edited and saved in photoshop.
Comment 1 Maik Qualmann 2020-09-02 06:07:44 UTC
I think it depends on the TIFF file. Depending on the TIFF type, we use the strip-oriented image I/O from libtiff. I could imagine that it is slower to read from disk line by line, especially under Windows.

More information about the TIFF format of your files or a sample image would be good.

Maik
Comment 2 kdeb 2020-09-02 21:35:13 UTC
Bit hard for me to tell what is going on especially when it is easy to have these files in the system disk cache which hides the problem and difficult to know it is in the cache. 

Seems to be related to additional layers saved in photoshop and possibly in particular smart object layers. I took a 20 M pixel 16 bit tiff file which was about 100MB. Loaded in photoshop, duplicated the background layer, turned that layer into a smart object and applied a simple filter then saved the file as tiff. The file size increased by about 600MB and it chokes digiKam but nothing else. I guess digiKam is slowly reading the large extra elements it doesn't need to. 

The problem files are huge. This is a tiffdump of one:
Magic: 0x4949 <little-endian> Version: 0x2a <ClassicTIFF>
Directory 0: offset 8 (0x8) next 0 (0)
SubFileType (254) LONG (4) 1<0>
ImageWidth (256) SHORT (3) 1<3109>
ImageLength (257) SHORT (3) 1<4663>
BitsPerSample (258) SHORT (3) 3<16 16 16>
Compression (259) SHORT (3) 1<8>
Photometric (262) SHORT (3) 1<2>
Make (271) ASCII (2) 10<Panasonic\0>
Model (272) ASCII (2) 6<DC-G9\0>
StripOffsets (273) LONG (4) 84<26978 985968 1957330 2933846 3906524 4887716 5874280 6864180 7862052 8865962 9875066 10882740 11884212 12894074 13906004 14919378 15934402 16947186 17960762 18976602 19997708 21021120 22044318 23069588 ...>
Orientation (274) SHORT (3) 1<1>
SamplesPerPixel (277) SHORT (3) 1<3>
RowsPerStrip (278) SHORT (3) 1<56>
StripByteCounts (279) LONG (4) 84<958990 971362 976516 972677 981192 986563 989900 997871 1003909 1009104 1007673 1001471 1009861 1011929 1013374 1015023 1012783 1013576 1015839 1021106 1023411 1023197 1025270 1028923 ...>
XResolution (282) RATIONAL (5) 1<72>
YResolution (283) RATIONAL (5) 1<72>
PlanarConfig (284) SHORT (3) 1<1>
ResolutionUnit (296) SHORT (3) 1<2>
Software (305) ASCII (2) 31<Adobe Photoshop 21.2 (Wi ...>
DateTime (306) ASCII (2) 20<2020:07:11 04:08:26\0>
700 (0x2bc) BYTE (1) 17251<0x3c 0x3f 0x78 0x70 0x61 0x63 0x6b 0x65 0x74 0x20 0x62 0x65 0x67 0x69 0x6e 0x3d 0x22 0xef 0xbb 0xbf 0x22 0x20 0x69 0x64 ...>
33723 (0x83bb) UNDEFINED (7) 31<0x1c 0x2 00 00 0x2 00 00 0x1c 0x2 0x37 00 0x8 0x32 0x30 0x32 0x30 0x30 0x32 0x30 0x39 0x1c 0x2 0x3c 00 ...>
34377 (0x8649) BYTE (1) 8058<0x38 0x42 0x49 0x4d 0x4 0x25 00 00 00 00 00 0x10 0xcc 0xf3 0xa8 0x58 0xb9 0xcd 0x3f 0xd2 0x8e 0xcd 0xc9 0x50 ...>
34665 (0x8769) LONG (4) 1<84116084>
ICC Profile (34675) UNDEFINED (7) 560<00 00 0x2 0x30 0x41 0x44 0x42 0x45 0x2 0x10 00 00 0x6d 0x6e 0x74 0x72 0x52 0x47 0x42 0x20 0x58 0x59 0x5a 0x20 ...>
37724 (0x935c) UNDEFINED (7) 502520192<0x41 0x64 0x6f 0x62 0x65 0x20 0x50 0x68 0x6f 0x74 0x6f 0x73 0x68 0x6f 0x70 0x20 0x44 0x6f 0x63 0x75 0x6d 0x65 0x6e 0x74 ...>

is the 37724 at the end a 500MB tag that tiffdump doesn't know? If so I guess that is what digikam has to be reading at 4MB/s to take so long.
Comment 3 kdeb 2020-09-02 21:50:10 UTC
More observation. Viewing one of these files in the digiKam slideshow takes about 2 minutes to load and digiKam memory usage slowly increases by about 600MB during that 2 minutes then falls back to about the original usage when the image is displayed. 

Sure looks like digiKam is reading all of this 500MB tag (for some reason backwards and slowly) into memory then discarding it.
Comment 4 kdeb 2022-02-16 13:54:25 UTC
This problem went away at some point but has returned even worse in 7.5 and the most recent 7.6. 

Here hopefully 

https://ufile.io/er79cpxm

will get you a photoshop saved16bit tiff file. I have a dozen applications on this machine which can load this file in less than 2 seconds. Digikam takes 21 seconds to view it or generate a thumbnail. Larger files with more photoshop baggage are even more ridiculously slow. A 531MB file which windows paint can load in about 3 seconds takes Digikam 142 seconds to view or to generate a thumbnail. During these long waits Digikam uses 100% of a CPU core (sometimes multiple cores presumably generating thumbnails in parallel). A stack trace of the Digikam thread taking all the CPU includes libtiff.  A couple of the other applications which load these files without problem use earlier versions of libtiff.
Comment 5 caulier.gilles 2022-02-16 14:47:54 UTC
Did you have the metadata view open in right sidebar, and especially the ExifTool tab ?
Comment 6 kdeb 2022-02-16 16:04:06 UTC
(In reply to caulier.gilles from comment #5)
> Did you have the metadata view open in right sidebar, and especially the
> ExifTool tab ?

Nope. Usually filters and I just checked and exiftool isn't running. 

I have a folder with a single big tiff in it. I right click the folder and select refresh then click on another folder and back on this one. The thumbnail for the tiff is a grey box. The process monitor at the bottom of the screen says "Thumbs" and 0%. 3-4 minutes and many billions of cpu cycles later the Thumbs process ends and I see the tiff thumbnail. 

While 'Thumbs' is running I looked at the Digikam process and there is one or more threads each of which is using 4.1% of a 12 core 24 thread 4.1GHz Ryzen CPU. The call stacks look like

ntoskrnl.exe!KeSynchronizeExecution+0x5b66
ntoskrnl.exe!KeWaitForSingleObject+0x1460
ntoskrnl.exe!KeWaitForSingleObject+0x98f
ntoskrnl.exe!KeWaitForSingleObject+0x233
ntoskrnl.exe!ExWaitForRundownProtectionRelease+0x7dd
ntoskrnl.exe!KeWaitForSingleObject+0x3a29
ntoskrnl.exe!KeSynchronizeExecution+0x3140
msvcrt.dll!putc
libtiff-5.dll!TIFFmemcmp+0x1fc8
libtiff-5.dll!TIFFmemcmp+0x39fa
libtiff-5.dll!TIFFmemcmp+0x10df
libtiff-5.dll!TIFFPredictorCleanup+0x9d
libtiff-5.dll!TIFFPrintDirectory+0x585
DImg_TIFF_Plugin.dll!ZN21DigikamTIFFDImgPlugin14DImgTIFFLoader4loadERK7QStringPN7Digikam18DImgLoaderObserverE+0x709
libdigikamcore.dll!ZN7Digikam4DImg4loadERK7QStringiPNS_18DImgLoaderObserverERKNS_12DRawDecodingE+0xa1f
libdigikamcore.dll!ZNK7Digikam16ThumbnailCreator18loadWithDImgScaledERK7QStringPNS_10IccProfileE+0xd3
libdigikamcore.dll!ZNK7Digikam16ThumbnailCreator15createThumbnailERKNS_13ThumbnailInfoERK5QRect+0x1146
libdigikamcore.dll!ZNK7Digikam16ThumbnailCreator4loadERKNS_19ThumbnailIdentifierERK5QRectb+0x105
libdigikamcore.dll!ZNK7Digikam16ThumbnailCreator4loadERKNS_19ThumbnailIdentifierE+0x2d
libdigikamcore.dll!ZN7Digikam20ThumbnailLoadingTask7executeEv+0x57a
libdigikamcore.dll!ZN7Digikam14LoadSaveThread3runEv+0x16a
libdigikamcore.dll!ZN7Digikam13DynamicThread7Private3runEv+0x30
Qt5Core.dll!ZN11QThreadPool6cancelEP9QRunnable+0x9a
Qt5Core.dll!ZN7QThread21setTerminationEnabledEb+0x103
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21


I notice that just clicking refresh starts one 4% CPU thread. Switching view to another folder and back triggers another 1 or 2 4% threads, looks like it is simultaneously trying to generate the same thumbnail.  Thumbnail generation is not *the* problem. After thumbnails are generated if I click on the thumbnail show full screen button I get 1 4% thread like 

ntoskrnl.exe!KeSynchronizeExecution+0x5b66
ntoskrnl.exe!KeWaitForSingleObject+0x1460
ntoskrnl.exe!KeWaitForSingleObject+0x98f
ntoskrnl.exe!KeWaitForSingleObject+0x233
ntoskrnl.exe!ExWaitForRundownProtectionRelease+0x7dd
ntoskrnl.exe!KeWaitForSingleObject+0x3a29
ntoskrnl.exe!KeSynchronizeExecution+0x3140
libtiff-5.dll!wcsrtombs+0x11e
libtiff-5.dll!wcsrtombs+0x13e
libtiff-5.dll!TIFFmemcmp+0x10c7
libtiff-5.dll!TIFFPredictorCleanup+0x9d
libtiff-5.dll!TIFFPrintDirectory+0x585
DImg_TIFF_Plugin.dll!ZN21DigikamTIFFDImgPlugin14DImgTIFFLoader4loadERK7QStringPN7Digikam18DImgLoaderObserverE+0x709
libdigikamcore.dll!ZN7Digikam4DImg4loadERK7QStringiPNS_18DImgLoaderObserverERKNS_12DRawDecodingE+0xa1f
libdigikamcore.dll!ZN7Digikam4DImg4loadERK7QStringPNS_18DImgLoaderObserverERKNS_12DRawDecodingE+0x17
libdigikamcore.dll!ZN7Digikam18PreviewLoadingTask7executeEv+0x777
libdigikamcore.dll!ZN7Digikam14LoadSaveThread3runEv+0x16a
libdigikamcore.dll!ZN7Digikam13DynamicThread7Private3runEv+0x30
Qt5Core.dll!ZN11QThreadPool6cancelEP9QRunnable+0x9a
Qt5Core.dll!ZN7QThread21setTerminationEnabledEb+0x103
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21

and see nothing on the screen for 3-4 minutes. 

I don't know if it is reading the tiff file backwards like the initial report, all the files are on SSD now all being read from the system cache. As I said windows paint manages to load a 500MB tiff in about 3 seconds.
Comment 7 kdeb 2022-02-18 01:03:07 UTC
Can't believe I had to find and fix this myself. 

I wondered why the stack trace contained msvcrt.dll!putc then I wondered why it contained libtiff-5.dll!TIFFPrintDirectory then I looked at the source and find one mention of TIFFPrintDirectory which is called if (DIGIKAM_DIMG_LOG_TIFF().isDebugEnabled() is true and I then find all debug logging options are set true for windows and macOS versions. 

I patched digkam.exe to nop out the  digikamSetDebugFilterRules(); call and now it takes 2-3 seconds to view or generate a thumbnail for a 500MB tiff file. Showfoto has the same problem. I don't see the point in non-debug builds logging things to nowhere.
Comment 8 Maik Qualmann 2022-02-18 07:04:18 UTC
Which digiKam Windows EXE are you using? If you use the one with "debug" in the filename it would explain it. You should definitely install the smaller installer without "debug" in the filename.

Maik
Comment 9 Maik Qualmann 2022-02-18 07:10:03 UTC
Ok, now I understand after looking at the source code. I'm very surprised that this debug rules is supposed to slow down digiKam so much.

Maik
Comment 10 caulier.gilles 2022-02-18 07:50:07 UTC
Maik,

I set this rules for Windows and MacOS, especially for Windows where setting the env. variables is more complex than under Linux.

I suppose that debug trace must pass by the Windows kernel and take age to be captured. After all it's Windows, not Linux.
Removing these rules in source code will be better. 

Gilles
Comment 11 Maik Qualmann 2022-02-18 09:27:43 UTC
Gilles,

I think we could add a "defined DEBUG" here to only have the output in the debug bundle.

https://invent.kde.org/graphics/digikam/-/blob/master/core/app/utils/digikam_debug.cpp#L81

Maik
Comment 12 kdeb 2022-02-18 12:02:03 UTC
(In reply to Maik Qualmann from comment #9)
> Ok, now I understand after looking at the source code. I'm very surprised
> that this debug rules is supposed to slow down digiKam so much.

I patched the release build of 7.5.

It is only TIFFPrintDirectory that kills performance. Seems like it is dumping the whole file a byte at a time or maybe just the huge photoshop layer tags. Assuming you got the example tiff I linked you could look at debug output and see.
Comment 13 caulier.gilles 2022-02-18 17:11:51 UTC
Maik,

The DEBUG pre-compiler directive will not work.

All codes for the bundles are always compiled with debug symbols. At last stage when bundles are packaged, we strip or not the debug symbols in binaries depending of the kind of wanted target.

Gilles
Comment 14 Maik Qualmann 2022-02-18 17:54:08 UTC
Ok, I'll create an option to turn debugging on/off in the system settings.

Maik
Comment 15 Maik Qualmann 2022-02-19 06:58:49 UTC
Git commit e6fa3ce7549cbe9013a4dd45e80cb2cb2497e7cd by Maik Qualmann.
Committed on 19/02/2022 at 06:57.
Pushed by mqualmann into branch 'qt5-maintenance'.

add option in the digiKam system settings to enable debug logging
FIXED-IN: 7.6.0

M  +2    -1    NEWS
M  +1    -1    core/app/main/main.cpp
M  +5    -10   core/app/utils/digikam_debug.cpp
M  +1    -1    core/app/utils/digikam_debug.h
M  +1    -1    core/showfoto/main/main.cpp
M  +3    -0    core/utilities/setup/system/systemsettings.cpp
M  +2    -0    core/utilities/setup/system/systemsettings.h
M  +7    -1    core/utilities/setup/system/systemsettingswidget.cpp

https://invent.kde.org/graphics/digikam/commit/e6fa3ce7549cbe9013a4dd45e80cb2cb2497e7cd
Comment 16 Maik Qualmann 2022-02-19 07:03:25 UTC
Git commit 0772bc4d53487eeb36b160b8d122d0b21be45714 by Maik Qualmann.
Committed on 19/02/2022 at 07:02.
Pushed by mqualmann into branch 'master'.

backport from qt5-maintenance to enable logging optional

M  +1    -1    core/app/main/main.cpp
M  +5    -10   core/app/utils/digikam_debug.cpp
M  +1    -1    core/app/utils/digikam_debug.h
M  +1    -1    core/showfoto/main/main.cpp
M  +3    -0    core/utilities/setup/system/systemsettings.cpp
M  +2    -0    core/utilities/setup/system/systemsettings.h
M  +7    -1    core/utilities/setup/system/systemsettingswidget.cpp

https://invent.kde.org/graphics/digikam/commit/0772bc4d53487eeb36b160b8d122d0b21be45714
Comment 17 kdeb 2022-02-19 13:00:25 UTC
Sorry guys. Guess I should have tried this sooner. 

I ran digikam with sysinternals dbgview and it logged a few hundred lines. 

TIFFPrintDirectory(tif, stdout, 0);

Is going to stdout and QT under windows somehow doesn't do stdout. Searching about QT and stdout I read a QT application will 'do' stdout when directed to a file so I ran digikam 7.5 from the command line with stdout directed to a file. In digikam I full screen viewed a 266MB tiff file produced by photoshop and quit. 

The redirected stdout file is 2.8GB. I have trouble viewing this file, probably because it only has 775 lines some of which must be 100s of MB long. 

I do see this for example

  PhotoshopLayerData: 0x41,0x64,0x6f,0x62,0x65,0x20,0x50,0x68,0x6f,0x74,0x6f,0x73,0x68,0x6f,0x70,0x20,0x44,0x6f,0x63,0x70x0,0xc2,0x0,0xc2,

So it looks like TIFFPrintDirectory(tif, stdout, 0); is dumping all or large parts of the tiff in comma separated hex.  TIFFPrintDirectory is the real problem not debug logging.
Comment 18 Maik Qualmann 2022-02-20 14:36:05 UTC
We'll see if the problem persists with the next bundle. This will have debug disabled by default unless an environment variable is set.
This code should no longer be executed in the TIFF Loader:

https://invent.kde.org/graphics/digikam/-/blob/qt5-maintenance/core/dplugins/dimg/tiff/dimgtiffloader_load.cpp#L91

Maik