Bug 195583

Summary: FILEIO : read header metadata from JPEG2000 files with Exiv2
Product: [Applications] digikam
Reporter: Richard Ash <richard>
Component: Plugin-DImg-JP2K
Assignee: Digikam Developers <digikam-bugs-null>
Status: RESOLVED FIXED
Severity: normal
CC: ahuggel, caulier.gilles, marcel.wiesweg, presse.wissen
Priority: NOR
Version: 0.10.0
Target Milestone: ---
Platform: Gentoo Packages
OS: Linux
Latest Commit:
Version Fixed In: 6.4.0
Sentry Crash Report:

Description Richard Ash 2009-06-07 18:55:10 UTC
Version:           0.10.0 (using KDE 4.2.4)
Compiler:          gcc (GCC) 4.1.2 (Gentoo 4.1.2 p1.1) 
OS:                Linux
Installed from:    Gentoo Packages

When starting digikam 0.10.0 for the first time after upgrading from 0.9.5, the progress dialogue that appears whilst it migrates settings gets almost stuck in the "Scanning images in individual albums" stage. In my case, the first few (older) albums shoot past quickly with lots of disk IO, then it seems to stop dead, using full CPU and no disk IO. After much watching and experimenting, I discovered that it always stops on the first album that contains JPEG-2000 images rather than ordinary JPEG or PNG images. There are no messages sent to the console, other than ones about malformed images/metadata (I have some files affected by old exiv2 bugs). These messages do still occasionally pop up after it is "frozen".

Investigation with lsof reveals that it is still making progress on examining images, but has now slowed down to doing between 5 and 10 JPEG-2000 images (sizes 6-12 Mpixel) per _minute_. If I am patient, then the scan does eventually make its way through my collection (18GB), but because the progress indicator can take 10 minutes or more between updates, the program looks hung.

I think this is a usability problem, because most people won't wait all afternoon for it to sort itself out (I'm still waiting...). This was never a problem for 0.9.5, which started quickly on the same collection / hardware.

I can confirm that ImageMagick's identify command on the same images does take about 10 seconds per image, which I presumed was because the entire image was being decoded for the colour stats.
Comment 1 caulier.gilles 2009-06-07 18:59:19 UTC
Try the current implementation from SVN, with the First Run assistant.

Gilles Caulier
Comment 2 Marcel Wiesweg 2009-06-08 19:35:10 UTC
You seem to have debug messages turned off, since you say you get none. Enable 50003 in kdebugdialog.
It is unusual to need so much time to scan an image. I think usually disk I/O is the limiting factor. You have already identified pictures possibly causing this.
Please try to run the exiv2 command line utility on one of these and see if it takes long as well. Then please select one representative image and provide it for testing. There may be a problem that we or Andreas from exiv2 can fix. If you have no upload space available you can send a picture to my mail account, and I can upload it to our test image collection.
Comment 3 Richard Ash 2009-06-10 22:18:09 UTC
Exactly the same issue exists in current SVN if I remove the digikam4.db file so that digikam thinks I have not run a KDE4 version before, and scans the whole collection. Note that under these conditions I don't get the first run dialogue, presumably because my digikamrc file still exists.

With debugging on (for digikam and kexiv2, although the latter isn't built from SVN and doesn't seem to do much), I get this output for each JPEG-2000 file scanned:

digikam(12860)/digikam (core) Digikam::DImg::load: "/home/ra/Documents/data/pictures/DVD3/2008-09-27-Band-Party/IMG_0490.jp2"  : JPEG2000 file identified
digikam(12860)/KEXIV2 KExiv2Iface::KExiv2::getImageDateTime: DateTime => Exif.Photo.DateTimeOriginal =>  QDateTime("Sat Sep 27 20:31:10 2008")
digikam(12860)/KEXIV2 KExiv2Iface::KExiv2::getDigitizationDateTime: DateTime (Exif digitalized):  Sat Sep 27 20:31:10 2008
digikam(12860)/KEXIV2 KExiv2Iface::KExiv2::getImageOrientation: Orientation => Exif.Image.Orientation =>  1

The first message appears as soon as the image starts to be scanned, then there is a 12 second wait before the remaining lines appear, followed almost immediately by the first line for the next image. During this time the CPU is at 100% but there is no disk I/O, so this is not a problem with the time it takes to read the image from disk.

I have uploaded IMG_0490.jp2 to 
http://richardash1981.users.sourceforge.net/IMG_0490.jp2
unfortunately it's a 15MB image (separate moan about this when I work out why). 
Running exiv2 from the command line on this image is not slow:
----
$ time exiv2 IMG_0490.jp2
File name       : IMG_0490.jp2
File size       : 14789616 Bytes
MIME type       : image/jp2
Image size      : 2856 x 4290
Camera make     : Canon
Camera model    : Canon EOS 450D
Image timestamp : 2008:09:27 20:31:10
Image number    : 
Exposure time   : 1/3 s
Aperture        : F4
Exposure bias   : 0
Flash           : No, compulsory
Flash bias      : 0 EV
Focal length    : 23.0 mm
Subject distance: 0
ISO speed       : 800
Exposure mode   : Aperture priority
Metering mode   : Multi-segment
Macro mode      : Off
Image quality   : RAW
Exif Resolution : 2856 x 4290
White balance   : Auto
Thumbnail       : image/jpeg, 3234 Bytes
Copyright       : 
Exif comment    : 

real    0m0.017s
user    0m0.010s
sys     0m0.002s
----
Oddly, identify from ImageMagick is very slow:
----
$ time identify IMG_0490.jp2
IMG_0490.jp2 JP2 2856x4290 2856x4290+0+0 16-bit DirectClass 14.1mb 15.290u 0:21

real    0m20.136s
user    0m14.546s
sys     0m0.753s
----
which probably isn't relevant.

I then tried renaming my digikamrc file and all databases out of the way, so I get a clean first run of the new SVN build. I selected as pictures folder the folder I normally use, i.e. /home/ra/Documents/data/pictures/. After accepting the defaults on all subsequent screens (except I chose to do corrections on opening raw files, because I'm curious), it goes into the scan process on the folder tree, with the same results: when scanning JPEG files (sizes 1MB - 6MB) it scans so fast I can't read the text scrolling past, but when it reaches JPEG-2000 it slows down to one image every 10 seconds with full CPU and almost no disk I/O.
Comment 4 caulier.gilles 2009-06-10 23:17:03 UTC
Marcel,

Why is parsing JPEG2000 files so slow? I can reproduce it here too.

Typically, parsing JP2 with Exiv2 must be fast (I suppose that scanning for the DB uses Exiv2... I hope). There is no need to decode the whole image.

Also, in DImg::JP2KLoader, I can see a rule to load only the JP2 header to get image information without loading the full image...

So I'm a little bit confused here...

Gilles
Comment 5 caulier.gilles 2009-06-11 07:41:40 UTC
Marcel,

Why is the DImg JPEG2000 image loader used at first run?

digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/2-RGBA-test-8bits.jp2"                                                                                                       
digikam(18787)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/3-RGBA_test-16bits.jp2"  : JPEG2000 file identified                                                                                                      
digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/3-RGBA_test-16bits.jp2"                                                                                                      
digikam(18787)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/7-16bits+alpha-photo.jp2"  : JPEG2000 file identified                                                                                                    
digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/7-16bits+alpha-photo.jp2"                                                                                                    
digikam(18787)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/grayscale.jp2"  : JPEG2000 file identified                                                                                                               
digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/grayscale.jp2"                                                                                                               
digikam(18787)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/IMG_2924_cap_one-0.jp2"  : JPEG2000 file identified                                                                                                      
digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/IMG_2924_cap_one-0.jp2"                                                                                                      
digikam(18787)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/index256.jp2"  : JPEG2000 file identified                                                                                                                
digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/index256.jp2"                                                                                                                
digikam(18787)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/Otoe_Relief8.jp2"  : JPEG2000 file identified                                                                                                            
digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/Otoe_Relief8.jp2"                                                                                                            
digikam(18787)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/photo-dimage-0.jp2"  : JPEG2000 file identified                                                                                                          
digikam(18787)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/photo-dimage-0.jp2"  

... And it takes a while to parse the images ...

Gilles Caulier
Comment 6 caulier.gilles 2009-06-11 08:03:56 UTC
Marcel,

After adding some debug messages to the JPEG2000 loader:

digikam(19367) Digikam::JP2KLoader::load: Init JPEG2000 API                                                                   
digikam(19367) Digikam::JP2KLoader::load: Check JPEG2000 Color space                                                          
digikam(19367) Digikam::JP2KLoader::load: Scan JPEG2000 geometry                                                              
digikam(19367) Digikam::JP2KLoader::load: Scan JPEG2000 depth                                                                 
digikam(19367)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/grayscale.jp2"                                                                                                               
digikam(19367)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/IMG_2924_cap_one-0.jp2"  : JPEG2000 file identified                                                                                                      
digikam(19367) Digikam::JP2KLoader::load: Init JPEG2000 API                                                                   
digikam(19367) Digikam::JP2KLoader::load: Check JPEG2000 Color space                                                          
digikam(19367) Digikam::JP2KLoader::load: Scan JPEG2000 geometry                                                              
digikam(19367) Digikam::JP2KLoader::load: Scan JPEG2000 depth                                                                 
digikam(19367)/digikam (core) Digikam::ImageScanner::addImage: Adding new item "/mnt/data/photos/Albums/Test Pictures/JPEG2000/IMG_2924_cap_one-0.jp2"                                                                                                      
digikam(19367)/digikam (core) Digikam::DImg::load: "/mnt/data/photos/Albums/Test Pictures/JPEG2000/index256.jp2"  : JPEG2000 file identified                                                                                                                
digikam(19367) Digikam::JP2KLoader::load: Init JPEG2000 API                                                                   
digikam(19367) Digikam::JP2KLoader::load: Check JPEG2000 Color space                                                          
digikam(19367) Digikam::JP2KLoader::load: Scan JPEG2000 geometry                                                              
digikam(19367) Digikam::JP2KLoader::load: Scan JPEG2000 depth  

... I can see that Init JPEG2000 API takes a while...


If I continue my investigation, I can see:

digikam(19606) Digikam::JP2KLoader::load: Init JPEG2000 API
digikam(19606) Digikam::JP2KLoader::load: Init JPEG2000 API: open stream
digikam(19606) Digikam::JP2KLoader::load: Init JPEG2000 API: decode
digikam(19606) Digikam::JP2KLoader::load: Init JPEG2000 API: close stream

... that "Init JPEG2000 API: decode" is the problem... here:

http://lxr.kde.org/source/extragear/graphics/digikam/libs/dimg/loaders/jp2kloader.cpp#120

Gilles
Comment 7 Marcel Wiesweg 2009-06-11 11:45:26 UTC
It seems the whole image is decoded?
The idea is that if LoadImageData is not set, only information about image size and colorspace is extracted by reading the file header. With libjpeg, libpng, libtiff this is possible without decoding the whole image. It's usually very fast; I assume the file header is read by exiv2 anyway, so there is no further disk access.
I can't find the API docs for libjasper, but is this not possible with this library?
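To illustrate what "reading only the file header" means for this format, here is a minimal sketch that walks the JP2 box structure of an in-memory buffer and reads the "ihdr" (image header) box, without touching the codestream. It is not digiKam, Exiv2, or libjasper code: the names Jp2Info, readBE32, findBox and parseJp2Header are invented for illustration, and the sketch assumes the "jp2h" box lies completely inside the buffer handed in (for example the first few kilobytes of the file).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Basic properties stored in the JP2 "ihdr" box (hypothetical helper type).
struct Jp2Info
{
    std::uint32_t width            = 0;
    std::uint32_t height           = 0;
    std::uint16_t components       = 0;
    std::uint8_t  bitsPerComponent = 0; // 0 when the components differ in depth
};

// Read a big-endian 32-bit value; the caller guarantees 4 readable bytes.
static std::uint32_t readBE32(const std::vector<std::uint8_t>& d, std::size_t pos)
{
    assert(pos + 4 <= d.size());
    return (std::uint32_t(d[pos]) << 24) | (std::uint32_t(d[pos + 1]) << 16) |
           (std::uint32_t(d[pos + 2]) << 8) | std::uint32_t(d[pos + 3]);
}

// Find the first box of the given type inside [begin, end) and report the
// byte range of its payload. Returns false if no complete box of that type
// is found.
static bool findBox(const std::vector<std::uint8_t>& d, std::size_t begin,
                    std::size_t end, std::uint32_t type,
                    std::size_t& payloadBegin, std::size_t& payloadEnd)
{
    std::size_t pos = begin;
    while (end - pos >= 8)
    {
        std::uint64_t len           = readBE32(d, pos);
        const std::uint32_t boxType = readBE32(d, pos + 4);
        std::size_t header          = 8;

        if (len == 1) // a 64-bit extended length follows the box type
        {
            if (end - pos < 16) return false;
            len = (std::uint64_t(readBE32(d, pos + 8)) << 32) | readBE32(d, pos + 12);
            header = 16;
        }
        else if (len == 0) // box runs to the end of the enclosing range
        {
            len = end - pos;
        }

        if (len < header || len > end - pos) return false; // malformed or truncated

        if (boxType == type)
        {
            payloadBegin = pos + header;
            payloadEnd   = pos + static_cast<std::size_t>(len);
            return true;
        }
        pos += static_cast<std::size_t>(len);
    }
    return false;
}

// Extract size, component count and bit depth from the header boxes only.
bool parseJp2Header(const std::vector<std::uint8_t>& data, Jp2Info& out)
{
    static const std::uint8_t signature[12] = { 0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50,
                                                0x20, 0x20, 0x0D, 0x0A, 0x87, 0x0A };
    if (data.size() < 12 || !std::equal(signature, signature + 12, data.begin()))
        return false;

    std::size_t hb = 0, he = 0, ib = 0, ie = 0;
    if (!findBox(data, 12, data.size(), 0x6A703268u /* "jp2h" */, hb, he)) return false;
    if (!findBox(data, hb, he, 0x69686472u /* "ihdr" */, ib, ie)) return false;
    if (ie - ib < 14) return false; // ihdr payload is 14 bytes

    out.height     = readBE32(data, ib);
    out.width      = readBE32(data, ib + 4);
    out.components = std::uint16_t((data[ib + 8] << 8) | data[ib + 9]);

    const std::uint8_t bpc = data[ib + 10];
    // Low 7 bits hold (depth - 1); 0xFF means the depth varies per component.
    out.bitsPerComponent = (bpc == 0xFF) ? 0 : std::uint8_t((bpc & 0x7F) + 1);
    return true;
}
```

A caller would read the start of the file into a vector and pass it to parseJp2Header; colour space would additionally require the "colr" box, which this sketch does not read.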
Comment 8 caulier.gilles 2009-06-11 12:04:57 UTC
Marcel,

It sounds like it's not possible with Jasper (weird).

Look in the jasper tool code: there is a command line to see JP2 file properties. It uses the decode function too...

ImageMagick also uses this function when identifying an image.

Anyway, the ultimate solution is to always use Exiv2 to scan images: this will decrease start-up time...

Currently, Exiv2 can only get two common image properties outside Exif, IPTC, and XMP: image width and image height. Andreas has planned to add more properties such as color depth, color space, compression ratio, etc... when they are available (depending on the file format, of course).

Andreas, please correct me if I'm wrong here...

Gilles
Comment 9 Marcel Wiesweg 2009-06-11 12:24:33 UTC
We currently read width, height, color space and color depth from the information provided by the image loaders, and use DImg to calculate the file hash.

(I have a callgrind profile here that I took when importing pictures. This profile gives me headaches (some methods take 186% of the time??), but DImg::load for JPEGs only takes 0.29%. This is CPU, without disk access, but I assume, as exiv2 will read the file anyway, it is in the OS's disk access cache and disk reading is no limiting factor.)

To fix the profile at hand, I think, we must return from the jp2k loader's code early if LoadImageData is not set. For color space and bit depth, if it cannot be retrieved, we must return (wrong) standard values. It is currently not possible to get this information from exiv2 as I understand.
Btw, I think there is the same problem in the QImage loader.
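The early return proposed above could look roughly like the following sketch. Only the flag name LoadImageData comes from this discussion; LoadImageInfo, ScanResult, loadSketch and the placeholder values (8 bit, RGB, unknown size) are assumptions made for illustration and are not the actual jp2kloader.cpp code. The expensive decode is passed in as a callback so that the example stays self-contained.

```cpp
#include <cassert>
#include <functional>

// Hypothetical load flags; only the name LoadImageData appears in the thread.
enum LoadFlag
{
    LoadImageInfo = 1,
    LoadImageData = 2
};

// Hypothetical result type. The defaults are the "(wrong) standard values"
// returned when the real ones cannot be determined without decoding.
struct ScanResult
{
    int  width        = -1;    // -1 means unknown
    int  height       = -1;
    int  bitDepth     = 8;     // placeholder, may be wrong
    bool isRgb        = true;  // placeholder, may be wrong
    bool pixelsLoaded = false;
};

// Run the expensive full decode only when pixel data was requested.
ScanResult loadSketch(int flags, const std::function<ScanResult()>& fullDecode)
{
    if (!(flags & LoadImageData))
    {
        // Scanning only: skip the decoder and report placeholder values.
        return ScanResult();
    }

    assert(static_cast<bool>(fullDecode)); // a decoder is required from here on
    ScanResult result   = fullDecode();
    result.pixelsLoaded = true;
    return result;
}
```

With this shape, a collection scan that passes only LoadImageInfo never reaches the decoder, at the cost of storing placeholder colour information for such files.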
Comment 10 Andreas Huggel 2009-06-11 14:00:40 UTC
> Andreas has planned to add more properties such as
> color depth, color space, compression ratio, etc... when they are available
> (depending on the file format, of course).

You're referring to http://dev.exiv2.org/issues/show/505 I presume. It is still quite a long way until we get there.

Exiv2 typically only reads what it needs (metadata only, no image data). I had a quick look at the code to read JP2 images just now, there might be potential for improvement if Exiv2 parsing is slow.

Andreas
Comment 11 caulier.gilles 2009-06-11 14:16:46 UTC
Andreas,

No, as Marcel said, in digiKam we use internal image loaders to get image information outside XMP, Exif, IPTC.

We don't use Exiv2 for this task yet. This is why digiKam >= 0.10.0 is slow at first run. There is no optimization to do in Exiv2 for parsing JP2.

Using Exiv2 here will speed up the first run.

Gilles Caulier
Comment 12 Marcel Wiesweg 2009-12-07 19:16:06 UTC
SVN commit 1059937 by mwiesweg:

Disable scanning of images with libjasper.
This means that width, height, and color format will not be available.
At least width and height are available from exiv2, but Image::pixelWidth()
and pixelHeight() are not used in libkexiv2.
So for now, the dimensions are only available when there is Exif information.

CCBUG: 215458
CCBUG: 195583

 M  +17 -0     jp2kloader.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1059937
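The fallback described in this commit message (container dimensions when the metadata library exposes them, otherwise Exif) can be sketched as below. This is an illustration only, not libkexiv2 or digiKam code: Size and resolveSize are hypothetical names, the Exif data is modelled as a plain map, and the use of the Exif.Photo.PixelXDimension / PixelYDimension keys as the Exif source is an assumption.

```cpp
#include <map>
#include <string>

// Hypothetical size type; 0 means "not known".
struct Size
{
    long width  = 0;
    long height = 0;
};

// Prefer the dimensions read from the image container; fall back to the
// Exif pixel dimension tags. Returns false when neither source is usable.
bool resolveSize(const Size& fromContainer,
                 const std::map<std::string, long>& exif,
                 Size& out)
{
    if (fromContainer.width > 0 && fromContainer.height > 0)
    {
        out = fromContainer;
        return true;
    }

    const auto w = exif.find("Exif.Photo.PixelXDimension");
    const auto h = exif.find("Exif.Photo.PixelYDimension");
    if (w != exif.end() && h != exif.end() && w->second > 0 && h->second > 0)
    {
        out.width  = w->second;
        out.height = h->second;
        return true;
    }

    return false; // dimensions stay unknown, as for JP2 files without Exif
}
```

After this commit, the first branch is effectively unavailable for JPEG2000 files, which is why files without Exif end up with no dimensions.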
Comment 13 Marcel Wiesweg 2009-12-07 19:19:10 UTC
*** Bug 215458 has been marked as a duplicate of this bug. ***
Comment 14 Marcel Wiesweg 2009-12-07 19:20:11 UTC
Copying from 215458:

 ------- Comment #3 From Marcel Wiesweg 2009-11-27 18:16:08 -------
Gilles, part of the problem is that we load the full JPEG2000 image when
scanning, because libjasper does not support just reading the header. My
proposed short-term solution is to _not_ load the whole image, instead do not
scan color format and rely on exiv2 to retrieve width and height for JPEG2k
images.

Long term we should evaluate OpenJPEG. I don't know about the status of this
project, but I read Krita's Cyrille Berger is developing a loader based on this
library.

 ------- Comment #4 From Gilles Caulier 2009-11-27 18:22:34 -------
I agree not to load the full JP2 image during scanning.

OpenJPEG is not the only way to solve this problem. In Exiv2, a new image
info container will be created in the future to get physical information about
the image.

JP2 is already supported by Exiv2. I know... I wrote this image format
support (:=)))

Another very important point about Exiv2 is the availability of image
width and height info. These properties can already be extracted using Exiv2
(look at the Exiv2::Image methods). Libkexiv2 must be adapted as well. It's
simple to do...

About OpenJPEG, I know that the library is slower than Jasper, but as the
latter sounds unmaintained, OpenJPEG must be the right way to go for the
far future...
Comment 15 caulier.gilles 2011-12-13 13:34:04 UTC
Richard,

Is this report still valid with the 2.x series?

Gilles Caulier
Comment 16 caulier.gilles 2014-08-28 12:51:43 UTC
I have renamed this report's title, and I point to the relevant source code for a future change:

https://projects.kde.org/projects/extragear/graphics/digikam/repository/revisions/master/entry/libs/dimg/loaders/jp2kloader.cpp#L100

Gilles Caulier
Comment 17 caulier.gilles 2019-10-05 23:27:19 UTC
Exiv2 has supported JPEG2000 metadata for a while.
I am closing this report now.

Gilles Caulier