Bug 274251 - Nepomuk cannot index specific mp3 file and cause cpu 100%
Status: RESOLVED UPSTREAM
Alias: None
Product: nepomuk
Classification: Unmaintained
Component: general
Version: unspecified
Platform: Arch Linux (OS: Linux)
Importance: NOR normal
Target Milestone: ---
Assignee: Sebastian Trueg
Reported: 2011-05-27 06:28 UTC by Weng Xuetian
Modified: 2012-03-10 15:11 UTC

Attachments
Bug mp3 file. (999.02 KB, application/octet-stream)
2011-05-27 06:29 UTC, Weng Xuetian
A walkaround for this bug (1.84 KB, patch)
2011-11-25 20:47 UTC, Weng Xuetian

Description Weng Xuetian 2011-05-27 06:28:36 UTC
Version:           unspecified (using Devel) 
OS:                Linux

In my testing, certain mp3 files on my system trigger this problem. I have attached one such file for testing.

Reproducible: Always

Steps to Reproduce:
1. Create a folder
2. Place attached mp3 in it
3. Configure Nepomuk to index only that folder (to keep the test quick)

Actual Results:  
Nepomukindexer takes 100% CPU and never finishes.

Expected Results:  
The file is indexed without problems.
Comment 1 Weng Xuetian 2011-05-27 06:29:43 UTC
Created attachment 60371
Bug mp3 file.

Actually, it works well in 4.6.3.
Comment 2 André Fettouhi 2011-09-09 18:33:55 UTC
I can confirm this bug on Arch Linux 64 bit running KDE 4.7.1. While indexing mp3 files, nepomukindexer spikes the CPU at random mp3 files. Looking at top, I suddenly see 3 or more nepomukindexer entries taking 99 or 100% CPU.
Comment 3 Bearsh 2011-10-04 08:15:17 UTC
The same also happens on my Kubuntu installation, KDE SC 4.7.1, 64bit
Comment 4 Weng Xuetian 2011-10-04 08:55:26 UTC
The buggy file in #2 still triggers this.

Here is the backtrace I get when I attach to nepomukindexer:

#0  0x00007f78866f2b17 in access () from /lib/libc.so.6
#1  0x00007f78874a74b1 in KStandardDirs::exists (fullPath=...)
    at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/kernel/kstandarddirs.cpp:591
#2  0x00007f78874a77d8 in KStandardDirs::realPath (dirname=<value optimized out>)
    at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/kernel/kstandarddirs.cpp:942
#3  0x00007f78874ac5c0 in KStandardDirs::KStandardDirsPrivate::resourceDirs (this=0x246b8c0, type=0x7f78875a6309 "xdgdata-mime", 
    subdirForRestrictions=...) at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/kernel/kstandarddirs.cpp:1165
#4  0x00007f78874ae44c in KStandardDirs::findAllResources (this=<value optimized out>, type=0x7f78875a6309 "xdgdata-mime", filter=..., options=..., 
    relList=...) at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/kernel/kstandarddirs.cpp:872
#5  0x00007f78874ae682 in KStandardDirs::findAllResources (this=<value optimized out>, type=<value optimized out>, filter=<value optimized out>, 
    options=<value optimized out>) at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/kernel/kstandarddirs.cpp:897
#6  0x00007f78874c6f57 in KMimeTypeRepository::checkMimeTypes (this=<value optimized out>)
    at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/services/kmimetyperepository.cpp:80
#7  0x00007f78874c71c6 in KMimeTypeRepository::checkEssentialMimeTypes (this=0x25f0aa0)
    at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/services/kmimetyperepository.cpp:617
#8  0x00007f78874c262e in KMimeType::findByUrlHelper (_url=..., mode=0, is_local_file=true, device=0x7fff347977b0, accuracy=0x0)
    at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/services/kmimetype.cpp:184
#9  0x00007f78874c370b in KMimeType::findByUrl (url=..., mode=0, is_local_file=<value optimized out>, fast_mode=<value optimized out>, 
    accuracy=<value optimized out>) at /chakra/desktop-testing/kdelibs/src/kdelibs-4.7.1/kdecore/services/kmimetype.cpp:323
#10 0x00000000004090cc in _start ()
Comment 5 Sebastian Trueg 2011-10-04 12:52:34 UTC
Cannot reproduce with libstreamanalyzer (strigi) 0.7.6. Please make sure to have an up-to-date version since it is the actual problem.
Comment 6 André Fettouhi 2011-10-05 21:55:24 UTC
Just tried this with strigi 0.7.2 on Arch Linux 64 bit with KDE 4.7.2 and this is NOT fixed. Nepomukindexer still hangs at random mp3 files but the cpu doesn't spike as high as before (with 0.7.5).
Comment 7 Weng Xuetian 2011-10-06 02:42:04 UTC
I am on Chakra, and my strigi is a git checkout from 2011-09-25.

http://chakra-project.org/packages/index.php?act=show&subdir=testing/i686&sortby=date&file=strigi-git-20110925-1-i686.pkg.tar.xz

I may try debugging it later.
Comment 8 André Fettouhi 2011-10-18 21:03:12 UTC
Any progress on this? It isn't resolved.
Comment 9 Ilari Mäkimattila 2011-10-23 17:59:00 UTC
This issue still exists in KDE 4.7.2 with strigi 0.7.6. I'm using 64bit Arch Linux.
Comment 10 Sebastian Trueg 2011-10-24 10:45:09 UTC
I am working on it. Need to get some other trouble out of the way first though... please be a little more patient... I know you already have done that for quite a while... :)
Comment 11 Sebastian Trueg 2011-10-24 11:01:40 UTC
Ok, this was simpler than I thought: it works perfectly here with the upcoming 4.7.3 and strigi trunk (upcoming 0.7.7). Since we only have 3 more days until 4.7.3, let's hope that the release will fix it for you, too.
Comment 12 André Fettouhi 2011-10-24 11:24:15 UTC
OK, will check for myself when KDE 4.7.3 and strigi 0.7.7 hit the Arch Linux repos.
Comment 13 Weng Xuetian 2011-10-24 17:08:48 UTC
I just grabbed the strigi git code and compiled it; the problem still seems to exist.

But as the previous backtrace indicates, I'm not sure whether this is a strigi problem or a kdelibs/kde-runtime problem.

I will wait for 4.7.3.
Comment 14 Weng Xuetian 2011-10-30 06:51:20 UTC
Just upgraded to KDE 4.7.3 on Chakra... no luck.
Qt is 4.7.4.

And strigi is from anongit.kde.org (is this the correct place?)

I tried a lot of debugging... it seems that after strigi indexes the file, memory is already messed up; for example, toLocal8Bit can no longer use the correct codec. Valgrind doesn't show anything interesting, though.
Comment 15 Sebastian Trueg 2011-11-02 07:58:30 UTC
I am not sure what to do here now since I cannot reproduce the problem.... :/
Comment 16 Weng Xuetian 2011-11-02 10:59:17 UTC
emm.. is it correct to get the strigi code from anongit.kde.org?
Comment 17 Sebastian Trueg 2011-11-02 11:16:35 UTC
(In reply to comment #16)
> emm.. is it correct to get the strigi code from anongit.kde.org?

Yes, but be aware that the strigi repo is just a wrapper for a set of submodules. So unless you fetch those you do not have the updated libs. I recommend to instead clone the "libstreams" and "libstreamanalyzer" repos.
Comment 18 Ilari Mäkimattila 2011-11-04 21:13:01 UTC
Still exists in KDE 4.7.3 in Arch Linux. Strigi is still 0.7.6 though. It seems like every mp3 that hangs the indexer is encoded as variable bit rate, has ID3v1.1 and ID3v2.4.0 tags, and is Joint Stereo. Files with ID3v2.3.0 tags are indexed correctly.

`file` says "Audio file with ID3 version 2.4.0, contains: MPEG ADTS, layer III, v1, XXX kbps, 44.1 kHz, JntStereo" for each file.
Every file seems to have the ID3v2 number of tracks in the album field set.
Comment 19 Ilari Mäkimattila 2011-11-04 21:19:33 UTC
Converting the ID3v2.4.0 tag to 2.3.0 using Kid3 solves the problem and the files stop hanging the indexer.
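
For anyone who wants to sort affected files from unaffected ones without Kid3, here is a minimal sketch in C of reading the ID3v2 version from the start of a file. It relies only on the ID3v2 header layout (the bytes "ID3", then a major version byte, a revision byte, a flags byte and four size bytes). The helper name `id3v2_major_version` is hypothetical and is not part of strigi or Kid3.

```c
#include <stddef.h>

/* Hypothetical helper: returns the ID3v2 major version found at the start
 * of buf (3 for ID3v2.3.0, 4 for ID3v2.4.0), or -1 if buf does not begin
 * with a plausible ID3v2 header. */
int id3v2_major_version(const unsigned char *buf, size_t len)
{
    size_t i;

    /* An ID3v2 header is always 10 bytes long. */
    if (len < 10)
        return -1;

    /* Bytes 0-2: the file identifier "ID3". */
    if (buf[0] != 'I' || buf[1] != 'D' || buf[2] != '3')
        return -1;

    /* Bytes 3-4: major version and revision, neither may be 0xFF. */
    if (buf[3] == 0xFF || buf[4] == 0xFF)
        return -1;

    /* Bytes 6-9: tag size, each byte must be below 0x80. */
    for (i = 6; i < 10; i++) {
        if (buf[i] >= 0x80)
            return -1;
    }

    return buf[3];
}
```

A caller would read the first ten bytes of each mp3 and treat a return value of 4 as "ID3v2.4.0, likely affected according to the reports above".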
Comment 20 Sebastian Trueg 2011-11-05 12:43:52 UTC
I am fairly certain this will be fixed with strigi 0.7.7.
Comment 21 Weng Xuetian 2011-11-23 21:26:01 UTC
No luck: I just grabbed today's strigi git code, with KDE 4.8 beta 1.

If I follow comment #19, nepomukindexer no longer hangs, but I get this error:

nepomukindexer(5731)/nepomuk (strigi service) Nepomuk::StrigiIndexWriter::finishAnalysis: "Cannot set values for abstract property 'http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#setSize'."
Comment 22 Sebastian Trueg 2011-11-24 12:19:15 UTC
(In reply to comment #21)
> No luck: I just grabbed today's strigi git code, with KDE 4.8 beta 1.
> 
> If I follow comment #19, nepomukindexer no longer hangs, but I get this error:
> 
> nepomukindexer(5731)/nepomuk (strigi service)
> Nepomuk::StrigiIndexWriter::finishAnalysis: "Cannot set values for abstract
> property 'http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#setSize'."

Please update shared-desktop-ontologies to 0.8.0.
Comment 23 Weng Xuetian 2011-11-25 06:14:39 UTC
But the problem still exists: indexing of ID3v2.4 tags is broken.
Comment 24 Sebastian Trueg 2011-11-25 12:57:27 UTC
(In reply to comment #23)
> But the problem still exists: indexing of ID3v2.4 tags is broken.

You have updated sdo now?
Did you install it locally or globally?
Comment 25 Weng Xuetian 2011-11-25 14:04:03 UTC
Well, the original problem in this bug report is that the attached mp3 file cannot be indexed (nepomukindexer hangs and occupies one core).

Comment #19 suggests converting the ID3v2.4 tag to 2.3. If I do that, the hang goes away, but that doesn't resolve the original problem: nepomukindexer hangs on ID3v2.4 tags.

Whether or not I update s-d-o (I did update it, and the hang is still there) has nothing to do with the v2.4 tag.
Comment 26 Sebastian Trueg 2011-11-25 16:13:47 UTC
Seems that I did speak prematurely: actually indexing did not work here either. However, it did not hang but simply terminated with an error. This led to a fix in shared-desktop-ontologies[1]. I just released 0.8.1.

[1] https://sourceforge.net/projects/oscaf/
Comment 27 Weng Xuetian 2011-11-25 16:28:24 UTC
I tracked down the cause of the hang on my system.

After a call to UTF8Converter with iconv, QTextCodec::codecForLocale no longer works correctly.

But all the calls to iconv seem to be used correctly, so I wonder whether it is a glibc bug or a Qt bug...

I have glibc 2.14.1 here.
Comment 28 Weng Xuetian 2011-11-25 16:47:50 UTC
This is my test case (it no longer depends on strigi).

http://pastebin.com/6590jvAi

The last qDebug() << 111111; produces wrong output (the same happens for QString).

Where should I report the bug? Qt or glibc?
Comment 29 Sebastian Trueg 2011-11-25 16:55:44 UTC
(In reply to comment #28)
> This is my test case (it no longer depends on strigi).
> 
> http://pastebin.com/6590jvAi
> 
> The last qDebug() << 111111; produces wrong output (the same happens for QString).
> 
> Where should I report the bug? Qt or glibc?

To be honest I do not know.
Comment 30 Weng Xuetian 2011-11-25 20:47:29 UTC
Created attachment 66082
A walkaround for this bug

I don't know why, but iconv with UTF-16 causes this bug, at least on Chakra and Arch Linux (both use glibc 2.14.1); I have not tested other distributions. Anyway, we can detect the endianness by hand, and that seems to fix the problem.
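
To illustrate what detecting the endianness by hand can look like, here is a minimal sketch in C. It is not the attached patch: the names `utf16_order_from_bom` and `utf16_charset_name` are hypothetical, and the idea is only that the caller inspects the byte order mark itself and then asks iconv for an explicit "UTF-16LE" or "UTF-16BE" converter instead of the generic "UTF-16".

```c
#include <stddef.h>

/* Byte order as indicated by a UTF-16 byte order mark (BOM). */
enum utf16_order {
    UTF16_ORDER_UNKNOWN = 0,
    UTF16_ORDER_LE,
    UTF16_ORDER_BE
};

/* Hypothetical helper: inspect the first two bytes of buf for a BOM.
 * FF FE means little endian, FE FF means big endian. */
enum utf16_order utf16_order_from_bom(const unsigned char *buf, size_t len)
{
    if (len >= 2) {
        if (buf[0] == 0xFF && buf[1] == 0xFE)
            return UTF16_ORDER_LE;
        if (buf[0] == 0xFE && buf[1] == 0xFF)
            return UTF16_ORDER_BE;
    }
    return UTF16_ORDER_UNKNOWN;
}

/* Hypothetical helper: map the detected order to an explicit charset name
 * that can be passed to iconv_open(). Returns NULL if the order is unknown,
 * in which case the caller has to pick a default itself. */
const char *utf16_charset_name(enum utf16_order order)
{
    switch (order) {
    case UTF16_ORDER_LE:
        return "UTF-16LE";
    case UTF16_ORDER_BE:
        return "UTF-16BE";
    default:
        return NULL;
    }
}
```

With this approach the caller would skip the two BOM bytes before converting, since the byte order is already fixed by the charset name.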
Comment 31 Weng Xuetian 2011-11-25 21:34:27 UTC
OK, I found the reason as well. iconv_open in glibc 2.14 has a bug (or perhaps not a bug, but the behaviour differs from glibc 2.13): it remembers the endianness even when another iconv_open is used. So this workaround is required for all glibc versions from 2.14 on.

I filed an upstream report here:
http://sourceware.org/bugzilla/show_bug.cgi?id=13439
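
For anyone who wants to check their own glibc, here is a rough sketch in C of the kind of check described above. It is not the pastebin test case, and whether this exact sequence triggers the behaviour on glibc 2.14 is an assumption. It converts one big-endian and then one little-endian BOM-prefixed buffer, each through its own "UTF-16" descriptor, and reports whether both decode to "A". The function names are hypothetical.

```c
#include <iconv.h>
#include <stddef.h>

/* Convert a BOM-prefixed UTF-16 buffer to UTF-8 using the generic "UTF-16"
 * charset name, so that iconv has to pick the byte order from the BOM.
 * Returns the number of UTF-8 bytes written, or -1 on any error. */
long utf16_bom_to_utf8(const unsigned char *in, size_t in_len,
                       char *out, size_t out_cap)
{
    char *src = (char *)in;   /* iconv() takes a non-const input pointer */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;
    size_t rc;

    /* A fresh descriptor for every call. */
    iconv_t cd = iconv_open("UTF-8", "UTF-16");
    if (cd == (iconv_t)-1)
        return -1;

    rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    return (long)(out_cap - dst_left);
}

/* Returns 1 if a big-endian conversion followed by a little-endian one,
 * each on its own descriptor, both yield the single byte 'A'; 0 otherwise. */
int utf16_descriptors_are_independent(void)
{
    static const unsigned char be[] = { 0xFE, 0xFF, 0x00, 0x41 };
    static const unsigned char le[] = { 0xFF, 0xFE, 0x41, 0x00 };
    char out[8];
    long n;

    n = utf16_bom_to_utf8(be, sizeof be, out, sizeof out);
    if (n != 1 || out[0] != 'A')
        return 0;

    n = utf16_bom_to_utf8(le, sizeof le, out, sizeof out);
    return n == 1 && out[0] == 'A';
}
```

A return value of 0 from `utf16_descriptors_are_independent` would suggest that the second descriptor did not detect the byte order on its own.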
Comment 32 Weng Xuetian 2011-11-26 09:22:47 UTC
This is the error I get with s-d-o 0.8.1 (with git strigi patched as above).

Nepomuk::StrigiIndexWriter::finishAnalysis: "<http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#setNumber> has a rdfs:domain of <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#MusicPiece>. <_:b> only has the following types <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#MusicAlbum>, <http://www.w3.org/2000/01/rdf-schema#Resource>"
Comment 33 Weng Xuetian 2011-11-26 11:02:12 UTC
OK, the problem in #32 is resolved with git libstreamanalyzer; I just noticed.
I'm not very familiar with git submodules...
Comment 34 André Fettouhi 2011-12-16 11:35:42 UTC
I just installed strigi 0.7.7 on my Arch 64 machine and I have shared-desktop-ontologies 0.8.1 and my mp3 files still hang...
Comment 35 Weng Xuetian 2012-01-05 05:39:35 UTC
(In reply to comment #34)
> I just installed strigi 0.7.7 on my Arch 64 machine and I have
> shared-desktop-ontologies 0.8.1 and my mp3 files still hang...

glibc 2.15 will hopefully resolve this problem.
Comment 36 Will Stephenson 2012-03-10 15:11:44 UTC
Awesome debugging and you beat the Drepper, Weng!