Bug 315678 - FileIndexer is looping on certain files, never ending
Summary: FileIndexer is looping on certain files, never ending
Status: RESOLVED FIXED
Alias: None
Product: nepomuk
Classification: Miscellaneous
Component: fileindexer
Version: 4.10.0
Platform: Ubuntu Linux
Importance: NOR major
Target Milestone: ---
Assignee: Nepomuk Bugs Coordination
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-02-23 10:47 UTC by Blackpaw
Modified: 2013-02-24 05:38 UTC

See Also:
Latest Commit:
Version Fixed In: 4.10.1


Description Blackpaw 2013-02-23 10:47:29 UTC
FileIndexer never finished on my system. It indexed continually whenever the system was idle, literally running for days and consuming 100% of a random core, until it killed the system via a CPU overheat.

When I displayed the Nepomuk Controller I noticed it was just looping over and over (very rapidly) on 5 media files. Once I moved those files to an unmonitored directory the behaviour stopped. If I made even one of the files available for indexing, it would just loop endlessly on it.

When I started nepomukserver from the command line I saw some SQL errors which I have attached.

I can make one of the files in question available; it is a 600 KB mp3 file.



Reproducible: Always

Steps to Reproduce:
1. Stop nepomuk
2. Place one of the problem files in a directory that will be indexed.
3. Start nepomuk server
4. Let the system go idle
5. Eventually nepomuk will loop over and over failing to index the file.
Actual Results:  
Nepomuk will loop over and over failing to index the file.

Expected Results:  
Nepomuk would index the file and stop.

Command line output from nepomukserver. I can do more tests as required.

[/usr/bin/nepomukservicestub] nepomukstorage(3191)/nepomuk (storage service) Nepomuk2::Sync::ResourceIdentifier::runIdentification: KUrl("_:a")  -->  KUrl("nepomuk:/res/0d1c0d06-8bc2-4927-8fcf-97cb1dd3eb78")
[/usr/bin/nepomukservicestub] nepomukstorage(3191)/nepomuk (storage service) Nepomuk2::DataManagementModel::storeResources:  MERGING FAILED!
nepomukstorage(3191)/nepomuk (storage service) Nepomuk2::DataManagementModel::storeResources: Setting error! "SQLExecDirect failed on query 'sparql insert into <nepomuk:/ctx/b143b53b-2ecb-4da9-af75-2dd72531c306> {  <nepomuk:/res/6edf8129-c5e7-48bf-9ee2-734366527e28> <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#title> "Mythbusters"^^<http://www.w3.org/2001/XMLSchema#string> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#averageBitrate> "6.4000000000e+01"^^<http://www.w3.org/2001/XMLSchema#float> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#sampleRate> "2.2050000000e+04"^^<http://www.w3.org/2001/XMLSchema#float> ; <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#MusicPiece> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#channels> "1"^^<http://www.w3.org/2001/XMLSchema#integer> ; <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#lastModified> "2013-02-23T10:21:08.883Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ; <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#musicAlbum> <nepomuk:/res/0d1c0d06-8bc2-4927-8fcf-97cb1dd3eb78> ; <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#contentCreated> "-0001-02-23T10:21:08.87Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ; <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#trackNumber> "-1"^^<http://www.w3.org/2001/XMLSchema#integer> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#duration> "77"^^<http://www.w3.org/2001/XMLSchema#duration> . }' (iODBC Error: [OpenLink][Virtuoso iODBC Driver][Virtuoso Server]SQ074: Line 1: DT006: Cannot convert -0001-02-23T10:21:08.87Z to datetime : Incorrect month field length)"
[/usr/bin/nepomukservicestub] nepomukindexer(3513)/nepomuk (strigi service): SimpleIndexerError:  "SQLExecDirect failed on query 'sparql insert into <nepomuk:/ctx/b143b53b-2ecb-4da9-af75-2dd72531c306> {  <nepomuk:/res/6edf8129-c5e7-48bf-9ee2-734366527e28> <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#title> "Mythbusters"^^<http://www.w3.org/2001/XMLSchema#string> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#averageBitrate> "6.4000000000e+01"^^<http://www.w3.org/2001/XMLSchema#float> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#sampleRate> "2.2050000000e+04"^^<http://www.w3.org/2001/XMLSchema#float> ; <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#MusicPiece> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#channels> "1"^^<http://www.w3.org/2001/XMLSchema#integer> ; <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#lastModified> "2013-02-23T10:21:08.883Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ; <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#musicAlbum> <nepomuk:/res/0d1c0d06-8bc2-4927-8fcf-97cb1dd3eb78> ; <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#contentCreated> "-0001-02-23T10:21:08.87Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ; <http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#trackNumber> "-1"^^<http://www.w3.org/2001/XMLSchema#integer> ; <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#duration> "77"^^<http://www.w3.org/2001/XMLSchema#duration> . }' (iODBC Error: [OpenLink][Virtuoso iODBC Driver][Virtuoso Server]SQ074: Line 1: DT006: Cannot convert -0001-02-23T10:21:08.87Z to datetime : Incorrect month field length)"
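
The failing value in the log above is the nie#contentCreated literal "-0001-02-23T10:21:08.87Z", which Virtuoso rejects. As a rough illustration only (this is not Nepomuk code, and the exact range of dates Virtuoso accepts is not stated in this report), a pre-insert check could look like the Python sketch below. The function name and the pattern are hypothetical; the pattern assumes an unsigned four-digit year and a "Z" suffix, as in the values shown in the log.

```python
import re
from datetime import datetime

# Assumed shape of a storable value: unsigned four-digit year, optional
# fractional seconds, and a trailing "Z". This is an illustrative choice,
# not Virtuoso's documented grammar.
_DATETIME_PATTERN = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:\.\d+)?Z$"
)


def is_storable_datetime(value: str) -> bool:
    """Return True if the literal looks like a plain, in-range date-time."""
    match = _DATETIME_PATTERN.match(value)
    if match is None:
        # Covers signed years such as "-0001-..." from the log above.
        return False
    year, month, day, hour, minute, second = (int(g) for g in match.groups())
    try:
        # datetime() raises ValueError for year 0, month 13, day 31 in
        # February, and similar out-of-range fields.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    return True
```

With a check like this, a caller could drop only the offending property and still store the rest of the metadata.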
Comment 1 Vishesh Handa 2013-02-23 19:21:05 UTC
This is a larger issue where Nepomuk should ignore files which cannot be indexed. That will be fixed, probably.

However, this issue has been fixed in 4.10.1
Comment 2 Blackpaw 2013-02-23 20:19:29 UTC
(In reply to comment #1)
> However this issue has been fixed in 4.10.1

Good to know.

> This is a larger issue where Nepomuk should ignore files which cannot be
> indexed. That will be fixed, probably.

I disagree - at the very least any file can be indexed by its basic attributes: name, timestamp, etc. It would be a very bad thing to exclude any file from the index just because indexing its meta attributes throws an exception. Rather, the indexer should recover gracefully and just index the file with what it already has, e.g. its name.
Comment 3 Vishesh Handa 2013-02-23 20:22:49 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > However this issue has been fixed in 4.10.1
> 
> Good to know.
> 
> > This is a larger issue where Nepomuk should ignore files which cannot be
> > indexed. That will be fixed, probably.
> 
> I disagree - at the very least any file can be indexed, by its basic
> attributes of name, timestamp etc. It would be a very bad thing to exclude
> any file from the index just because indexing its meta attributes throws an
> exception. Rather the indexer should gracefully recover and just index the
> file with what it already has, e.g. its name.

I should have explained properly. The basic properties of the file such as name, mimetype, etc, are always indexed no matter what. If a file's extra attributes such as this mp3 metadata cannot be indexed, it should then be ignored, and these extra metadata attributes should not be fetched.

The basic attributes will always be fetched.
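
To make the behaviour described above concrete, here is a minimal Python sketch, not the actual Nepomuk implementation. It assumes two hypothetical pieces: `basic_properties`, which derives the name and mimetype from the path alone, and an `extract_extended` callable standing in for the mp3 metadata extractor. The basic properties are always kept; a failure in the extended step is recorded rather than propagated.

```python
import mimetypes
import os
from typing import Callable, Dict, Optional


def basic_properties(path: str) -> Dict[str, Optional[str]]:
    """Properties derived from the path alone; no file content is parsed."""
    mime, _encoding = mimetypes.guess_type(path)
    return {"name": os.path.basename(path), "mimetype": mime}


def index_file(
    path: str,
    extract_extended: Callable[[str], Dict[str, str]],
) -> Dict[str, Optional[str]]:
    """Always index basic properties; add extended metadata only if it works."""
    record = basic_properties(path)
    try:
        record.update(extract_extended(path))
    except Exception as error:
        # The extended metadata is skipped, but the file stays in the index.
        # "extended_error" is a hypothetical field used for illustration.
        record["extended_error"] = str(error)
    return record
```

Catching a broad `Exception` is a deliberate simplification here; a real indexer would likely distinguish storage errors from parsing errors.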
Comment 4 Blackpaw 2013-02-23 21:03:24 UTC
(In reply to comment #3)

> The basic attributes will always be fetched.

Oh excellent, thanks.

> I should have explained properly. The basic properties of the file such as
> name, mimetype, etc, are always indexed no matter what. If a file's extra
> attributes such as this mp3 metadata cannot be indexed, it should then be
> ignored, and these extra metadata attributes should not be fetched.


Is this done already (in trunk?) or on the to do list :).
Comment 5 Vishesh Handa 2013-02-24 05:22:34 UTC
(In reply to comment #4)
> (In reply to comment #3)
> 
> > The basic attributes will always be fetched.
> 
> Oh excellent, thanks.
> 
> > I should have explained properly. The basic properties of the file such as
> > name, mimetype, etc, are always indexed no matter what. If a file's extra
> > attributes such as this mp3 metadata cannot be indexed, it should then be
> > ignored, and these extra metadata attributes should not be fetched.
> 
> 
> Is this done already (in trunk?) or on the to do list :).

It's in 4.10. The only small bug is that if the extra metadata is faulty it keeps trying to fetch it. I will fix that for 4.10.1
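
The retry problem mentioned above can be illustrated with a short Python sketch. It is an assumption about the general technique (remembering which files failed), not a description of the actual 4.10.1 fix. The names `run_indexing_pass`, `failed`, and `index` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Set


def run_indexing_pass(
    paths: Iterable[str],
    extract: Callable[[str], Dict[str, str]],
    failed: Set[str],
    index: Dict[str, Dict[str, str]],
) -> int:
    """Index each pending path at most once; return how many were newly indexed."""
    newly_indexed = 0
    for path in paths:
        # Skip files already indexed and files whose metadata failed before,
        # so a faulty file cannot keep the indexer busy forever.
        if path in failed or path in index:
            continue
        try:
            index[path] = extract(path)
            newly_indexed += 1
        except Exception:
            failed.add(path)
    return newly_indexed
```

Calling `run_indexing_pass` repeatedly with the same `failed` set and `index` dictionary means a second pass over the same files does no work, which is the opposite of the endless loop reported here. In a real service the failed set would need to be persisted and cleared when a file changes.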
Comment 6 Blackpaw 2013-02-24 05:38:36 UTC
(In reply to comment #5)

> It's in 4.10. The only small bug is that if the extra metadata is faulty it
> keeps trying to fetch it. I will fix that for 4.10.1

The metadata parsing bug was in 4.10 for me.