Bug 292631 - Strigi reindexes everything at restart
Status: RESOLVED FIXED
Alias: None
Product: nepomuk
Classification: Miscellaneous
Component: fileindexer
Version: 4.7
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Nepomuk Bugs Coordination
 
Reported: 2012-01-27 21:32 UTC by cordawyn
Modified: 2012-12-27 08:29 UTC


Description cordawyn 2012-01-27 21:32:14 UTC
Version:           4.7 (using KDE 4.7.4) 
OS:                Linux

Whenever I reboot the computer, Strigi starts reindexing everything, regardless of whether it finished the process before or not.
I have both nepomuk and strigi enabled, but this occurs only if I have strigi switched on, so I guess nepomuk is OK.
This issue has been plaguing me since KDE 4.6, but all reports get closed/resolved, apparently without pinning down the source of the problem.

Reproducible: Always

Steps to Reproduce:
1) Enable Desktop Search for KDE, make sure both nepomuk and strigi are switched on.
2) (Optionally) Wait until Strigi finishes indexing files.
3) Reboot the computer.
4) Strigi starts the reindexing anew.

Actual Results:  
Constant CPU load (not 100%, but still annoying and hard on the CPU fans)

Expected Results:  
Strigi should only reindex the updated files/data.

dpkg -l:

ii  kdegraphics-strigi-analyzer       4:4.7.3-0ubuntu0.1
ii  kdepim-strigi-plugins             4:4.7.4+git111222-0ubuntu0.1
ii  libnepomuk4                       4:4.7.4-0ubuntu0.1
...
Comment 1 Christoph Feck 2012-01-27 22:25:03 UTC
It is possible the indexer is only setting modification watches on all folders. Are you sure the actual contents are re-indexed?
Comment 2 cordawyn 2012-01-28 07:24:11 UTC
(In reply to comment #1)
> It is possible the indexer is only setting modification watches on all folders.
> Are you sure the actual contents are re-indexed?

No, I'm not sure - I just see the tray icon say: "Strigi is indexing files..." and there are files shown there in a sequence. And this is very CPU intensive, I should say.
Is there a way to tell the difference between indexing and setting the watches? (Without going into the source code, of course.)
Comment 3 JP Valdes 2012-02-03 12:12:20 UTC
In my case, I am positive the indexer is reindexing some .tif, .mp3, and .flac files. It looks like it is choking on those files, as the CPU usage of the virtuoso-t process is very high.

Any info I could provide?
Comment 4 JP Valdes 2012-02-13 17:57:16 UTC
I have taken a look at .xsession-errors and there are lots of messages from nepomukservicestub. For example, these messages seem related to the reindexing of tiff files:
[/usr/bin/nepomukservicestub] TIFFReadDirectory:
[/usr/bin/nepomukservicestub] Warning,
[/usr/bin/nepomukservicestub] au_coll_3-3.tif: unknown field with tag 34118 (0x8546) encountered
[/usr/bin/nepomukservicestub] .
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(1183)" Soprano: "Invalid argument (1)": "<http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#width> has a rdfs:domain of <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Visual>. <_:dog> only has the following types <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#InformationElement>, <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject>, <http://www.w3.org/2000/01/rdf-schema#Resource>"
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(1183)" Soprano: "Invalid argument (1)": "<http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#width> has a rdfs:domain of <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Visual>. <_:dog> only has the following types <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#InformationElement>, <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject>, <http://www.w3.org/2000/01/rdf-schema#Resource>"

I am running KDE 4.8 on Arch Linux (so I figure I have the latest version of all relevant packages).
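
For readers unfamiliar with the Soprano message above, here is a minimal Python sketch of the kind of rdfs:domain check the error describes. It is an illustration only, not Nepomuk's or Soprano's actual code: the function name `domain_violation` and the `PROPERTY_DOMAINS` table are hypothetical, and subclass inference is deliberately ignored. The URIs are copied from the log lines.

```python
NFO = "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
NIE = "http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical lookup table. The single entry is taken from the log message;
# a real store would read domains from the loaded ontologies.
PROPERTY_DOMAINS = {NFO + "width": NFO + "Visual"}


def domain_violation(prop, resource_types):
    """Return an error string if the resource lacks the property's
    rdfs:domain type, otherwise None. Subclass relations are not considered."""
    domain = PROPERTY_DOMAINS.get(prop)
    if domain is None or domain in resource_types:
        return None
    return f"<{prop}> has a rdfs:domain of <{domain}>, which the resource does not have"


# The types reported for the resource in the log.
types_from_log = {
    NIE + "InformationElement",
    NFO + "FileDataObject",
    RDFS + "Resource",
}

# With only the logged types, setting nfo:width is rejected.
print(domain_violation(NFO + "width", types_from_log))
# If the resource were also typed as nfo:Visual, the check would pass.
print(domain_violation(NFO + "width", types_from_log | {NFO + "Visual"}))
```

Under these assumptions, the width value is refused because the file was never typed as nfo:Visual, which matches the wording of the logged error.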
Comment 5 Łukasz Sierżęga 2012-02-13 20:13:38 UTC
I can confirm this bug on KDE 4.8 too.
Comment 6 JP Valdes 2012-02-13 20:56:52 UTC
Furthermore, when (re-)indexing mp3's and flac's, messages like the following appear in .xsession-errors:
"/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Failed to convert '2007-11-11' to literal of type 'http://www.w3.org/2001/XMLSchema#dateTime'."
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Failed to convert '2007-11-11' to literal of type 'http://www.w3.org/2001/XMLSchema#dateTime'."
[/usr/bin/nepomukservicestub] QDateTime Soprano::DateTime::fromDateTimeString(const QString&)  invalid formatted datetime string:  "2001-09" 

"/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Failed to convert '2001-09' to literal of type 'http://www.w3.org/2001/XMLSchema#dateTime'."
"/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Failed to convert '2001-09' to literal of type 'http://www.w3.org/2001/XMLSchema#dateTime'."
[/usr/bin/nepomukservicestub] QDateTime Soprano::DateTime::fromDateTimeString(const QString&)  invalid formatted datetime string:  "2001-09" 

"/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Failed to convert '2001-09' to literal of type 'http://www.w3.org/2001/XMLSchema#dateTime'."
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Failed to convert '2001-09' to literal of type 'http://www.w3.org/2001/XMLSchema#dateTime'."
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Cannot set values for abstract property 'http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#albumArtist'."
"/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Cannot set values for abstract property 'http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#albumArtist'."
[/usr/bin/nepomukservicestub] "/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Cannot set values for abstract property 'http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#albumArtist'."
"/usr/bin/nepomukservicestub(7415)" Soprano: "Invalid argument (1)": "Cannot set values for abstract property 'http://www.semanticdesktop.org/ontologies/2009/02/19/nmm#albumArtist'."

Also, many music files don't appear when searching for them via Dolphin or KRunner, nor do they appear in Bangarang.

I don't know if any info I wrote is useful so I'll stop "spamming" this bug.
Comment 7 cordawyn 2012-04-24 21:50:16 UTC
With the latest KDE and nepomuk, I've been getting frequent OS reboots as the CPU temperature rises above 100 °C (it's tuned to reboot/shut down in such critical cases to prevent CPU damage). I had to switch off the "semantic search" stuff because of that :-(
Comment 8 Vishesh Handa 2012-12-27 08:29:54 UTC
This was a problem with Strigi providing incorrect values and Nepomuk not saving them. With 4.10, we have our own indexer, so this is no longer a problem.
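
As a rough illustration of "incorrect values", the date strings in the log lines from comment 6 ('2007-11-11', '2001-09') do not have the lexical shape of an xsd:dateTime, which requires a full date, a 'T' separator, and a time. The Python sketch below checks only that shape. It is not Soprano's implementation, the helper name `looks_like_xsd_datetime` is made up for this example, and it does not validate value ranges such as month 13.

```python
import re

# Lexical shape of xsd:dateTime: [-]YYYY-MM-DDThh:mm:ss[.fraction][timezone].
# This pattern checks the shape only, not whether the fields are in range.
XSD_DATETIME = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)


def looks_like_xsd_datetime(value):
    """Return True if value has the lexical form of an xsd:dateTime."""
    return XSD_DATETIME.fullmatch(value) is not None


# The first two values are taken from the log messages in comment 6;
# the third is a full timestamp added here for contrast.
for candidate in ("2007-11-11", "2001-09", "2007-11-11T00:00:00Z"):
    print(candidate, looks_like_xsd_datetime(candidate))
```

A date-only or year-month value fails this check, which is consistent with the "Failed to convert ... to literal of type ...#dateTime" errors quoted earlier.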