Bug 250776 - all files get removed from index, size of nepomuks memory doesn't change
Summary: all files get removed from index, size of nepomuks memory doesn't change
Status: RESOLVED FIXED
Alias: None
Product: nepomuk
Classification: Miscellaneous
Component: general
Version: 4.1
Platform: Ubuntu Linux
Importance: NOR major
Target Milestone: ---
Assignee: Sebastian Trueg
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-10 13:48 UTC by Daniel Boff
Modified: 2013-06-10 16:58 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
backlog of IRC discussion (10.07 KB, text/plain)
2010-12-21 15:31 UTC, Elias Probst
Deletion Blacklist filtering (1.82 KB, patch)
2011-01-03 02:54 UTC, Vishesh Handa
Deletion Blacklist filtering (2.27 KB, patch)
2011-01-03 03:18 UTC, Vishesh Handa

Description Daniel Boff 2010-09-10 13:48:56 UTC
Version:           4.1 (using KDE 4.5.0) 
OS:                Linux

Nepomuk has been working for me since KDE 4.5, and it's awesome. That's why I have been using it to tag files, for instance "unread" for files I have downloaded but not yet read, "todo" for documents I still have to finish writing, and so on.
It worked for about a month, and yesterday came the surprise: all my tags have disappeared from my files.
Nepomuk tells me (click on the icon in the systray) that there are no files in the index, yet its memory usage is still 700 MB.

A reboot makes Nepomuk reinitialize the index, and all files are in the index again; memory usage stays the same. However, all tags are still gone.
Maybe Nepomuk should have some kind of backup mechanism. It's really frustrating that all my tags are gone now.

Reproducible: Couldn't Reproduce

Steps to Reproduce:
I have no idea what I did to get this result. Maybe I logged out and in again too quickly (3-4 times).

Actual Results:  
All tags are gone; Nepomuk/Strigi rebuilds the whole index.
Comment 1 Elias Probst 2010-12-21 00:27:27 UTC
I found the same behaviour here and tracked it down a little bit:

I have a very large music collection residing on my fileserver and use 'bangarang' to access it.
As bangarang is completely based on Nepomuk, I wondered why every now and then all information in bangarang was completely gone.

After some digging into nepomuk + nepomukserver, I found out what happens.

- I'm running autofs
- autofs provides its mountpoints under /var/autofs
- The NFS share of my fileserver is mounted through the 'media'-map of autofs to /var/autofs/media
- I have a symlink '/media/Multimedia → /var/autofs/media/Multimedia'
- Nepomuk/Strigi is configured to index files in '/media/Multimedia'

When I am somewhere (office, on the road) where autofs cannot mount the NFS share and Nepomuk/Strigi runs the indexer again, it finds no files in /media/Multimedia, because /media/Multimedia is now just a dead symlink whose target does not exist.

When everything works correctly, the files residing on /media/Multimedia should be marked as 'files on removable media'.

I suppose Nepomuk/Strigi doesn't call readlink() on a file's path before comparing it with the mounts in /etc/mtab. As a result, it only sees the symlink path, which has no match in /etc/mtab.
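
The suspected cause above can be illustrated with a minimal Python sketch. It is not Nepomuk's or Strigi's code; the helper names (decode_mtab_field, read_mount_points, mount_point_for) are hypothetical and only show why resolving the symlink first changes the result of a lookup against /etc/mtab-style mount points.

```python
import os
import re


def decode_mtab_field(field):
    """Undo the octal escapes (e.g. \\040 for a space) used in mtab-style files."""
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def read_mount_points(mtab_path="/etc/mtab"):
    """Return the mount points (second column) listed in an mtab-style file."""
    with open(mtab_path, encoding="utf-8", errors="replace") as handle:
        return [decode_mtab_field(line.split()[1])
                for line in handle if len(line.split()) >= 2]


def mount_point_for(path, mount_points):
    """Return the longest mount point containing the symlink-resolved path, or None."""
    # Resolve symlinks first, so /media/Multimedia becomes /var/autofs/media/Multimedia.
    resolved = os.path.realpath(path)
    best = None
    for mount_point in mount_points:
        normalized = mount_point.rstrip("/") or "/"
        prefix = normalized.rstrip("/") + "/"
        if resolved == normalized or resolved.startswith(prefix):
            if best is None or len(normalized) > len(best):
                best = normalized
    return best
```

With the setup described above, mount_point_for("/media/Multimedia", read_mount_points()) would be expected to report /var/autofs/media while the share is mounted, whereas a plain string comparison of the unresolved symlink path against the mount table finds nothing more specific than the root file system.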
Comment 2 Elias Probst 2010-12-21 15:31:49 UTC
Created attachment 55128 [details]
backlog of IRC discussion

A lengthy discussion in #nepomuk-kde took place (thanks to phreedom + vHanda) where we talked about this problem and solutions.

The backlog is attached.
Comment 3 Elias Probst 2010-12-21 19:30:46 UTC
A workaround that could be implemented in 4.6 (as the suggested changes surely will not be ready in time for 4.6) would be:

Provide a directory blacklist which could be configured by advanced users in nepomukserverrc like this:

[main Settings]
Deletion Blacklist=/media/Multimedia,/foo/bar

All files whose path contains one of the blacklist entries will not be deleted when the cleaner runs.
This way, Nepomuk would be usable again for people affected by this bug - at the moment it isn't usable at all for me, as 95% of my data reside on a fileserver and only a minority of my files actually reside on my laptop.
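
The blacklist check proposed in this comment could look roughly like the following Python sketch. The function names are hypothetical, and the sketch assumes that each entry is treated as a directory prefix, which is a stricter reading than "path contains one of the blacklist entries"; the actual patch may differ.

```python
def is_protected(path, blacklist):
    """Return True if path lies inside one of the blacklisted directories."""
    def as_dir(p):
        # Normalize to a trailing slash so /media/Multimedia does not match /media/Multimedia2.
        return p.rstrip("/") + "/"

    target = as_dir(path)
    return any(target.startswith(as_dir(entry)) for entry in blacklist if entry)


def files_to_delete(missing_files, blacklist):
    """Keep only those missing files whose metadata the cleaner may remove."""
    return [path for path in missing_files if not is_protected(path, blacklist)]
```

For example, with the blacklist ["/media/Multimedia", "/foo/bar"], a missing file /media/Multimedia/a.mp3 would be skipped by the cleaner, while a missing file under /home would still have its metadata removed.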
Comment 4 Vishesh Handa 2011-01-03 02:54:53 UTC
Created attachment 55495 [details]
Deletion Blacklist filtering

This should fix this bug temporarily as described by eliasp. I haven't tested it out.

@trueg: Please let me know if you're okay with this.

@eliasp:
If this patch gets accepted you would have to modify your nepomukserverrc and add the following lines

[deletion]
Deletion Blacklist=<list of directories>
Comment 5 Vishesh Handa 2011-01-03 03:18:54 UTC
Created attachment 55496 [details]
Deletion Blacklist filtering

I tested the patch out, and made some minor modifications. It gets applied in kdebase/runtime/nepomuk/services/filewatch/

The required syntax in nepomukserverrc is -
[Service-nepomukfilewatch]
Deletion Blacklist[$e]=/home/vishesh/birthday pics,/media/Multimedia/
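
To show how the comma-separated value above breaks down into individual directories, here is a minimal Python sketch that reads the same section and key with the standard configparser module. This is not how Nepomuk reads the file (it uses KConfig); the function name is hypothetical, and the use of os.path.expandvars is an assumption standing in for the variable expansion that the [$e] suffix is understood to request.

```python
import configparser
import os

SAMPLE = """\
[Service-nepomukfilewatch]
Deletion Blacklist[$e]=/home/vishesh/birthday pics,/media/Multimedia/
"""


def read_deletion_blacklist(text):
    """Return the blacklist directories from nepomukserverrc-style text."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep the key's original capitalization
    parser.read_string(text)
    raw = parser.get("Service-nepomukfilewatch", "Deletion Blacklist[$e]", fallback="")
    entries = (entry.strip() for entry in raw.split(","))
    # Assumption: approximate the [$e] expansion with environment-variable expansion.
    return [os.path.expandvars(entry) for entry in entries if entry]
```

Note that directory names containing spaces, such as "birthday pics", survive intact because only commas separate the entries.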
Comment 6 Elias Probst 2011-01-03 09:26:41 UTC
@vHanda: Thanks a lot for implementing this (shouldn't you be learning for your exams? :) )!

Maybe add the number of this bug to the comment in the patch. That would make further development (removing the workaround, implementing the real solution) easier, as nobody would have to search for this bug again.
Comment 7 Vishesh Handa 2011-01-03 19:54:37 UTC
SVN commit 1211346 by vhanda:

Temporary workaround to make Nepomuk work with Network Attached Storages. 

There is now a list of blacklisted urls which will not be checked in the Invalid-Resource-File-Cleaner. 
The required config in nepomukserverrc is - 

[Service-nepomukfilewatch]
Deletion Blacklist[$e]=/home/v/,/media/juil/

BUG: 250776



 M  +25 -2     invalidfileresourcecleaner.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1211346
Comment 8 Vishesh Handa 2011-01-03 19:58:09 UTC
SVN commit 1211347 by vhanda:

BACKPORT:

Temporary workaround to make Nepomuk work with Network Attached Storages.

There is now a list of blacklisted urls which will not be checked in the Invalid-Resource-File-Cleaner.
The required config in nepomukserverrc is -

[Service-nepomukfilewatch]
Deletion Blacklist[$e]=/home/v/,/media/juil/

BUG: 250776


 M  +25 -2     invalidfileresourcecleaner.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1211347
Comment 9 Anders Lund 2011-02-03 21:59:39 UTC
I can NOT get this to work. I added the following to ~/.kde4/share/config/nepomukserverrc:

[Service-nepomukfilewatch]
Deletion Blacklist[$e]=/media/FREECOM HDD/

I ONCE saw data kept across reboots; at that time I had ONE album indexed in bangarang. So I happily indexed my disk with ~500 songs, rebooted and everything was LOST.

I am running KDE 4.6 and bangarang 2.0.
Comment 10 Anders Lund 2011-02-03 22:27:20 UTC
Is the /home/<user>/ entry in the deletion blacklist required for this to work? And will that not prevent nepomuk from updating the database when I change or remove files?
Comment 11 Vishesh Handa 2011-02-04 10:54:11 UTC
Could you please give me the output for -

nepomukcmd query 'select distinct ?url where { ?r <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#url> ?url. FILTER(regex(str(?url), "file://")  && !regex(str(?url), "^file:///media/FREECOM%20HDD/")  ). }' | grep '/media/FREECOM%20HDD'
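
The query above matches against percent-encoded file URLs, which is why the directory "/media/FREECOM HDD/" appears as "FREECOM%20HDD". A small Python sketch of that mapping follows; the function names are hypothetical and serve only to illustrate the encoding.

```python
from urllib.parse import quote


def file_url_prefix(directory):
    """Turn a local directory path into the file:// URL prefix used in the query."""
    # quote() leaves "/" untouched and encodes the space as %20.
    return "file://" + quote(directory)


def is_under(url, directory):
    """Return True if a stored file URL lies below the given directory."""
    return url.startswith(file_url_prefix(directory))
```

Here file_url_prefix("/media/FREECOM HDD/") yields "file:///media/FREECOM%20HDD/", the same prefix that the regex in the query tests for.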
Comment 12 Anders Lund 2011-02-04 12:07:35 UTC
[anders@katja ~]$ nepomukcmd query 'select distinct ?url where { ?r
> <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#url> ?url.
> FILTER(regex(str(?url), "file://")  && !regex(str(?url),
> "^file:///media/FREECOM%20HDD/")  ). }' | grep '/media/FREECOM%20HDD'
bash: nepomukcmd: command not found

Hm, what archlinux package could contain nepomukcmd?
Comment 13 Anders Lund 2011-02-04 12:13:49 UTC
Found an alias at techbase, which produced this:

[anders@katja ~]$ nepomukcmd query 'select distinct ?url where { ?r
<http://www.semanticdesktop.org/ontologies/2007/01/19/nie#url> ?url.
FILTER(regex(str(?url), "file://")  && !regex(str(?url),
"^file:///media/FREECOM%20HDD/")  ). }' | grep '/media/FREECOM%20HDD'
Total results: 15475
Execution time: 00:00:00.87
Comment 14 Vishesh Handa 2011-02-04 13:05:23 UTC
Ok. Then as far as I'm concerned - Either the files whose metadata got deleted were not in '/media/FREECOM HDD/' directory or there is something else brewing.

Could you please - Index some files in the /media/FREECOM HDD/ directory. Restart Nepomuk ( and all applications that use Nepomuk - like Dolphin ) and check if the metadata is still there.
Comment 15 Anders Lund 2011-02-04 13:30:28 UTC
I've done the following:
* Ran the suggested nepomukcmd query, result 14909
* Closed all apps using nepomuk
* Stopped nepomuk from systemsettings, and started it again
* Ran nepomukcmd again. First it said it could not contact the soprano server, then it gave the same result as before

So it looks like the data is there, but bangarang does not see them?
Comment 16 Vishesh Handa 2011-02-04 13:35:07 UTC
After running nepomukcmd as suggested, did you index something? If you did, could you check using Dolphin whether the metadata is still there? Look at the information panel when hovering over or clicking on the file, depending on your settings.
Comment 17 Anders Lund 2011-02-04 13:42:49 UTC
Data is there: Dolphin shows album name, genre, song name, etc. Data created yesterday is there too, so it seems that it is bangarang that does not find the data.
Comment 18 Anders Lund 2011-02-04 13:50:51 UTC
Dolphin still knows the data after a reboot.
Comment 19 Andrew Lake 2011-02-07 20:42:55 UTC
Unfortunately, it appears the fix made it into trunk and the 4.5 branch (4.5.5 tag/release), but not into the 4.6 branch (4.6.0 tag/release).

See:
http://websvn.kde.org/tags/KDE/4.6.0/kdebase/runtime/nepomuk/services/filewatch/invalidfileresourcecleaner.cpp?view=log

and

http://websvn.kde.org/tags/KDE/4.5.5/kdebase/runtime/nepomuk/services/filewatch/invalidfileresourcecleaner.cpp?view=log

If it is possible to commit that fix to the 4.6 branch before the 4.6.1 tagging, that would restore the 4.5.5 functionality.

Hope this helps!
Comment 20 Andrew Lake 2011-02-18 18:28:12 UTC
4.6.1 tagging is in 6 days so hopefully we'll be able to port this fix to the 4.6 branch before then.  If there's anything I can do to help please let me know.  If schedules and workloads are tight, I'm happy to dig up my svn account and commit the fix if necessary. :-)

Thanks!
Comment 21 Andrew Lake 2011-02-23 05:49:21 UTC
I made a good-faith effort to get this Deletion Blacklist patch into the 4.6 branch myself. However, I'm having trouble setting up a full KDE devel environment so I can compile and test the patch.

I'm really hoping Vishesh or Sebastian can get this patch in before 4.6.1 tagging (a little over a day from now) to fix the regression that occurred with the 4.6.0 release.
Comment 22 Sebastian Trueg 2011-02-23 13:19:06 UTC
Git commit 6d1c40b71dc7e80495e992f8c3d0ea12a073ef40 by Sebastian Trueg. on behalf of Vishesh Handa
Committed on 03/01/2011 at 19:53.
Pushed by trueg into branch 'KDE/4.6'.

Temporary workaround to make Nepomuk work with Network Attached Storages.

There is now a list of blacklisted urls which will not be checked in the Invalid-Resource-File-Cleaner.
The required config in nepomukserverrc is -

[Service-nepomukfilewatch]
Deletion Blacklist[$e]=/home/v/,/media/juil/

BUG: 250776

svn path=/trunk/KDE/kdebase/runtime/; revision=1211346

M  +25   -2    nepomuk/services/filewatch/invalidfileresourcecleaner.cpp     

http://commits.kde.org/kde-runtime/6d1c40b71dc7e80495e992f8c3d0ea12a073ef40
Comment 23 Vishesh Handa 2011-02-24 03:12:33 UTC
Thanks. I really have to learn how to backport bugs with git!
Comment 24 Elias Probst 2011-08-01 04:38:38 UTC
This temporary hack was removed in 4.7.0, as support for removable media was introduced.

Unfortunately, information on files on NFS shares still gets deleted by the cleaner.

The hack needs to be re-introduced for 4.7.1 or the NFS issue needs to be fixed.

The commit which removed the blacklist hack:
https://projects.kde.org/projects/kde/kdebase/kde-runtime/repository/revisions/59f97850ce31fe59c3a35bc4fd839af1b3646dc3
Comment 25 Vishesh Handa 2013-06-10 16:58:04 UTC
This was fixed with 4.10.  Please feel free to re-open if it occurs again.