Bug 261926 - amarok collection scan does not follow symlinks
Summary: amarok collection scan does not follow symlinks
Status: RESOLVED FIXED
Alias: None
Product: amarok
Classification: Applications
Component: Collections/Local
Version: 2.3.1
Platform: Debian unstable Linux
Importance: NOR wishlist
Target Milestone: 2.4.0
Assignee: Amarok Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-01-03 04:32 UTC by Craig Howard
Modified: 2012-09-29 17:23 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In: 2.4


Attachments
Symlink Collection Scanner Test (11.53 KB, text/plain)
2011-03-17 14:59 UTC, dim.kde
Patch that seems to correct the symlink problem (2.31 KB, patch)
2011-03-21 11:27 UTC, dim.kde

Description Craig Howard 2011-01-03 04:32:22 UTC
Version:           2.3.1 (using KDE 4.4.5) 
OS:                Linux

I just put my music in git-annex (using SHA1 backend).  This system works by moving all actual files to .git/annex/objects/SHA1:<sha1_of_file> and replacing the file with a symlink to the above.  For example:

13:34:47 [1011]; pwd                                                                                                                                                                            
/home/craig/local/mp3/2005 Walk The Line
19:26:29 [1012]; ls -l
total 0
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 01 - Get Rhythm.mp3 -> ../.git/annex/objects/SHA1:f09f515661af71e6dd4d5e64eaecf107e479fb09/SHA1:f09f515661af71e6dd4d5e64eaecf107e479fb09
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 02 - I Walk The Line.mp3 -> ../.git/annex/objects/SHA1:5267c91953fcf1bc03f547ff29d27b300c2b7bf8/SHA1:5267c91953fcf1bc03f547ff29d27b300c2b7bf8
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 03 - Wildwood Flower.mp3 -> ../.git/annex/objects/SHA1:9841f25071e32d20d38ef5aa39b439d2d34aec96/SHA1:9841f25071e32d20d38ef5aa39b439d2d34aec96
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 04 - Lewis Boogie.mp3 -> ../.git/annex/objects/SHA1:e83352b482ddacc27e506b9d56c817ad0d7640cf/SHA1:e83352b482ddacc27e506b9d56c817ad0d7640cf
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 05 - Ring Of Fire.mp3 -> ../.git/annex/objects/SHA1:194ae88f3a578658a543df7cb4a951f549031de0/SHA1:194ae88f3a578658a543df7cb4a951f549031de0
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 06 - You're My Baby.mp3 -> ../.git/annex/objects/SHA1:8bcdf003509f76be6f12fffe2185dbe5262a6d34/SHA1:8bcdf003509f76be6f12fffe2185dbe5262a6d34
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 07 - Cry Cry Cry.mp3 -> ../.git/annex/objects/SHA1:966f7b44f17cfedddca0a83d76826051b8c82171/SHA1:966f7b44f17cfedddca0a83d76826051b8c82171
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 08 - Folsom Prison Blues.mp3 -> ../.git/annex/objects/SHA1:678a3bfd703da7fee7f972b265cba44877c8bf96/SHA1:678a3bfd703da7fee7f972b265cba44877c8bf96
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 09 - That's All Right.mp3 -> ../.git/annex/objects/SHA1:1d82e70e1bb8cfcab11c7e62eb4f671ebcd92124/SHA1:1d82e70e1bb8cfcab11c7e62eb4f671ebcd92124
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 10 - Juke Box Blues.mp3 -> ../.git/annex/objects/SHA1:8249f7986977e18692afabab5488feb2bd623d0c/SHA1:8249f7986977e18692afabab5488feb2bd623d0c
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 11 - It Ain't Me Babe.mp3 -> ../.git/annex/objects/SHA1:89981b963fcd98eb010e891ec5e4c4d07f19ef88/SHA1:89981b963fcd98eb010e891ec5e4c4d07f19ef88
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 12 - Home Of The Blues.mp3 -> ../.git/annex/objects/SHA1:f64cb204dc9040efae16890f3589f9fc2ada7933/SHA1:f64cb204dc9040efae16890f3589f9fc2ada7933
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 13 - Milk Cow Blues.mp3 -> ../.git/annex/objects/SHA1:02195b107e803c822b6378d9777a6a99e14206e5/SHA1:02195b107e803c822b6378d9777a6a99e14206e5
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 14 - I'm A Long Way From Home.mp3 -> ../.git/annex/objects/SHA1:3f017ac3157eecddd5dde885eb51922be08d97e0/SHA1:3f017ac3157eecddd5dde885eb51922be08d97e0
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 15 - Cocaine Blues.mp3 -> ../.git/annex/objects/SHA1:a7971bf3dfec21f7eeb523b7ed5efe60dd5e664e/SHA1:a7971bf3dfec21f7eeb523b7ed5efe60dd5e664e
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 16 - Jackson.mp3 -> ../.git/annex/objects/SHA1:aa842910bb7d3eb1dc823b53e318b4a8c1e6fe39/SHA1:aa842910bb7d3eb1dc823b53e318b4a8c1e6fe39
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 2005 Walk The Line.txt -> ../.git/annex/objects/SHA1:93f4de8f77792811c102c4cc4f200aebdaa40f8b/SHA1:93f4de8f77792811c102c4cc4f200aebdaa40f8b
lrwxrwxrwx 1 craig craig 113 Dec 25 10:44 2005WTL.jpg -> ../.git/annex/objects/SHA1:1c543478553fb50bb2057338c20bfa144b30e969/SHA1:1c543478553fb50bb2057338c20bfa144b30e969

The file the link points to is read-only:

19:26:31 [1013]; stat 01\ -\ Get\ Rhythm.mp3
  File: `01 - Get Rhythm.mp3' -> `../.git/annex/objects/SHA1:f09f515661af71e6dd4d5e64eaecf107e479fb09/SHA1:f09f515661af71e6dd4d5e64eaecf107e479fb09'
  Size: 113             Blocks: 0          IO Block: 4096   symbolic link
Device: fe02h/65026d    Inode: 3846251     Links: 1
Access: (0777/lrwxrwxrwx)  Uid: ( 1000/   craig)   Gid: ( 1000/   craig)
Access: 2010-12-25 10:44:18.000000000 -0800
Modify: 2010-12-25 10:44:18.000000000 -0800
Change: 2010-12-25 10:44:18.000000000 -0800

19:27:12 [1014]; stat -L 01\ -\ Get\ Rhythm.mp3                                                                                                                                                 
  File: `01 - Get Rhythm.mp3'
  Size: 2351421         Blocks: 4604       IO Block: 4096   regular file
Device: fe02h/65026d    Inode: 2883333     Links: 1
Access: (0444/-r--r--r--)  Uid: ( 1000/   craig)   Gid: ( 1000/   craig)
Access: 2006-02-03 20:16:04.000000000 -0800
Modify: 2006-02-03 20:16:04.000000000 -0800
Change: 2010-12-25 10:44:18.000000000 -0800

Now that all my files are symlinks to read-only files, my amarok collection is empty.  I can play the files through the symlinks fine in JuK, so there's nothing wrong with the files themselves.

Reproducible: Always

Steps to Reproduce:
Set up a collection where the music files are actually symlinks to read-only files.  Rebuild the collection and see that it's empty.

Actual Results:  
The collection is empty.

Expected Results:  
The collection should treat each symlink as the file that is being pointed to and populate the collection.

OS: Linux (i686) release 2.6.32-5-686
Compiler: cc
Comment 1 Myriam Schweingruber 2011-01-04 15:28:52 UTC
I don't think this is possible in Amarok anyway, moving to wishlist.
Comment 2 Craig Howard 2011-01-05 04:30:57 UTC
I'm surprised to hear that.  On a unix platform, the client software has to take special action to see the symlink instead of the contents of the file.  I dug in a bit.

In CollectionScanner::readDir(), this seems to be the line killing me:

        const QFileInfo &f = fi.isSymLink() ? QFileInfo( fi.symLinkTarget() ) : fi;

I found some old discussions about being careful with multiple symlinks pointing to the same file.  Could there not be a config option in amarok (and a corresponding flag in amarokcollectionscanner) that determines whether this line is run, for those who know their symlinks each point to a single file or who prefer proper symlink semantics?  I'm pretty sure this would solve my problem, but I don't have a dev environment set up to test it at the moment.  Am I missing something?

I'm surprised to see this marked as wishlist, as this particular bug makes amarok unusable for me (my collection has zero items).  The old discussions seem to indicate amarok 1.x had this ability.
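
To make the suggested option concrete, here is a minimal sketch of the two behaviours. It uses C++17 std::filesystem rather than Qt's QFileInfo (a swapped-in API, chosen only so the example is self-contained), and the names SymlinkPolicy and recordedPath are hypothetical; nothing like them exists in Amarok as far as this report shows.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical policy flag standing in for the proposed config option.
enum class SymlinkPolicy { ResolveTarget, KeepLinkPath };

// Returns the path a scanner would record for one directory entry.
// ResolveTarget mimics the quoted line: a link is replaced by its target.
// KeepLinkPath leaves the entry as it appears in the collection folder.
fs::path recordedPath(const fs::path& entry, SymlinkPolicy policy)
{
    assert(!entry.empty());
    // fs::is_symlink() inspects the link itself (lstat semantics).
    if (policy == SymlinkPolicy::ResolveTarget && fs::is_symlink(entry))
        return fs::canonical(entry);   // follows the link to the real file
    return fs::absolute(entry);        // makes the path absolute, keeps the link name
}
```

With the layout from the description, ResolveTarget would record a path under .git/annex/objects/ whose file name is the SHA1 key with no .mp3 suffix, while KeepLinkPath would record "01 - Get Rhythm.mp3". Whether that difference is what empties the collection is an assumption, not something confirmed in this report.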
Comment 3 Myriam Schweingruber 2011-01-05 11:26:17 UTC
Well, Amarok 2.x is a complete rewrite, so functions not yet implemented belong on the wishlist. I notified the developer.
Comment 4 Ralf Engels 2011-01-05 12:02:46 UTC
Hi,
The collection scanner does follow symlinks.
We have had a problem with that just yesterday where a user placed a symlink to a very slow NTFS partition inside his collection.

However I did not verify that links to files work. I will do that right now.
Comment 5 Ralf Engels 2011-01-05 12:53:40 UTC
Just checked it.
It's working all right with the 2.4 beta.

Links to read-only audio files: no problem.
Comment 6 Myriam Schweingruber 2011-01-05 14:36:13 UTC
Closing as fixed in the upcoming Amarok 2.4, to be released later this month.
Comment 7 dim.kde 2011-03-15 14:22:03 UTC
Hello,

It seems to me that this problem was solved in v2.3.2 but was reintroduced in v2.4.0:

/tmp/amarok% git checkout v2.3.1
Previous HEAD position was a38b0e5... bump plugin-version for 2.3.2
HEAD is now at 25111af... bump plugin version for release
/tmp/amarok% grep -R isSymLink *           
utilities/collectionscanner/CollectionScanner.cpp:        const QFileInfo &f = fi.isSymLink() ? QFileInfo( fi.symLinkTarget() ) : fi;
/tmp/amarok% git checkout v2.3.2 
Previous HEAD position was 25111af... bump plugin version for release
HEAD is now at a38b0e5... bump plugin-version for 2.3.2
/tmp/amarok% grep -R isSymLink * 
utilities/collectionscanner/CollectionScanner.cpp:        if( !fi.exists() || ( fi.isSymLink() && !QFileInfo( fi.symLinkTarget() ).exists() ) )
/tmp/amarok% git checkout v2.4.0 
Previous HEAD position was a38b0e5... bump plugin-version for 2.3.2
HEAD is now at b52a7b6... bump plugin version for 2.4.0
/tmp/amarok% grep -R isSymLink * 
utilities/collectionscanner/Directory.cpp:        const QFileInfo &f = fi.isSymLink() ? QFileInfo( fi.symLinkTarget() ) : fi;
utilities/collectionscanner/CollectionScanner.cpp:        const QFileInfo &f = fi.isSymLink() ? QFileInfo( fi.symLinkTarget() ) : fi;


Thanks,

Dimitri
Comment 8 Ralf Engels 2011-03-16 22:08:06 UTC
Dimitri,
I checked this problem with 2.4 and haven't changed anything substantial since.

I also don't see what a grep over the different Amarok versions should prove.
Can you please execute the amarok_collectionscanner for such a directory and tell me if it finds the files?

Also, since you seem to be proficient with git, you could try out the git version and report if the error still occurs.
Comment 9 dim.kde 2011-03-17 14:59:17 UTC
Created attachment 58121
Symlink Collection Scanner Test
Comment 10 dim.kde 2011-03-17 15:00:04 UTC
Hello,

OK, I have tested versions 2.3.1, 2.3.2, 2.4.0 and 2.4-GIT. First I set up a directory with 2 mp3 files, one in the root path and the other in a subdirectory. Then I ran amarokcollectionscanner -r for each version (2.3.1, 2.3.2, 2.4.0 and 2.4-GIT) and everything worked well. After that, I set up git annex, so the mp3 files are only symlinks, and ran amarokcollectionscanner again. The result seems to be good only with version 2.3.2.

That's why I was using grep in my previous comment.

The results of my test are in the attachment.

Thanks for your help,

Dimitri
Comment 11 dim.kde 2011-03-21 11:27:38 UTC
Created attachment 58207
Patch that seems to correct the symlink problem

Hello,

As I really wanted to use the symlinks of git-annex, I searched a little and found that this commit by Jeff Mitchell (at the bottom) had resolved the problem the first time. As that modification was squashed, I tried to backport it to version 2.4.0 and it seems to work successfully for me. The patch is attached.

As I'm not a developer, and even less a C++ developer, you should be careful with these modifications ;).

Thank you,

Dimitri

commit 30c85f027557318d1c64b6b19c46903f62eaa4e4
Author: Jeff Mitchell <mitchell@kde.org>
Date:   Fri Aug 27 16:51:28 2010 -0400

    Resolving directories' symlinks during scanning meant that directories could be seen as "out of the collection" and show up during full scans but disappear during incremental scans.

    Thanks to tampakrap (Theo Chatzimichos) for helping track this down.

diff --git a/utilities/collectionscanner/CollectionScanner.cpp b/utilities/collectionscanner/CollectionScanner.cpp
index f0cef67..cbe55a1 100644
--- a/utilities/collectionscanner/CollectionScanner.cpp
+++ b/utilities/collectionscanner/CollectionScanner.cpp
@@ -379,12 +379,10 @@ CollectionScanner::readDir( const QString& dir, QStringList& entries )
     QStringList recurseDirs;
     foreach( const QFileInfo &fi, list )
     {
-        if( !fi.exists() )
+        if( !fi.exists() || ( fi.isSymLink() && !QFileInfo( fi.symLinkTarget() ).exists() ) )
             break;
 
-        const QFileInfo &f = fi.isSymLink() ? QFileInfo( fi.symLinkTarget() ) : fi;
-
-        if( f.isDir() && m_recursively && !m_scannedFolders.contains( f.canonicalFilePath() ) )
+        if( fi.isDir() && m_recursively && !m_scannedFolders.contains( fi.absoluteFilePath() ) )
         {
             //The following D-Bus call is used to see if a found folder is new or not
             //During an incremental scan the scanning isn't really recursive, as all folders
@@ -395,16 +393,16 @@ CollectionScanner::readDir( const QString& dir, QStringList& entries )
             bool isInCollection = false;
             if( m_incremental && m_amarokCollectionInterface )
             {
-                QDBusReply<bool> reply = m_amarokCollectionInterface->call( "isDirInCollection", f.canonicalFilePath() );
+                QDBusReply<bool> reply = m_amarokCollectionInterface->call( "isDirInCollection", fi.absoluteFilePath() );
                 if( reply.isValid() )
                     isInCollection = reply.value();
             }
 
             if( !m_incremental || !isInCollection )
-                recurseDirs << QString( f.absoluteFilePath() + '/' );
+                recurseDirs << QString( fi.absoluteFilePath() + '/' );
         }
-        else if( f.isFile() )
-            entries.append( f.absoluteFilePath() );
+        else if( fi.isFile() )
+            entries.append( fi.absoluteFilePath() );
     }
     foreach( const QString &dir, recurseDirs )
         readDir( dir, entries );
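
The idea behind the patch above can be sketched outside Amarok roughly as follows. This is an illustration under stated assumptions, not the scanner's actual code: it uses C++17 std::filesystem instead of Qt, the function names readDir and scan are reused or invented for illustration, the D-Bus incremental-scan check is left out, a dangling link is skipped with `continue` where the patch uses `break`, and the cycle guard on the canonical directory path is an addition that is not in the patch.

```cpp
#include <filesystem>
#include <set>
#include <vector>

namespace fs = std::filesystem;

// Collects files below `dir`, recording each entry under the path it has
// inside the collection, even when that entry is a symlink.
void readDir(const fs::path& dir, bool recursive,
             std::set<fs::path>& visitedDirs, std::vector<fs::path>& entries)
{
    for (const fs::directory_entry& e : fs::directory_iterator(dir)) {
        // fs::exists() follows symlinks, so a dangling link reports false.
        if (!fs::exists(e.path()))
            continue;  // assumption: skip only this entry (the patch breaks out)

        if (fs::is_directory(e.path())) {            // also true for a link to a directory
            // Assumption: remember the real location to avoid symlink cycles.
            if (recursive && visitedDirs.insert(fs::canonical(e.path())).second)
                readDir(e.path(), recursive, visitedDirs, entries);
        } else if (fs::is_regular_file(e.path())) {  // also true for a link to a file
            entries.push_back(e.path());             // link path, not the target
        }
    }
}

// Convenience wrapper: scans `root` and returns the recorded file paths.
std::vector<fs::path> scan(const fs::path& root, bool recursive)
{
    std::set<fs::path> visited{fs::canonical(root)};
    std::vector<fs::path> entries;
    readDir(root, recursive, visited, entries);
    return entries;
}
```

Called on an album directory like the one in the description, scan() would return the "NN - Title.mp3" link paths rather than paths under .git/annex/objects/, which is the behaviour the reporter asked for.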
Comment 12 Matthias Pfafferodt 2012-09-26 19:30:23 UTC
I have the same problem as dim.kde also using git annex. Could using symlinks in the collection be added as an option?
Comment 13 Matthias Pfafferodt 2012-09-27 19:25:19 UTC
I have now checked it with different settings: one has to include the ./.git/annex dir with the 'real' files in the collection, and then it works!
Comment 14 Matthias Pfafferodt 2012-09-29 17:23:39 UTC
It works because the annexed files are included in the collection and sorted by their embedded tags. The links are not used at all. This helps as long as all files have tags ... For some files without tags, an option to follow symlinks would be helpful. But the current situation works OK.