Bug 269300 - Collection scanner opens files O_RDWR
Summary: Collection scanner opens files O_RDWR
Status: RESOLVED DUPLICATE of bug 269395
Alias: None
Product: taglib
Classification: Frameworks and Libraries
Component: general (show other bugs)
Version: 1.6.3
Platform: Unlisted Binaries Linux
: NOR normal
Target Milestone: ---
Assignee: Scott Wheeler
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-03-24 11:35 UTC by Manuel Amador (Rudd-O)
Modified: 2011-03-26 10:41 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Manuel Amador (Rudd-O) 2011-03-24 11:35:24 UTC
Version:           2.4-GIT (using KDE 4.6.1) 
OS:                Linux

The collection scanner opens scanned files with mode O_RDWR.  This:

- contradicts the principle of least privilege (e.g., a hard-to-foresee bug in the scanner or hardware glitch could conceivably nuke a file)
- makes it impossible for the network caching fscache infrastructure to cache read files from a network device such as a NFS or CIFS share. 

I suggest that the scanner only open files readonly, and only when the time comes to write to the file (IF and ONLY IF it needs to write to a file), reopen the file O_RDWR.  This has a dual benefit:

1) It reduces risk of bugs that may come in the future.
2) It enables Amarok to take advantage of network filesystem caching, making future scans *dramatically* faster (especially on high-latency or low-bandwidth scenarios).

As an aside, now that the kernel supports posix_fadvise, it would be a good time to investigate making the scanner use FADVISE_DONTNEED after it's done with a file.  That way the kernel does not need to evict pages used by applications from the pagecache, which results in a system that is much more responsive after a collection scan, ESPECIALLY on low-memory machines.

Reproducible: Always




OS: Linux (x86_64) release 2.6.38karen.dragonfear
Compiler: gcc
Comment 1 Ralf Engels 2011-03-25 15:40:19 UTC
After checking the code I found that taglib is opening the files like this:

 // First try with read / write mode, if that fails, fall back to read only.
taglib/toolkit/tfile.cpp:90

Taglib needs to do it like this because it can't know if the caller would later on try to modify tags.
Also it doesn't know if the track is just afterwards used for playback.

Can you report this issue against taglib?
It would be easy to tell taglib beforehand that we only want to read the file and not re-use it afterwards.
Comment 2 Manuel Amador (Rudd-O) 2011-03-25 15:58:53 UTC
Why can't TagLib know when to reopen the file?  To the extent that I know, as soon as one of the methods that actually modifies the file is called, that's how TagLib knows that it needs write access to the file.  If it needs to do a permissions check *beforehand* to know if the file is modifiable, then TagLib can use access() for that.  Already amarok does a metric ton of access()es already...

I will upstream this.
Comment 3 Ralf Engels 2011-03-25 16:02:51 UTC
Of course TagLib could reopen the file once it needs to actually write it.
However it's currently not doing this.
Comment 4 Myriam Schweingruber 2011-03-26 09:59:12 UTC
reassigning
Comment 5 Lukáš Lalinský 2011-03-26 10:41:11 UTC
And now it's a duplicate :)

*** This bug has been marked as a duplicate of bug 269395 ***