Bug 141743 - Fails to clean up .kde/share/apps/kpdf
Summary: Fails to clean up .kde/share/apps/kpdf
Status: RESOLVED INTENTIONAL
Alias: None
Product: kpdf
Classification: Applications
Component: general
Version: 0.5.5
Platform: Fedora RPMs Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Albert Astals Cid
Duplicates: 187383
Reported: 2007-02-15 15:57 UTC by Paul Almquist
Modified: 2009-03-17 19:35 UTC
Description Paul Almquist 2007-02-15 15:57:58 UTC
Version:           0.5.5 (using KDE 3.5.5)
Installed from:    Fedora RPMs
OS:                Linux

Every program needs to clean up its temporary files.  I just found over 500 entries in the .kde/share/apps/kpdf directory dating back about 8 months.  There is no reason to keep all of this old stuff.  Maybe keep the newest 10, or match the list of recently opened pdf files.
Comment 1 Pino Toscano 2007-02-15 16:07:45 UTC
I see you have absolutely no idea what those files are.
First of all, they are *NOT* temporary files.
Each of them contains a set of metadata for a specific PDF file, like the history of the last 10 scrolled positions, the bookmarked pages, and so on. They are used and saved for every PDF you open.
Comment 2 Paul Almquist 2007-02-15 19:05:53 UTC
Pino-

Thanks for your quick reply, and more so thank you for all of the effort you 
have put into kpdf. It's a great program and I use it frequently. I used to 
use xpdf but like kpdf better.  I have acroread but rarely use it.


I did look at the content of the files but the info was not meaningful to me. 
True, I have "absolutely no idea" how they are used.   But why is it 
necessary to have them "saved for every PDF you open."??  Many of those pdf 
files I opened months ago have been deleted.  This ancillary information is 
no longer needed.   When the pdf file is deleted then this support data 
should be deleted.  However, it would be quite difficult to maintain a link 
between them short of embedding it in the pdf file, and that is not likely an 
option.

I just did a little research.  Right now I have 732 pdf files in my account on 
my workstation and a few others in some system documentation directories.  
There are 514 xml files in ~/.kde/share/apps/kpdf.  Of these 514 files 326 are 
for pdf files that no longer exist on my workstation.  About 65% of these xml 
files are obsolete.   This is since June 15, 2006 when I did a fresh install 
of Fedora Core 5 and began using kpdf. 

I know disk space is cheap but every one of those little files costs a 
directory entry, an inode, and data space not only on my workstation but they 
are also backed up daily costing cpu and network time and backup storage 
space.  Multiply the 65% by the number of xml files and that by the number of 
kpdf users worldwide.  That amounts to lots of time and space for obsolete 
data.  I suspect that most users and sys admins are not even aware of their 
existence.  I used to be a sys admin on a network that had about 500 Linux/KDE 
users.  All of those accounts were backed up daily.

There has to be a house cleaning mechanism for these "*NOT* temporary" but 
obsolete files.

paul

Comment 3 Pino Toscano 2007-02-15 19:17:30 UTC
I already said why we write those auxiliary files.
And no, we can do *nothing* about that, as the only info we have is the file name (not the path, just the name) and its size, thus we have *no way* to know if a PDF file has been deleted.
And even if they are old, the PDF they refer to can still be valid, so the modification date of the files is not usable information.
Comment 4 ndeb 2007-08-05 00:35:03 UTC
Please see my comment in http://bugs.kde.org/show_bug.cgi?id=130496#c4 .
Comment 5 ndeb 2007-08-05 00:39:40 UTC
I agree with Pino Toscano that it should not be kpdf's responsibility to *automatically* clean them up. But as I suggested in bug 130496, an *option* should be provided in kcmprivacy to clean up these files.
Comment 6 Pino Toscano 2007-09-01 20:15:03 UTC
For the reasons expressed in the previous comments, and given that doing that check every time KPDF loads would be needless overhead, I close this as WONTFIX. There's bug 130496, which should be the right way of making your system cleaner.
Comment 7 Toralf Förster 2008-05-08 19:26:38 UTC
I had 117 files in my ~/.kde/share/apps/kpdf, today I got 4 new files:

tfoerste@n22 ~/.kde/share/apps/kpdf $ ll
total 16
-rw-r--r-- 1 tfoerste users 156 May  8 11:31 3831.sudoku.pdf.xml
-rw-r--r-- 1 tfoerste users 156 May  8 11:35 3832.sudoku-2.pdf.xml
-rw-r--r-- 1 tfoerste users 156 May  8 11:35 3834.sudoku-3.pdf.xml
-rw-r--r-- 1 tfoerste users 156 May  8 11:31 3837.sudoku-1.pdf.xml

here's an example:

tfoerste@n22 ~/.kde/share/apps/kpdf $ cat 3831.sudoku.pdf.xml
<!DOCTYPE documentInfo>
<documentInfo>
 <bookmarkList/>
 <generalInfo>
  <history>
   <current viewport="0" />
  </history>
 </generalInfo>
</documentInfo>
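
A rough sketch of how the check Paul describes in Comment 2 could be scripted outside KPDF. It assumes, based only on the listing above and on Comment 3, that each metadata file is named `<size in bytes>.<PDF file name>.xml`. The function names `parse_metadata_name` and `find_orphans`, and the idea of scanning a chosen set of directories, are illustrative and not something KPDF provides.

```python
import os
from pathlib import Path


def parse_metadata_name(xml_name):
    """Split '<size>.<pdf name>.xml' into (size, pdf_name); None if it does not fit.

    The naming scheme is an assumption inferred from the directory listing.
    """
    if not xml_name.endswith(".xml"):
        return None
    stem = xml_name[: -len(".xml")]
    size_text, sep, pdf_name = stem.partition(".")
    if not sep or not pdf_name:
        return None
    if not (size_text.isascii() and size_text.isdigit()):
        return None
    return int(size_text), pdf_name


def find_orphans(metadata_dir, search_roots):
    """Return metadata file names with no file of the same name and size under search_roots."""
    present = set()
    for root in search_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                try:
                    size = os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    continue  # unreadable or vanished file: ignore it
                present.add((size, name))

    orphans = []
    for entry in sorted(Path(metadata_dir).iterdir()):
        key = parse_metadata_name(entry.name)
        if key is not None and key not in present:
            orphans.append(entry.name)
    return orphans
```

A PDF that lives on an unmounted disk or outside the scanned directories would be reported as orphaned, which is the limitation raised in Comment 3, so the result is a list of candidates to review rather than files that are safe to delete.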
Comment 8 Matthew Flaschen 2008-11-13 16:00:36 UTC
I think you are greatly exaggerating the overhead.  There should be an *option*, such as:

Automatically delete metadata XML files that are more than ____ days old.

This can then be implemented on startup or shutdown.  As for performance, let me throw some figures out:

I had 2664 XML files older than 90 days, out of a total of 2801.  I think that's a reasonable benchmark.  Then, it takes me 0.142 s to delete all of the files older than 90 days with:

time find -mtime +90 -exec rm {} \+

That's with 2664 files to handle.  If it were run more often, the total time would eventually be greater, but the marginal time would become very small.

Also, I know KPDF would probably use direct library calls.  This is just a proof of concept.
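
For comparison, here is a minimal sketch of the same age-based cleanup done with library calls instead of `find`. It is written in Python rather than KPDF's own code, and the function name `delete_old_metadata` and its `dry_run` parameter are hypothetical. Unlike the `find` command above, it only looks at `*.xml` files directly inside the given directory, and it compares exact modification times rather than using the whole-day rounding of `-mtime`.

```python
import os
import time


def delete_old_metadata(metadata_dir, max_age_days, now=None, dry_run=True):
    """Remove *.xml files last modified more than max_age_days ago.

    Returns the sorted names of the files that were (or, with dry_run,
    would be) removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day

    # Collect candidates first so nothing is deleted while the directory is being read.
    old_files = []
    with os.scandir(metadata_dir) as entries:
        for entry in entries:
            if not entry.name.endswith(".xml"):
                continue
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                old_files.append((entry.name, entry.path))

    if not dry_run:
        for _name, path in old_files:
            os.remove(path)
    return sorted(name for name, _path in old_files)
```

Calling it with `dry_run=True` first shows what would go, which matters here because, as Comment 3 points out, an old metadata file can still belong to a PDF that exists.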

I have no problem with doing it through kcmprivacy (though I don't see this as a privacy issue so much as a "delete stuff that isn't useful anymore" issue).  But KPDF documentation should tell the user how to do it.  Ideally, the option should be configurable without leaving KPDF.
Comment 9 Pino Toscano 2009-03-17 10:42:33 UTC
*** Bug 187383 has been marked as a duplicate of this bug. ***
Comment 10 Toralf Förster 2009-03-17 19:35:01 UTC
I'm wondering whether this feature (cleaning kpdf XML files from the privacy cleanup menu) is already implemented in KDE 4.2.x or not.