Bug 284314 - Dolphin eats up lots of CPU when hovering over many PDF files
Summary: Dolphin eats up lots of CPU when hovering over many PDF files
Status: RESOLVED FIXED
Alias: None
Product: dolphin
Classification: Applications
Component: general
Version: 1.7
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Peter Penz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-10-18 01:12 UTC by James Roe
Modified: 2011-12-17 21:48 UTC

See Also:
Latest Commit:
Version Fixed In: 4.8.0
Sentry Crash Report:


Description James Roe 2011-10-18 01:12:00 UTC
Version:           1.7 (using KDE 4.7.2) 
OS:                Linux

Dolphin runs pdftotext on PDF files too often. Hovering over a PDF file starts a new extraction, so in a large PDF collection this can happen for every single file hovered over. How much of a problem this is depends on how long each file takes to process and how often the user hovers over the files.

Reproducible: Sometimes

Steps to Reproduce:
Get a lot of PDF files (include some really big ones), then hover over a whole bunch of them. Dolphin then starts a "pdftotext" process, which eats CPU for as long as it takes to process the PDF(s) (this can take a long time, especially with many PDF files). This happens fairly consistently. Perhaps the extracted text could be cached so that the same files are not processed repeatedly, and the extraction could avoid running over many files at once (which can happen if you have a bunch of PDF books in one folder).

Actual Results:  
Lots of CPU eaten for a long time as the pdftotext process gets started for many PDF files (which take a long time to finish).

Expected Results:  
The extracted text is cached in some way, so that if CPU has to be used, it is not used every single time you hover over the same PDF files.
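
To illustrate the kind of caching meant here, a minimal sketch in Python follows. It is not how Dolphin or its PDF analyzer works. The class TextCache and the function run_pdftotext are hypothetical names made up for this example, and the only real external piece assumed is the pdftotext command-line tool, called with "-" as the output file so the text goes to stdout. The cache is keyed by path and invalidated when the file's modification time or size changes, so hovering over an unchanged file a second time does not start another process.

```python
import os
import subprocess
from typing import Callable, Dict, Tuple


def run_pdftotext(path: str) -> str:
    """Extract text with the pdftotext CLI; "-" sends the output to stdout."""
    result = subprocess.run(
        ["pdftotext", path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


class TextCache:
    """Hypothetical in-memory cache of extracted text, one entry per path."""

    def __init__(self, extract: Callable[[str], str] = run_pdftotext) -> None:
        self._extract = extract
        # path -> ((mtime in ns, size in bytes), extracted text)
        self._entries: Dict[str, Tuple[Tuple[int, int], str]] = {}

    def text_for(self, path: str) -> str:
        st = os.stat(path)
        stamp = (st.st_mtime_ns, st.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == stamp:
            # File unchanged since the last extraction: reuse the text.
            return cached[1]
        # New or modified file: extract once and remember the result.
        text = self._extract(path)
        self._entries[path] = (stamp, text)
        return text
```

With something like this, a call such as TextCache().text_for("book.pdf") would run pdftotext only on the first hover and after the file changes. It does nothing about many different files being processed at once; that would need a separate limit on concurrent extractions.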
Comment 1 Peter Penz 2011-10-18 07:08:26 UTC
Thanks for the report. Actually the bug should be assigned to the PDF plugin that analyzes the PDFs, as how the parsing is done is out of scope for Dolphin. But there are no clear maintainers for those plugins, so let's keep this assigned to Dolphin.

I don't plan to implement a custom caching algorithm for this as we already have one: Nepomuk. However, I understand that due to past issues with the indexer not everyone wants to enable Nepomuk (it looks like the situation should get a lot better with 4.8 due to recent fixes, but that's another story).

So I'll leave this issue open in the hope that someone might want to check whether a more efficient approach for PDF parsing can be used.
Comment 2 Peter Penz 2011-12-17 21:48:23 UTC
This has been improved AFAIK in the corresponding analyzer in the meantime.