Version: 1.7 (using KDE 4.7.2)
OS: Linux

Dolphin runs pdftotext on PDF files too often. Each time the user hovers over a PDF, a "pdftotext" process is started; with a large PDF collection this can happen for every single file hovered over, and the CPU cost depends on how long each file takes to process.

Reproducible: Sometimes

Steps to Reproduce:
Get a lot of PDF files (including some really big ones), then hover over many of them. Dolphin starts a "pdftotext" process for each, which consumes noticeable CPU, and large PDFs can take a long time to finish. This happens fairly consistently. A possible improvement would be to cache the extracted text, so the same files are not processed repeatedly, and to avoid running the extraction over many files at once (which can happen with a folder full of PDF books).

Actual Results: A lot of CPU is consumed for a long time as pdftotext is started for many PDF files, each of which can take a long time to finish.

Expected Results: The extracted text should be cached in some way, so that whatever CPU is used is not spent again every time the same PDF files are hovered over.
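The caching idea suggested above could look roughly like the following minimal sketch. This is not Dolphin's actual code; `TextPreviewCache` and `fake_extract` are hypothetical names, and the stub extractor merely stands in for the expensive pdftotext run. The cache key includes the file's modification time and size, so a changed file is re-extracted while repeated hovers over an unchanged file hit the cache.

```python
import os
import tempfile


class TextPreviewCache:
    """Caches extracted text keyed by (path, mtime, size), so a file
    is only processed once unless it changes on disk (hypothetical)."""

    def __init__(self, extractor):
        self._extractor = extractor  # callable: path -> text
        self._cache = {}

    def get_text(self, path):
        st = os.stat(path)
        key = (path, st.st_mtime_ns, st.st_size)
        if key not in self._cache:
            # Only here would the expensive pdftotext run happen.
            self._cache[key] = self._extractor(path)
        return self._cache[key]


calls = []


def fake_extract(path):
    """Stand-in for running pdftotext; records each invocation."""
    calls.append(path)
    with open(path) as f:
        return f.read()


# Demonstrate: two lookups, but only one extraction.
with tempfile.NamedTemporaryFile("w", suffix=".pdf", delete=False) as f:
    f.write("hello")
    name = f.name

cache = TextPreviewCache(fake_extract)
assert cache.get_text(name) == "hello"   # first hover: extraction runs
assert cache.get_text(name) == "hello"   # second hover: served from cache
assert len(calls) == 1

os.unlink(name)
```

A real implementation would also need to bound the cache size and persist it across sessions to help with large collections, but the lookup-by-(path, mtime, size) pattern is the core of the suggestion.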
Thanks for the report. Actually, the bug should be assigned to the PDF plugin that analyzes the PDFs, as how the parsing is done is out of scope for Dolphin. But there are no clear maintainers for those plugins, so let's keep this assigned to Dolphin. I don't plan to implement a custom caching algorithm for this, as we already have one: Nepomuk. However, I understand that due to past issues with the indexer, not everyone wants to enable Nepomuk (it looks like the situation should get a lot better with 4.8 thanks to recent fixes, but that's another story). So I'll leave this issue open in the hope that someone might want to check whether a more efficient approach to PDF parsing can be used.
This has been improved AFAIK in the corresponding analyzer in the meantime.