Version: 1.7 (using KDE 4.7.2)
OS: Linux

Dolphin runs pdftotext on PDF files too often. Each time the user hovers over a PDF, a "pdftotext" process is started; with a large PDF collection this can happen for every single file hovered over, and the CPU cost depends on how long each file takes to process.

Reproducible: Sometimes

Steps to Reproduce:
Get a lot of PDF files (including some really big ones), then hover over many of them. Dolphin starts a "pdftotext" process for each, which consumes noticeable CPU, and large PDFs can take a long time to finish. This happens fairly consistently. A possible improvement would be to cache the extracted text, so the same files are not processed repeatedly, and to avoid running the extraction over many files at once (which can happen with a folder full of PDF books).

Actual Results: A lot of CPU is consumed for a long time as pdftotext is started for many PDF files, each of which can take a long time to finish.

Expected Results: The extracted text should be cached in some way, so that whatever CPU is used is not spent again every time the same PDF files are hovered over.
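The caching idea suggested above could look roughly like the following minimal sketch. This is not Dolphin's actual code; `TextPreviewCache` and `fake_extract` are hypothetical names, and the stub extractor merely stands in for the expensive pdftotext run. The cache key includes the file's modification time and size, so a changed file is re-extracted while repeated hovers over an unchanged file hit the cache.

```python
import os
import tempfile


class TextPreviewCache:
    """Caches extracted text keyed by (path, mtime, size), so a file
    is only processed once unless it changes on disk (hypothetical)."""

    def __init__(self, extractor):
        self._extractor = extractor  # callable: path -> text
        self._cache = {}

    def get_text(self, path):
        st = os.stat(path)
        key = (path, st.st_mtime_ns, st.st_size)
        if key not in self._cache:
            # Only here would the expensive pdftotext run happen.
            self._cache[key] = self._extractor(path)
        return self._cache[key]


calls = []


def fake_extract(path):
    """Stand-in for running pdftotext; records each invocation."""
    calls.append(path)
    with open(path) as f:
        return f.read()


# Demonstrate: two lookups, but only one extraction.
with tempfile.NamedTemporaryFile("w", suffix=".pdf", delete=False) as f:
    f.write("hello")
    name = f.name

cache = TextPreviewCache(fake_extract)
assert cache.get_text(name) == "hello"   # first hover: extraction runs
assert cache.get_text(name) == "hello"   # second hover: served from cache
assert len(calls) == 1

os.unlink(name)
```

A real implementation would also need to bound the cache size and persist it across sessions to help with large collections, but the lookup-by-(path, mtime, size) pattern is the core of the suggestion.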
Thanks for the report. Actually, the bug should be assigned to the PDF plugin that analyzes the PDFs, as how the parsing is done is out of scope for Dolphin. But there are no clear maintainers for those plugins, so let's keep this assigned to Dolphin. I don't plan to implement a custom caching algorithm for this, as we already have one: Nepomuk. However, I understand that due to past issues with the indexer, not everyone wants to enable Nepomuk (it looks like the situation should get a lot better with 4.8 thanks to recent fixes, but that's another story). So I'll leave this issue open in the hope that someone might want to check whether a more efficient approach to PDF parsing can be used.
This has been improved AFAIK in the corresponding analyzer in the meantime.