Version: (using Devel) Search performance in Okular seems to have improved somewhat recently, thank you for that! Searching large PDFs (several hundred pages or more, e.g. standards documents) can still take quite a long time though, regardless of whether the document has been searched before. Perhaps Okular could build up an index (or use some other technique) to speed up repeated searches on the same document?
Actually it's slower now than two months ago. Before, we cached all the pages' text after the first page; now we only cache N pages, because the previous setup meant exhausting system memory quite easily. If you want faster searches, set the memory options to aggressive so more pages are kept cached in memory.
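To illustrate the trade-off described above, here is a minimal sketch of a bounded page-text cache. It is not Okular's code: the class `PageTextCache`, the `extract_text` callback, and the least-recently-used eviction policy are all assumptions made for the example. It shows one way a "cache only N pages" policy can end up giving no speedup at all on repeated full-document searches.

```python
from collections import OrderedDict


class PageTextCache:
    """Hypothetical cache holding the text of at most `max_pages` pages."""

    def __init__(self, extract_text, max_pages):
        self._extract_text = extract_text  # assumed slow: parses one page
        self._max_pages = max_pages
        self._pages = OrderedDict()        # page number -> text, oldest first
        self.extractions = 0               # how often the slow path was taken

    def text(self, page_number):
        if page_number in self._pages:
            # Cache hit: mark the page as most recently used.
            self._pages.move_to_end(page_number)
            return self._pages[page_number]
        # Cache miss: extract the text and evict the oldest page if full.
        self.extractions += 1
        text = self._extract_text(page_number)
        self._pages[page_number] = text
        if len(self._pages) > self._max_pages:
            self._pages.popitem(last=False)
        return text


def search(cache, page_count, needle):
    """Return the page numbers whose text contains `needle`."""
    return [p for p in range(page_count) if needle in cache.text(p)]


if __name__ == "__main__":
    pages = [f"text of page {n}" for n in range(10)]

    small = PageTextCache(lambda n: pages[n], max_pages=3)
    search(small, len(pages), "page 7")
    search(small, len(pages), "page 7")
    print("small cache extractions:", small.extractions)

    large = PageTextCache(lambda n: pages[n], max_pages=len(pages))
    search(large, len(pages), "page 7")
    search(large, len(pages), "page 7")
    print("large cache extractions:", large.extractions)
```

Under these assumptions, a search that walks the pages in order evicts each page before the next search reaches it again, so a cache smaller than the document never produces a hit. A cache large enough for every page pays the extraction cost only once.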
I find search performance unsatisfactory too, especially compared with KPDF. Take this rather large document for example: http://www.adobe.com/products/postscript/pdfs/PLRM.pdf . In KPDF, the first search (in the "thumbnails" pane) takes around 10 seconds on my machine, and successive searches are nearly instantaneous. In Okular, however, every search takes around 24 seconds, no matter whether it's the first one or not. The memory usage policy in both programs is "aggressive". Actual memory usage after 5 searches in the above document is 171MB for KPDF and 65MB for Okular. I think Okular is just being too modest here: if I tell it that it's OK to gobble up memory in order to provide faster responses, then it should just go ahead and do that. In other words: I'd really like to get KPDF's instant searches back.
Hello to 12 years ago! I wrote a quick script to check how fast searching is when an index is first built from scratch, and found that it’s nearly as slow as an *early* search in Okular. At some point, Okular seems to have built a text search index, but it takes much longer than the 21 seconds my script needs. I had Okular open for at least some minutes before the search got fast (maybe even longer, I spent half an hour writing the script). Maybe the index only gets built once the visual page cache is full or so? Tell me if I’m wrong, but I think we could improve search by building the index ASAP. https://gist.github.com/flying-sheep/27f99747f85abb20bab7dc732abe3f6a

$ ./pdf_search.py '/home/phil/Dropbox/RPG/DSA/DSA 5/VR7 - Aventurische Magie III (2018).pdf' Geode
Index time : 0:00:21.465361
Search time: 0:00:00.001145
[31, 32, 33, 34, 35, 36, 37, 38, 126, 166]
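For readers who want to see the general idea of "index once, then search instantly", here is a minimal sketch of a word-to-pages index. It is not the linked script and not Okular's implementation. The names `build_index` and `search` are made up for the example, and it assumes the text of each page has already been extracted into a list of strings, which is likely the expensive part in practice.

```python
import re
from collections import defaultdict

# Assumption for this sketch: a "word" is any run of word characters.
WORD = re.compile(r"\w+")


def build_index(page_texts):
    """Map each lowercased word to the set of 1-based pages containing it."""
    index = defaultdict(set)
    for page_number, text in enumerate(page_texts, start=1):
        for word in WORD.findall(text.lower()):
            index[word].add(page_number)
    return index


def search(index, word):
    """Return the sorted page numbers on which `word` occurs."""
    return sorted(index.get(word.lower(), ()))


if __name__ == "__main__":
    # Stand-in page texts; a real tool would extract these from the PDF.
    pages = [
        "The Geode spell is described here.",
        "Nothing relevant on this page.",
        "More about the geode, and again: Geode.",
    ]
    index = build_index(pages)
    print(search(index, "Geode"))
```

Building the index touches every page once, and each later lookup is a single dictionary access. The sketch only matches whole words, so substring or phrase searches would need a different structure.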