Summary: | Most (but not all) of the PDF files cannot be handled correctly in Strigi (nepomukindexer cannot index them) | ||
---|---|---|---|
Product: | [Unmaintained] nepomuk | Reporter: | Vangelis <cyberang3l> |
Component: | fileindexer | Assignee: | Sebastian Trueg <sebastian> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | lacsilva, me, stephanolbrich |
Priority: | NOR | ||
Version: | 4.8 | ||
Target Milestone: | --- | ||
Platform: | Ubuntu | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: | One more of the PDFs fails to get indexed |
Description
Vangelis
2012-02-24 02:35:26 UTC
Created attachment 69048
One more of the PDFs fails to get indexed
I have the same problem. With xmlindexer I get a lot of information about the PDFs (metadata and content), but nepomukindexer returns without printing anything. Monitoring with sopranocmd --dbus org.kde.NepomukStorage --model main monitor shows nothing. The files in question show nothing when opened in nepomukshell and show no hash in Dolphin.

I guess that bugs #285128 and #234069 could be clustered in this one. This is a problem with the Strigi analyser. In the repo there are two branches with alternative analysers: "newPdfAnalyzer" and "popplerPdfAnalyzer". Although in an incomplete state, both of these alternatives produce better results than the default PDF analyser. Please, could any of the developers involved take a stab at pushing one of these alternatives as the default?

In KDE 4.10, we have moved away from Strigi and are using our own indexer based on poppler. I'm not marking this bug as fixed, as the indexer has not been thoroughly tested. It could still use some polish. I'll mark this as fixed when I have tested it adequately.

This new PDF analyzer works quite well :)