336084 – Excessive delay in returning meta-data for non-indexed files.

Summary: Excessive delay in returning meta-data for non-indexed files.

Status:	RESOLVED FIXED

Alias:	None

Product:	Baloo
Classification:	Unmaintained
Component:	Widgets
Version:	4.13
Platform:	openSUSE Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Vishesh Handa

URL:
Keywords:

Duplicates (1):	338170
Depends on:
Blocks:

Reported:	2014-06-11 15:54 UTC by Paul
Modified:	2014-08-10 10:55 UTC
CC List:	1 user

See Also:
Latest Commit:	http://commits.kde.org/baloo/434e3ef2500f64eb3ac2a4f656b47724d04d9c6f
Version Fixed In:	5.0
Sentry Crash Report:

Description Paul 2014-06-11 15:54:20 UTC

When running with 'Desktop Search' disabled, 'libbaloowidgets4 / baloo_file_extractor' takes an excessive amount of time to return meta-data for non-indexed files.

For example, using this file: https://wiki.documentfoundation.org/images/3/35/WG40-WriterGuideLO.pdf

Opening it in Okular is almost instant, and there is no delay in displaying the meta-data (from Okular's 'Properties').

In Dolphin, mouse-over displays basic file information immediately, baloo_file_extractor then takes 100% of one processor core and, after approximately 8 seconds [1], the meta-data is displayed. (I wonder if it's actually needlessly indexing the entire file rather than just returning the meta-data.)

This makes the information panel in Dolphin completely impractical to use, at least for PDF files.

There is a more detailed discussion of this on the openSUSE forum. It's a long thread; this post would be a good starting point: https://forums.opensuse.org/showthread.php/498098-KDE-4-13-1-Dolphin-Information-Panel-No-Meta-Data?p=2645577#post2645577


[1] Using a relatively modest PC: AMD Athlon 64X2 5600+, 4GB RAM, and using an SSD

Comment 1 Vishesh Handa 2014-06-12 11:36:32 UTC

Confirmed.

The indexer is temporarily indexing the entire file, including the plain text, and not just the metadata. Hence the noticeable delay.

I'll try to improve stuff, so that in this case only the metadata is extracted.

Comment 2 Paul 2014-06-12 15:07:52 UTC

(In reply to comment #1)
> I'll try to improve stuff, so that in this case only the metadata is
> extracted.

Excellent - Thanks. :)

For future flexibility perhaps baloo_file_extractor should take arguments to indicate what to return...
All meta-data, Specific Named meta-data, No meta-data, File Content... that sort of idea, then a programme calling baloo_file_extractor could specify exactly what it wanted.
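
To illustrate the suggestion above, here is a minimal sketch of how a caller might state what it wants extracted. Everything in it is an assumption made for illustration: the `ExtractFlag` names, the `parseExtractList` helper, and the comma-separated "metadata,content" argument format are hypothetical and are not part of baloo_file_extractor's real interface. Selecting specific named meta-data is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical flags describing what a caller wants back.
// These names are illustrative only, not Baloo's actual API.
enum ExtractFlag : std::uint8_t {
    ExtractNothing  = 0,
    ExtractMetadata = 1 << 0,  // e.g. title, author, page count
    ExtractContent  = 1 << 1   // full plain text (the slow part for large PDFs)
};

// Parse a hypothetical comma-separated argument value such as
// "metadata,content" into a bitmask. Unknown tokens are ignored here.
std::uint8_t parseExtractList(const std::string &list)
{
    std::uint8_t flags = ExtractNothing;
    std::size_t start = 0;
    while (start <= list.size()) {
        std::size_t end = list.find(',', start);
        if (end == std::string::npos)
            end = list.size();
        const std::string token = list.substr(start, end - start);
        if (token == "metadata")
            flags |= ExtractMetadata;
        else if (token == "content")
            flags |= ExtractContent;
        start = end + 1;  // skip past the comma
    }
    return flags;
}
```

With something along these lines, a file manager's information panel could request only "metadata", while an indexer could request "metadata,content".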

Comment 3 Vishesh Handa 2014-07-01 14:55:15 UTC

Git commit 434e3ef2500f64eb3ac2a4f656b47724d04d9c6f by Vishesh Handa.
Committed on 01/07/2014 at 15:03.
Pushed by vhanda into branch 'frameworks'.

Extractor: Do not extract the plain text in --bdata mode

This is the mode that is used to temporarily extract the metadata. It's
used in the dolphin side panel. It doesn't make sense for us to extract
the plain text and then discard it. Extracting pdf metadata is now much
much faster.
FIXED-IN: 5.0

M  +6    -1    src/file/extractor/app.cpp
M  +2    -2    src/file/extractor/result.cpp
M  +1    -1    src/file/extractor/result.h

http://commits.kde.org/baloo/434e3ef2500f64eb3ac2a4f656b47724d04d9c6f
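
The commit's diff is not reproduced in this report. As a rough sketch of the idea it describes (skip plain-text extraction when only metadata is wanted), the following may help. The `ExtractionResult` class, its `wantsPlainText()` method, and the `extractDocument` function are hypothetical stand-ins written for illustration; they are not the code in src/file/extractor/result.cpp or app.cpp.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an extraction result; not Baloo's actual class.
class ExtractionResult
{
public:
    explicit ExtractionResult(bool metadataOnly) : m_metadataOnly(metadataOnly) {}

    // Extractors can check this before doing expensive text extraction.
    bool wantsPlainText() const { return !m_metadataOnly; }

    void addProperty(const std::string &key, const std::string &value)
    {
        m_properties[key] = value;
    }

    // In metadata-only mode the text is dropped instead of being stored.
    void appendPlainText(const std::string &text)
    {
        if (m_metadataOnly)
            return;
        m_text += text;
    }

    const std::map<std::string, std::string> &properties() const { return m_properties; }
    const std::string &plainText() const { return m_text; }

private:
    bool m_metadataOnly;
    std::map<std::string, std::string> m_properties;
    std::string m_text;
};

// Hypothetical extractor: the title is cheap to obtain, the per-page text is not.
void extractDocument(const std::string &title,
                     const std::vector<std::string> &pages,
                     ExtractionResult &result)
{
    result.addProperty("title", title);
    if (!result.wantsPlainText())
        return;  // skip the per-page work entirely
    for (const std::string &page : pages)
        result.appendPlainText(page);
}
```

Checking `wantsPlainText()` up front matters more than dropping the text afterwards: the delay reported here came from doing the extraction work at all, not from storing its output.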

Comment 4 Frank Reininghaus 2014-08-10 10:55:34 UTC

*** Bug 338170 has been marked as a duplicate of this bug. ***