Bug 302004

Summary: Okular does not find any results when searching in DVI documents
Product: [Applications] okular Reporter: Laurent Claessens <moky.math>
Component: DVI backend Assignee: Okular developers <okular-devel>
Status: RESOLVED FIXED    
Severity: normal CC: aacid, luigi.toscano, mamun.nightcrawler, manisandro, rdieter
Priority: NOR Keywords: regression
Version: 0.14.3   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Latest Commit: Version Fixed In: 4.9.4
Attachments: dvi file that present the bug.
The source file of the DVI
dvi file that present the bug.

Description Laurent Claessens 2012-06-16 10:04:34 UTC
Searching for text (Ctrl-F) does not find any results.

This must be fairly recent, because I had no problem with Ubuntu Oneiric; the problem appeared with Precise Pangolin.

Reproducible: Always

Steps to Reproduce:
1. Open a DVI file
2. Search for a word with Ctrl-F
3. Wait for the search to finish
Actual Results:  
Okular does not find the word

Expected Results:  
Okular should find the searched word

The DVI is generated by LaTeX.
When using the PDF file generated with pdflatex, searching works without problems.
Comment 1 Albert Astals Cid 2012-06-16 15:34:45 UTC
Please attach a dvi file with such a problem.
Comment 2 Laurent Claessens 2012-06-25 06:28:23 UTC
Created attachment 72107 [details]
dvi file that present the bug.

Strange: when I search for "my", Okular finds the word. Other words don't work :(
For example, a search for "is" gives no result.
Comment 3 Laurent Claessens 2012-06-25 06:29:09 UTC
Created attachment 72108 [details]
The source file of the DVI
Comment 4 Laurent Claessens 2012-06-25 06:33:24 UTC
Created attachment 72109 [details]
dvi file that present the bug.

This is the same as the first one, except that this time I correctly selected "auto-detect" for the file type.
Comment 5 Albert Astals Cid 2012-06-30 14:46:29 UTC
For future reference for people trying to fix the bug, it seems there are mysterious spaces between the letters, so searching for "t e s t  fi l e" works (no space between f and i because it's a ligature, I guess).

More work is needed to see whether this is a regression introduced recently (maybe by the new text column selection code), whether it has been broken forever, or whether the information given back by the DVI file is simply broken.
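
A minimal sketch of why a plain substring search would miss words when a space ends up between every letter. This is an illustration only, not Okular code: `joinGlyphs` and `containsText` are hypothetical helpers, and the assumption is simply that the page text is built by concatenating one string per glyph box.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: build the searchable text from per-glyph strings,
// placing `separator` between consecutive glyphs.
std::string joinGlyphs(const std::vector<std::string>& glyphs,
                       const std::string& separator)
{
    std::string text;
    for (std::size_t i = 0; i < glyphs.size(); ++i) {
        if (i > 0) {
            text += separator;
        }
        text += glyphs[i];
    }
    return text;
}

// Hypothetical helper: plain substring search, as a stand-in for "Find".
bool containsText(const std::string& haystack, const std::string& needle)
{
    return haystack.find(needle) != std::string::npos;
}
```

With the glyphs `t`, `e`, `s`, `t` joined by a single space, `containsText` fails for `"test"` but succeeds for `"t e s t"`, which matches the behaviour described above. Joining with an empty separator makes the normal query work again.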
Comment 6 Sandro Mani 2012-07-20 23:57:49 UTC
For me, search is completely broken with any PDF file. Specifically, as soon as I type anything (even a single letter) in the "Find" field, the cursor changes to "busy" and stays that way until I close the document. No occurrences are ever marked.
Using okular-4.8.97-1.fc18.x86_64. This is a regression compared to 4.8.x stable.
Comment 7 Rex Dieter 2012-07-22 13:19:49 UTC
Odd, searching PDFs with okular-4.8.97-1.fc17.x86_64 seems OK for me. I think this bug is DVI-specific.
Comment 8 Albert Astals Cid 2012-07-22 13:21:27 UTC
@Sandro: This bug is DVI specific, please *DO NOT HIJACK* bugs
Comment 9 Luigi Toscano 2012-07-22 13:45:11 UTC
(In reply to comment #5)
> For future reference for people trying to fix the bug, it seems there are
> mysterious spaces between the letters, so searching for "t e s t  fi l e"
> works (no space between f and i because it's a ligature, I guess).

Also, if you copy the entire text of the example file, the letters in the last line are shown before the ones in the first lines.
The extraction of the text from DVI did not change since the last version. I also tried the file with Okular 0.10.5 (KDE SC 4.4.5) and it works.

DVI gives one box for each letter, so, as you wrote... 

> Needs some more work to see if this is a regression introduced recently
> (maybe by the new text column selection code) or has been broken forever or
> is it just that the information given back by the dvi file is broken.
... I'd say the former (regression).
Comment 10 Luigi Toscano 2012-11-27 00:04:38 UTC
Git commit 91e46331fd7901705a69323c75de84e2467416dd by Luigi Toscano.
Committed on 27/11/2012 at 00:59.
Pushed by ltoscano into branch 'KDE/4.9'.

Fix word detection for DVI documents

This patch attempts to restore the functionality broken by some changes
(maybe the text column selection code; it was also broken in 4.7).

Text search and text selection work (almost) properly again.

It uses a bit of heuristics to identify the end of a word and merge the
boxes which enclose each character of a word (so that
char_x.right==char_{x+1}.left).
It also tries to recognize if there is a newline ("after_space") after
a space is found.
REVIEW: 107429
FIXED-IN: 4.9.4

M  +58   -6    generators/dvi/dviRenderer_draw.cpp

http://commits.kde.org/okular/91e46331fd7901705a69323c75de84e2467416dd
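
The box-merging heuristic described in the commit message (join character boxes when `char_x.right == char_{x+1}.left`) could be sketched roughly as follows. This is not the code from `dviRenderer_draw.cpp`: `CharBox`, `WordBox`, `mergeIntoWords`, and the `tolerance` parameter are all hypothetical names introduced for illustration, and the baseline comparison is an assumed stand-in for the newline detection the commit mentions.

```cpp
#include <cmath>
#include <string>
#include <vector>

// Hypothetical stand-in for one glyph box reported by the DVI renderer.
struct CharBox {
    std::string text;  // glyph text; a ligature such as "fi" may hold two letters
    double left;       // left edge of the box
    double right;      // right edge of the box
    double baseline;   // vertical position of the text line
};

// Hypothetical merged box covering one whole word.
struct WordBox {
    std::string text;
    double left;
    double right;
    double baseline;
};

// Merge consecutive glyph boxes into words: a glyph joins the current word
// when it sits on the same baseline and its left edge meets the word's right
// edge (within `tolerance`, an assumption of this sketch). Anything else,
// such as a horizontal gap or a new line, starts a new word.
std::vector<WordBox> mergeIntoWords(const std::vector<CharBox>& chars,
                                    double tolerance = 0.5)
{
    std::vector<WordBox> words;
    for (const CharBox& c : chars) {
        const bool touchesPrevious =
            !words.empty()
            && std::fabs(words.back().baseline - c.baseline) <= tolerance
            && std::fabs(words.back().right - c.left) <= tolerance;
        if (touchesPrevious) {
            words.back().text += c.text;
            words.back().right = c.right;
        } else {
            words.push_back({c.text, c.left, c.right, c.baseline});
        }
    }
    return words;
}
```

Given boxes for `m`, `y`, a gap, `fi`, `l`, `e`, and then `i`, `s` on a lower baseline, this yields the three words "my", "file", and "is", each with a single enclosing box that search and selection can work on.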