Bug 155938

Summary: support for MS Word documents
Product: [Applications] okular
Reporter: Dario Panico <dariopnc>
Component: New backend wishes
Assignee: Okular developers <okular-devel>
Status: RESOLVED FIXED
Severity: wishlist
CC: aacid, angel_blue_co2004, ankit93100, geroxp, giecrilj, hanswchen, kde-2011.08, luigi.toscano, silopolis
Priority: VLO
Version: unspecified
Target Milestone: ---
Platform: unspecified
OS: Linux
Latest Commit:
Version Fixed In:
Attachments: attachment-16949-0.html

Description Dario Panico 2008-01-16 19:48:55 UTC
Version:           0.6.80 (using 4.00.80 (KDE 4.0.80 >= 20080104), compiled sources)
Compiler:          gcc
OS:                Linux (x86_64) release 2.6.22-14-generic

I think I have installed all the libraries for okular, but I'm still unable to open *.txt and *.doc files. I believe these features are not implemented in okular, but they could be really useful in making okular a complete file viewer.
Comment 1 Pino Toscano 2008-01-16 20:36:15 UTC
I don't agree to adding support for plain text files: we have really good text editors that do the job perfectly, and using Okular to just read a simple text file is like opening a full-blown audio editing suite (eg Audacity) just to play a sound.

For *.doc, it is very low priority.
Comment 2 Aldoo 2008-01-17 13:42:52 UTC
@#1: the parallel isn't quite right, as okular is not an editor but only a viewer.
Using okular to display .pdf, .ps, and why not also .txt and .doc files, is no more overkill than using kaffeine to play .avi, .mkv, .wav or .mp3 files.
Another question: when you click on a .wav file, do you usually want to just play it, or edit it?
Of course, the answer depends on the use case, but I think one can ask the same question for txt files, for which opening a text editor is not always the best answer.

Furthermore, Okular could offer options for pretty-printing text files, which should look quite different from the way text files are presented for editing.
One way to do that would be to use the a2ps tool (or whatever library it uses), which transforms many file formats into nice PostScript, and then to display the PostScript in Okular.
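
The a2ps idea can be sketched as a small shell wrapper. This is only an illustration, assuming a2ps and okular are both on the PATH; the function names ps_name_for and view_text are made up for this example and are not part of either tool.

```shell
#!/bin/bash
# Hypothetical helper: derive the PostScript file name for a text file,
# e.g. notes.txt -> notes.ps. The directory part is dropped, so the
# output lands in the current directory.
ps_name_for() {
    base=$(basename -- "$1")
    printf '%s.ps\n' "${base%.*}"
}

# Hypothetical helper: pretty-print a text file to PostScript with a2ps
# (-o selects the output file) and open the result in Okular.
view_text() {
    a2ps -o "$(ps_name_for "$1")" "$1" && okular "$(ps_name_for "$1")"
}
```

Calling view_text notes.txt would write notes.ps into the current directory and then open it; Okular is only started if a2ps succeeds.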
Comment 3 Dotan Cohen 2008-06-21 16:29:20 UTC
In my opinion Okular should open text files for viewing. There are cases where opening a text file in an editor is not comfortable. For instance, e-books saved as plain text are very difficult to read in a text editor as the cursor location must be taken into account when scrolling up or down. Accidental keystrokes are annoying as well. In Okular, one could open the file and browse it comfortably.
Comment 4 Roman I Khimov 2008-08-01 21:52:51 UTC
Seems like there are two wishes, not one. :)

IMO, plain text would be really nice to handle and not that hard to do.

While MS Word... Well, nice to have in theory, but I don't think it's realistic until there is some common library that handles .doc files well enough...
Comment 5 Brad Hards 2008-08-02 01:33:37 UTC
Roman,

Would you like to give the plain text generator a try? Then we could see how intrusive it is, and consider whether it is appropriate for okular?
Comment 6 David Dempster 2008-08-09 03:13:58 UTC
Hmm, this may be a step in the wrong direction.  Word .doc files are drafts rather than final documents.  They can look very different on different systems.  If we're going to support .doc, we should support .odt, .abw and .tex too.  In fact, I'd prefer to see those supported rather than a closed format.

<goes to do some tests>

Okular supports .odt!  And as you might expect, not very well.  My CV's layout is totally messed up by Okular.  I don't think you can expect a document viewer to deal with draft formats properly.  Just open it in OpenOffice.org, and expect to have to change fonts etc to make it work on your system.

As for plain text, well, you can't really go wrong with it.  It may be unnecessary, but it can do no harm.  I don't imagine the .txt backend will generate many bug reports.
Comment 7 Dotan Cohen 2008-08-09 11:20:33 UTC
> Word .doc files are drafts rather than final
> documents. They can look very different on
> different systems.

While it is true that Word documents look different on different versions of MS Office and service packs, different Windows OSs, different screen resolutions, and even different colour depths, the differences are very minor and are caused by programming errors in MSO; they are not intentional. However, that does not make .doc files 'drafts'. In fact, in most environments in which I have shared documents (work, university, and even FWD: emails), .doc is used to share the final copy. Not because the format is ideal, but rather because it is a common format in the MS world.

I also do not support 'de facto' standards; however, that is an argument for not making .doc the native format of any program / organization. It is NOT an argument for not having a document viewer support the format. Be conservative in what you produce, and liberal in what you accept, no?

I would assume (I cannot check at the moment, sorry) that Okular's .doc rendering would be based upon KOffice's rendering. If so, as KOffice's rendering is poor, so then would be Okular's. Open Office has a very accurate .doc rendering algorithm as of version 2.4.

Can the OOo renderer be used as a plugin to Okular?
Comment 8 Brad Hards 2008-08-10 01:50:42 UTC
On Saturday 09 August 2008 07:20:35 pm Dotan Cohen wrote:
> I would assume (I cannot check at the moment, sorry) that Okular's .doc
> rendering would be based upon KOffice's rendering. If so, as KOffice's
> rendering is poor, so then would be Okular's. Open Office has a very
> accurate .doc rendering algorithm as of version 2.4.

I don't think Okular has any .doc rendering. What makes you think it does?

> Can the OOo renderer be used as a plugin to Okular?

Probably, if someone was willing to do the (fairly significant) amount of 
coding work. Given the relatively small number of Okular developers, I'd 
personally prefer to see that work put into other formats, but that isn't 
quite how free software development works.

Presumably there will be .doc support if someone codes it.

Brad
Comment 9 Dotan Cohen 2008-08-10 14:56:07 UTC
> I don't think Okular has any .doc rendering. What
> makes you think it does? 

Comment #6, where I mistook .odt support for .doc. Sorry.

> Probably, if someone was willing to do the (fairly
> significant) amount of coding work.

Open Office can convert a .doc to PDF from the command line (no need to open GUI component). The resulting PDF can then be rendered by Okular. This should be relatively simple to code, and will offload the .doc rendering to a project that has already invested many resources in implementing .doc, and continues to do so. In fact, by this method, Okular could render anything that Open Office could render. See this page for more information on OOo supported formats and converting between them via OOo:
http://dag.wieers.com/home-made/unoconv/
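
A rough sketch of that approach as a shell wrapper follows. It assumes unoconv writes the PDF next to the input file; the helper names needs_conversion and view_any are hypothetical, and the extension list is purely illustrative rather than a statement of what OOo supports.

```shell
#!/bin/bash
# Hypothetical helper: decide whether a file should be converted to PDF
# before viewing. The extension list is an illustrative assumption.
needs_conversion() {
    case "$1" in
        *.doc|*.DOC|*.rtf|*.RTF) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: convert with unoconv when needed, then open the
# result in Okular; anything else is handed to Okular unchanged.
view_any() {
    if needs_conversion "$1"; then
        unoconv -f pdf "$1" && okular "${1%.*}.pdf"
    else
        okular "$1"
    fi
}
```

Keeping the decision in its own function means the list of converted formats can grow without touching the viewing logic.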
Comment 10 Brad Hards 2008-08-10 22:58:17 UTC
> This should be relatively simple to code,

Do you have a patch?
Comment 11 Dotan Cohen 2008-08-11 21:14:59 UTC
> Do you have a patch?

No, I am not a programmer. I can cobble together a pipe on the command line, though, which would presumably (untested) feed Okular a PDF generated by OOo from a .doc file:
$ unoconv -f pdf --stdout file.doc | okular -

I see that unoconv (http://dag.wieers.com/home-made/unoconv/) can use OOo to convert a .doc file to PDF. I see that Okular can display PDF files. As an engineer, I sometimes mistake "has many little gotchas" for "should be relatively simple". I believe that only mathematicians are freed from making that error, however.
Comment 12 gmud 2009-01-16 13:45:18 UTC
Just in case somebody is interested in a temporary solution to this problem, here is a script based on the unoconv program:

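#!/bin/bash
# Start an OpenOffice.org listener in the background and remember its PID.
unoconv --listener &
listener_pid=$!
kdialog --title "Please wait..." --passivepopup "Please wait for the conversion to finish..." 20
# Give the listener time to start before converting.
sleep 20
# Convert the document; the PDF is written next to the original file.
unoconv -f pdf "$1"
# Stop the listener, then open the resulting PDF.
kill -15 "$listener_pid"
okular "${1%.*}.pdf" &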

Comment 13 Angel Blue01 2009-03-03 23:11:37 UTC
While I understand the difference between a viewer and an editor, Okular should support MS Office formats as well as OpenXML formats (at least these are an ISO standard), even if, as someone else suggested, through a KOffice filter.
Comment 14 Dotan Cohen 2009-04-24 19:52:58 UTC
> If we're going to support .doc, we should
> support .odt, .abw and .tex too.  In fact,
> I'd prefer to see those supported rather
> than a closed format.

As a document viewer, it would make sense for Okular to support those file types. File a feature request and I will vote for it.

It seems that the objection to .doc support is that it is a closed format. I do not think that anti-MS ideology should drive KDE development; rather, KDE should strive to make the best software possible, even if that means supporting evil closed formats that have already been reverse engineered by KDE's own KOffice team.
Comment 15 Brad Hards 2009-04-25 05:40:28 UTC
I don't think it is anti-MS ideology, and okular does include some Microsoft-specific formats (e.g. XPS, and microsoft tags for TIFF).

Also, note that most of the Microsoft formats are at least partly documented (freely downloadable on MSDN). 

I think the reason why it hasn't been done is that the amount of effort required to view a format that is viewable by other tools is disproportionate to the return. It just doesn't seem worth it.

This is not to say that such a generator wouldn't be acceptable (i.e. this is a valid wishlist item), just that I don't see anyone volunteering to write such a thing (i.e. it isn't likely to get implemented).
Comment 16 Dotan Cohen 2009-04-25 11:57:31 UTC
> I think the reason why it hasn't been done is that the
> amount of effort required to view a format that is viewable
> by other tools is disproportionate to the return. It just
> doesn't seem worth it.

I can name at least five Linux PDF viewers that existed before Okular; in fact, two of them were KDE apps. So the "done before" argument is not valid, either.

It is a simple fact that in many organizations (businesses, NGOs, universities, etc.) the standard document format in use is the Word format. I don't like it either, but I need my document viewer to read this popular document format.
Comment 17 Christopher Yeleighton 2011-01-03 16:31:48 UTC
*** Bug 261930 has been marked as a duplicate of this bug. ***
Comment 18 Christopher Yeleighton 2011-01-03 16:36:37 UTC
(In reply to comment #1)
> I don't agree to adding support for plain text files: we have really good text
> editors that do the job perfectly, 

No we do not (Bug 250617).

> and using Okular to just read a simple text
> file is like opening a full-blown audio editing suite (eg Audacity) just to
> play a sound.

Just the opposite, as editing text is much more complex than reading it.
Comment 19 Hans Chen 2011-12-08 16:00:12 UTC
I don't know how it works "under the hood", so please forgive my ignorance, but would it be possible to use the Calligra Engine[1] to add support for .doc, .docx and .odf in Okular?

[1] http://blogs.kde.org/node/4512
Comment 20 Luigi Toscano 2022-07-24 13:47:19 UTC
For a long while, Calligra has provided two Okular generators which allow Okular to render the text documents and presentation formats supported by the suite. I guess this request can be closed?
Comment 21 silopolis 2022-07-24 14:19:24 UTC
Created attachment 150869 [details]
attachment-16949-0.html

--- Comment #20 from Luigi Toscano <luigi.toscano@tiscali.it> ---
> For a long while, Calligra has provided two Okular generators which allow
> Okular to render the text documents and presentation formats supported by
> the suite. I guess this request can be closed?

Does this mean simply installing Calligra suite brings these generators to
Okular?

If so, could some packaging adjustments make it possible to install those with
just the required dependencies, without having to pull the whole suite in?
Comment 22 Luigi Toscano 2022-07-24 14:41:30 UTC
(In reply to silopolis from comment #21)
> 
> Does this mean simply installing Calligra suite brings these generators to
> Okular?
> 
> If so, could some packaging adjustments allow to install those with just
> the required dependency without having to pull the whole suite in?

This is a distribution problem, or at most a Calligra issue, but not an issue in the source code of Okular. I'd say it's not much of a Calligra issue either, given that most of the size required comes from calligra-libs already.
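
To check which generators a given installation actually provides, something along these lines may help. It is only a sketch: it assumes generator plugins are shared objects named okularGenerator_<name>.so, the directory to search varies by distribution, and list_generators is a made-up helper name.

```shell
#!/bin/bash
# Hypothetical helper: list Okular generator plugins found under a
# directory, printing only the <name> part of okularGenerator_<name>.so.
# The file-name pattern is an assumption; adjust it for your distribution.
list_generators() {
    find "$1" -name 'okularGenerator_*.so' -exec basename {} .so \; 2>/dev/null \
        | sed 's/^okularGenerator_//' | sort
}
```

For example, list_generators /usr/lib prints one generator name per line; comparing the output before and after installing the Calligra packages shows which generators they add.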
Comment 23 Albert Astals Cid 2022-07-25 20:25:37 UTC
Agree with Luigi