Bug 397097 - .okular archive should store the original file
Summary: .okular archive should store the original file
Status: REPORTED
Alias: None
Product: okular
Classification: Applications
Component: PDF backend
Version: 1.3.3
Platform: openSUSE Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-08-02 16:18 UTC by ederag
Modified: 2020-10-17 22:55 UTC
CC List: 3 users

See Also:
Latest Commit:
Version Fixed In:


Description ederag 2018-08-02 16:18:53 UTC
Saving as .okular embeds the annotations in the pdf.
This is unexpected, as the .okular archive usually stores the original file
together with the annotations in a separate metadata.xml.
So the current treatment of pdf is inconsistent with the treatment of png, for instance.

The unfortunate but understandable change described in
https://forum.kde.org/viewtopic.php?t=122750
means that separate annotations will no longer be possible, right?
In this case, a warning should be issued, such as
"Warning: the pdf file that is about to be stored inside the .okular archive will not be the original one; it will be modified to include the annotations"
(with a 'do not warn next time' option)
Comment 1 Albert Astals Cid 2018-08-08 21:56:10 UTC
Why do you want the annotations to be separate?

Why would the user care that in that .okular file the pdf is not the original one? Moreover, do we promise that the .okular file will have the original file at all somewhere?
Comment 2 ederag 2018-08-09 01:32:31 UTC
(In reply to Albert Astals Cid from comment #1)
> Why would the user care that in that .okular file the pdf is not the
> original one?

This will be explained in point 1) below.


> Moreover, do we promise that the .okular file will have the
> original file at all somewhere?

No promises made that I could recall,
but this is the way it has been working for years.
The preservation of the original was a cool feature of okular.
[once understood; I do agree that the beginning was uncomfortable]
Just pointing out a possible surprise,
that could be discovered too late by an unsuspecting user.


> Why do you want the annotations to be separate?

Please note that I do not "want" the annotations to be separate.
This would be cool, but I also understand that developer time is precious.

Sometimes it is good for orthogonality
to have an intermediate structure (here the metadata),
common to all types, so that it is well tested,
and to translate back and forth with the backends.

But here, if I read you correctly (?) elsewhere,
keeping both would really be a burden for pdfs.
So this report was more about not losing the original without a warning,
and maybe having a possibility to save the original too.


Now if the question was about the use case:
1) Articles can sometimes be difficult to obtain,
   so the original is precious.
   A backup is possible, but requires more bookkeeping
   (two locations for each article, for save, rename, deletion).
   Preserving the original in the zip would be a trade-off between
   risk of corruption by the zip process [still have to read about this]
   and ease of handling.
   
   This bookkeeping was also necessary for the infrequent rename and deletion
   in the previous - docdata - versions,
   but it was mitigated by the fact that 
   a) the more frequent download step was straightforward 
      (just one save at the desired location),
   b) disk space was optimized, contrary to the "backup" solution, and
   c) it was possible to open some pdf on a cloud (or any site)
      and annotate it with okular
      without having to upload it back.

2) Separate annotations are easy to search for and to modify "by hand",
   and more importantly here,
   would provide a way to automate the migration to the new scheme:
   take the original, create the metadata.xml from docdata, and zip together.
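
The migration in point 2) could look roughly like the sketch below. This is only an illustration under stated assumptions: the helper names are hypothetical, the docdata XML is assumed to be usable as metadata.xml without conversion, and it is not confirmed here that Okular would open the result without further files inside the archive.

```python
import zipfile
from pathlib import Path


def build_archive(original: Path, docdata_xml: Path, archive: Path) -> None:
    """Zip the untouched original together with its annotations.

    Assumption: the annotations file can be stored verbatim as metadata.xml.
    """
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(original, arcname=original.name)
        zf.write(docdata_xml, arcname="metadata.xml")


def original_is_preserved(original: Path, archive: Path) -> bool:
    """Check that the document inside the archive is byte-identical to the original."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(original.name) == original.read_bytes()
```

The second helper is the kind of check an autotest could use to make sure the stored file has not been modified.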
Comment 3 ederag 2018-08-21 23:54:49 UTC
Another use case:
An email with administrative instructions attached as pdf.
With the previous version, it was possible to highlight the important parts,
now it has to be written back in the mail, which is risky,
and undesirable for official information like this.

All considered, I'll try to downgrade to 1.2.
Comment 4 ederag 2018-08-23 21:47:15 UTC
Done, here is the downgraded package, for openSUSE Leap 15.0:
https://build.opensuse.org/package/show/home:ederag/okular-1.2
Comment 5 Christoph Feck 2018-08-30 22:00:23 UTC
Albert, does comment #2 provide the requested information? Please add a comment or change the bug status.
Comment 6 Albert Astals Cid 2018-08-30 22:52:44 UTC
I guess it does; I'm still unconvinced this is a valid bug myself, though.
Comment 7 Sebastian Guttenberg 2019-09-29 08:58:55 UTC
About the question, whether it is really a bug:

In the documentation of okular 
https://docs.kde.org/stable5/en/kdegraphics/okular/annotations.html
it is written 
"Okular has the "document archiving" feature. This is an Okular-specific format for carrying the document plus various metadata related to it (currently only annotations)". 
And this is not true. Annotations are not stored in the metadata. 

A second point: if the annotations are not stored in the metadata anymore even in the .okular - archive, then what is still the purpose of the .okular archive?
What other metadata is saved there?
As far as I understand, the .okular format was created mainly to keep the document and annotations in the same place, without changing the document.
If the document is changed anyway, then the format seems quite superfluous.
It wouldn't be, however, if it behaved the old (and documented) way. 

I also completely agree with ederag about all the benefits of having the annotations separately. It's a cool thing that one can save them directly into the pdf if one wants, but it is really a pity that it has become mandatory ... 
Unfortunately, in related bug reports it has already been made clear that the feature of separate saving won't come back :(


I agree with everything ederag has written.
Comment 8 ederag 2019-11-09 15:58:38 UTC
The okular developers have done a great job in general,
so the following is just an idea, not prying at all.

Here is a possible design that would clarify saving with annotations,
while bringing back the great external annotations as an option:
- keep two groups of annotations:  internal/external
  (saved inside / saved outside document).
  This is possible since annotations overlap is already accepted.
- global option to direct new annotations to internal or external by default.
  (and create an individual annotation property for that,
   so that it is possible to change an annotation destination at any time.
   This would also be a way to help migration if so desired
   )
- Potential concern:
  When an internal annotation is saved, the document size changes,
  so the external annotations file name, based on size, has to be updated.
  The number of annotation files might increase a lot?
  Not in the standard use cases:
  in my case, a pdf with an existing internal annotation
  is just complemented with external annotations. No pdf file change.
  Opposite use case is no external annotations at all. No problem either.
  For an unlikely hybrid workflow, it should be possible to
  provide a means to choose between
  copying or just moving the external annotation file.
  => not blocking at all, actually.
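
To illustrate the concern above, here is a minimal sketch of re-keying an external annotation file after the document size changes. The naming scheme "<size>.<file name>.xml" and both function names are assumptions made for the example, not a description of the okular internals.

```python
import shutil
from pathlib import Path


def external_annotation_name(document: Path) -> str:
    # Hypothetical naming scheme: "<size in bytes>.<file name>.xml".
    return f"{document.stat().st_size}.{document.name}.xml"


def follow_document(store: Path, old_name: str, document: Path,
                    keep_old: bool = False) -> Path:
    """Re-key the external annotations after the document was saved.

    keep_old=True copies the annotation file, keep_old=False moves it.
    """
    old_path = store / old_name
    new_path = store / external_annotation_name(document)
    if old_path == new_path:
        return new_path  # size unchanged, nothing to do
    if keep_old:
        shutil.copy2(old_path, new_path)
    else:
        old_path.replace(new_path)
    return new_path
```

With keep_old left at False the number of annotation files stays constant, which is the "just moving" choice mentioned above.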

From a user perspective this looks like the best of both worlds,
but of course, that would complicate the code,
and I do not know the internals of okular;
that might even be too difficult to implement and maintain.
Comment 9 lwkiepnr 2020-10-14 15:53:23 UTC
I strongly agree with ederag's and Sebastian Guttenberg's remarks.
Has any consensus been reached regarding this bug?
At least regarding the .okular files?
Comment 10 Albert Astals Cid 2020-10-17 22:53:39 UTC
> A second point: if the annotations are not stored in the metadata anymore even in the .okular - archive, then what is still the purpose of the .okular archive?

Saving the annotations for formats that don't support annotations.
Comment 11 Albert Astals Cid 2020-10-17 22:55:31 UTC
Honestly I don't see the point; you all have very corner-case use cases and want a general-purpose tool to behave the way you would like.

Anyhow, if you can provide a set of patches (together with autotests) that makes sure the pdf file is not modified when saving annotations to a .okular file, and those patches are not very invasive, I guess we could take them.