Bug 499471 - Form editing: File bloat and data leakage
Status: CONFIRMED
Alias: None
Product: okular
Classification: Applications
Component: PDF backend
Version: 22.12.3
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
Reported: 2025-02-03 17:35 UTC by quazgar
Modified: 2025-02-19 08:12 UTC

Attachments
A PDF document with a form, only default content, created with LibreOffice. (22.81 KB, application/pdf)
2025-02-03 17:35 UTC, quazgar
pdf with form filled out (25.73 KB, application/pdf)
2025-02-04 18:09 UTC, quazgar

Description quazgar 2025-02-03 17:35:21 UTC
Created attachment 177933
A PDF document with a form, only default content, created with LibreOffice.

SUMMARY

Saving PDF files with filled-in forms (a) increases the file size, potentially by a lot, and (b) leaks editing history.

When a PDF file with a filled-in form is saved, every single keystroke made in the form is stored. For example, when I enter the text "test1234", the file contains PDF text commands for all of: "t", "te", "tes", ..., "test123", "test1234".

Especially for longer texts, this can significantly increase file size, memory and CPU usage, and can even make some documents unprintable (if the printer cannot handle these commands).
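
To put a rough number on the bloat, here is a minimal sketch in Python. It assumes that one appearance stream is written per keystroke and that each stream holds about 79 bytes of fixed operators plus the text typed so far. That figure is taken from the /Length values (84 to 87 bytes for "test1" to "test1234") in the grep output further down this report, so it applies to this one form only. The function names are made up for illustration.

```python
# Rough estimate of the bytes added by per-keystroke appearance streams.
# Assumption: one stream per keystroke, each with a fixed operator overhead
# plus the text typed so far. The overhead is derived from the /Length
# values reported for this specific form; it is not a general constant.
FIXED_OVERHEAD = 79  # bytes, e.g. /Length 84 for the 5-character "test1"


def stream_length(text_length: int) -> int:
    """Estimated /Length of the stream holding a value of text_length characters."""
    return FIXED_OVERHEAD + text_length


def total_stream_bytes(final_length: int) -> int:
    """Estimated bytes for all intermediate streams up to the final value."""
    return sum(stream_length(k) for k in range(1, final_length + 1))


if __name__ == "__main__":
    for n in (8, 100, 1000):
        print(n, total_stream_bytes(n))
```

Under these assumptions, an 8-character value costs 668 bytes of stream data and a 1000-character value about 580 kB, before counting the object dictionaries around each stream. The growth is quadratic in the length of the text.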

Comment 1 quazgar 2025-02-03 17:38:38 UTC
[Sorry, adding the attachment filed the bug report.]

STEPS TO REPRODUCE
1. Open the attached simple document with Okular.
2. Enter some text in the form
3. Save.
4. Open with a text editor, or run `strings` on it:

$ strings ~/Desktop/simple_form.filled_okular.pdf | grep '(Tes'
(Tes) Tj
(Test) Tj
(Test1) Tj
(Test12) Tj
(Test123) Tj
(Test1234) Tj
(Test1234) Tj
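
The same check can be scripted. The following Python sketch is a heuristic, not a PDF parser: it assumes uncompressed content streams and literal strings without escaped parentheses, which holds for the file above but not for PDFs in general. The names `shown_strings` and `leaked_prefixes` are hypothetical.

```python
import re
import sys
from typing import List

# Matches a literal-string text-show operator such as "(Test12) Tj".
# Simplification: escaped parentheses, hex strings and compressed
# streams are not handled.
TJ_PATTERN = re.compile(rb"\(([^()\\]*)\)\s*Tj")


def shown_strings(pdf_bytes: bytes) -> List[bytes]:
    """Return every string passed to a Tj operator, in file order."""
    return TJ_PATTERN.findall(pdf_bytes)


def leaked_prefixes(strings: List[bytes]) -> List[bytes]:
    """Return strings that are strict prefixes of a string appearing later."""
    leaked = []
    for i, s in enumerate(strings):
        if s and any(t != s and t.startswith(s) for t in strings[i + 1:]):
            leaked.append(s)
    return leaked


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as handle:
        data = handle.read()
    for prefix in leaked_prefixes(shown_strings(data)):
        print(prefix.decode("latin-1"))
```

Run it with the path of the saved PDF as the only argument. Any output means that an earlier, shorter version of some text is still present in the file. Unrelated strings that happen to share a prefix are reported too, so treat the result as a hint.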


OBSERVED RESULT

The result contains all intermediate text versions.

EXPECTED RESULT

Only the last version should be retained.

SOFTWARE/OS VERSIONS
Operating System: Debian GNU/Linux 12
KDE Plasma Version: 5.27.5
KDE Frameworks Version: 5.103.0
Qt Version: 5.15.8
Kernel Version: 6.1.0-30-amd64 (64-bit)
Graphics Platform: offscreen
Processors: 12 × AMD Ryzen 5 3600 6-Core Processor
Memory: 15.5 GiB of RAM
Graphics Processor: AMD Radeon RX 6400

ADDITIONAL INFORMATION
Comment 2 quazgar 2025-02-03 17:39:35 UTC
Possibly related?
- #477153
- #470446
- #452260 (unlikely)
Comment 3 Luigi Toscano 2025-02-03 20:34:04 UTC
The version of Okular mentioned here is obsolete. Please try a newer one, as several fixes were applied in newer releases of both Okular and poppler, the library it uses for PDF. You can try the flatpak or snap packages if your distribution doesn't provide a newer package.

None of the bugs mentioned are related.
Comment 4 quazgar 2025-02-04 18:08:47 UTC
(In reply to Luigi Toscano from comment #3)
> as several fixes were applied in newer releases of both Okular and the
> library used for PDF, poppler. You can try flatpak

I just confirmed this with the 24.12.1 flatpak version. The output is no different with that version:

$ grep -a '(test1.*)' -C 5 simple_form.filled_okular.flatpak_24.12.1.pdf
<</Length 84 /Subtype /Form /BBox [0 0 101.602 66.448 ] /Resources <</Font 9 0 R >> >> stream
/Tx BMC
q
BT
0 0 0 rg /He 8 Tf 1 0 0 1 0 63.45 Tm 2.00 -8.00 Td
(test1) Tj
ET
Q
EMC

endstream
--
<</Length 85 /Subtype /Form /BBox [0 0 101.602 66.448 ] /Resources <</Font 9 0 R >> >> stream
/Tx BMC
q
BT
0 0 0 rg /He 8 Tf 1 0 0 1 0 63.45 Tm 2.00 -8.00 Td
(test12) Tj
ET
Q
EMC

endstream
--
<</Length 86 /Subtype /Form /BBox [0 0 101.602 66.448 ] /Resources <</Font 9 0 R >> >> stream
/Tx BMC
q
BT
0 0 0 rg /He 8 Tf 1 0 0 1 0 63.45 Tm 2.00 -8.00 Td
(test123) Tj
ET
Q
EMC

endstream
--
<</Length 87 /Subtype /Form /BBox [0 0 101.602 66.448 ] /Resources <</Font 9 0 R >> >> stream
/Tx BMC
q
BT
0 0 0 rg /He 8 Tf 1 0 0 1 0 63.45 Tm 2.00 -8.00 Td
(test1234) Tj
ET
Q
EMC

endstream
--
<</Length 87 /Subtype /Form /BBox [0 0 101.602 66.448 ] /Resources <</Font 9 0 R >> >> stream
/Tx BMC
q
BT
0 0 0 rg /He 8 Tf 1 0 0 1 0 63.45 Tm 2.00 -8.00 Td
(test1234) Tj
ET
Q
EMC

endstream

I will attach the file for reference.
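
For anyone who wants to count these leftover streams, here is a minimal Python sketch. It relies on two assumptions: the streams are stored uncompressed, as in the output above, and text-field content is wrapped in the `/Tx BMC` ... `EMC` marker seen there. The function names are hypothetical, and the regular expressions are a shortcut rather than a real PDF parser.

```python
import re
import sys
from collections import Counter

# Body of an uncompressed stream; the lookbehind skips the "stream" inside "endstream".
STREAM_BODY = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)
# Literal string shown with Tj; escaped parentheses are not handled.
TJ_TEXT = re.compile(rb"\(([^()\\]*)\)\s*Tj")


def text_field_streams(pdf_bytes):
    """Return (text, body_size) for each stream marked as text-field content."""
    result = []
    for match in STREAM_BODY.finditer(pdf_bytes):
        body = match.group(1)
        if b"/Tx BMC" not in body:
            continue
        texts = TJ_TEXT.findall(body)
        # body_size includes the end-of-line before "endstream",
        # so it can be slightly larger than the declared /Length.
        result.append((b" ".join(texts), len(body)))
    return result


def summarize(pdf_bytes):
    """Count text-field streams, distinct values and the bytes they occupy."""
    streams = text_field_streams(pdf_bytes)
    counts = Counter(text for text, _ in streams)
    return {
        "streams": len(streams),
        "distinct_values": len(counts),
        "bytes": sum(size for _, size in streams),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as handle:
        print(summarize(handle.read()))
```

A form with a single text field would be expected to need only one such stream, so a much larger `streams` count points at intermediate versions that were not cleaned up.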
Comment 5 quazgar 2025-02-04 18:09:51 UTC
Created attachment 177965
pdf with form filled out

PDF filled out and saved with Flatpak Okular 24.12.1.
Comment 6 Bug Janitor Service 2025-02-19 03:46:44 UTC
🐛🧹 ⚠️ This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information, then set the bug status to REPORTED. If there is no change for at least 30 days, it will be automatically closed as RESOLVED WORKSFORME.

For more information about our bug triaging procedures, please read https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging.

Thank you for helping us make KDE software even better for everyone!
Comment 7 Sune Vuorela 2025-02-19 08:09:43 UTC
The bug exists and is easily reproducible, but it is a bug in poppler.

I'm unsure whether the bug is in the add-to-form code or in the clean-up-old-entries code. We should probably close it here and open a new one in poppler.
Comment 8 Sune Vuorela 2025-02-19 08:12:49 UTC
(In reply to quazgar from comment #2)
> Possibly related?
> - #477153
> - #470446
> - #452260 (unlikely)

None of these are related.