Bug 479070 - Okular hangs when editing Chinese characters in inline note annotation
Summary: Okular hangs when editing Chinese characters in inline note annotation
Status: RESOLVED UPSTREAM
Alias: None
Product: okular
Classification: Applications
Component: general (show other bugs)
Version: 23.08.4
Platform: Arch Linux Linux
: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-27 09:22 UTC by Keyu Tao
Modified: 2023-12-29 16:32 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
thread apply all bt full on okular with the "你好:" reproducer (24.03 KB, text/plain)
2023-12-27 09:22 UTC, Keyu Tao
Details
fc-list (27.45 KB, text/plain)
2023-12-28 11:32 UTC, Keyu Tao
Details
fc-list inside KDE Neon LiveCD (45.14 KB, text/plain)
2023-12-28 11:53 UTC, Keyu Tao
Details
thread apply all bt full from gcore dump (+ poppler debug symbols) (38.92 KB, text/plain)
2023-12-28 13:30 UTC, Keyu Tao
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Keyu Tao 2023-12-27 09:22:17 UTC
Created attachment 164470 [details]
thread apply all bt full on okular with the "你好:" reproducer

SUMMARY
When typing in with input methods (ibus or fcitx), or pasting Chinese characters and editing inside inline note, Okular hangs.


STEPS TO REPRODUCE
1. Open a PDF file, and show more annotation tools.
2. Choose "Inline Note", click on any place
3. Copy "你好:" to the note, move cursor and delete any character
4. Or, if you could type with input methods, type in any Chinese characters with methods like Pinyin, and type in ":" (aka, Shift + : with Chinese mode on)

OBSERVED RESULT

Okular hangs.

EXPECTED RESULT

Inline note being edited without any issue.

SOFTWARE/OS VERSIONS
Windows: N/A
macOS: N/A
Linux/KDE Plasma: Kernel 6.6.6-arch1-1
(available in About System)
KDE Plasma Version: N/A (I'm using GNOME 45)
KDE Frameworks Version: 5.113.0
Qt Version: 5.15.11

ADDITIONAL INFORMATION

This issue seems do not exist in Okular Windows version distributed by Windows store.
Comment 1 Keyu Tao 2023-12-27 09:27:58 UTC
* just tested under Wayland (QT_QPA_PLATFORM=wayland) and X (xcb) and this issue exists in both platforms.
Comment 2 Albert Astals Cid 2023-12-27 10:23:58 UTC
Are you sure it's hung or just doing work?

How much time have you waited before declaring it hung?
Comment 3 Keyu Tao 2023-12-27 10:51:47 UTC
(In reply to Albert Astals Cid from comment #2)
> Are you sure it's hung or just doing work?
> 
> How much time have you waited before declaring it hung?

3 minutes, with Okular process running with 100% CPU and 11G RAM.
Comment 4 Keyu Tao 2023-12-27 10:52:27 UTC
(In reply to Keyu Tao from comment #3)
> (In reply to Albert Astals Cid from comment #2)
> > Are you sure it's hung or just doing work?
> > 
> > How much time have you waited before declaring it hung?
> 
> 3 minutes, with Okular process running with 100% CPU and 11G RAM.

* The backtrace is captured just after okular is not responding, though.
Comment 5 Albert Astals Cid 2023-12-28 09:54:37 UTC
3 minutes is obviously a lot, it works here in like 1 second.

Does this problem happen in every file or in one in particular?
Comment 6 Keyu Tao 2023-12-28 10:27:29 UTC
(In reply to Albert Astals Cid from comment #5)
> 3 minutes is obviously a lot, it works here in like 1 second.
> 
> Does this problem happen in every file or in one in particular?

I can reproduce this on all PDF files I could find. I'll try using a KDE Neon VM to see if I could reproduce after I finish my dinner (as I only tested this in GNOME environment)...
Comment 7 Albert Astals Cid 2023-12-28 10:58:04 UTC
This may be related to the fonts you have installed.

Can you paste the output of fc-list ?
Comment 8 Keyu Tao 2023-12-28 11:32:19 UTC
Created attachment 164505 [details]
fc-list
Comment 9 Keyu Tao 2023-12-28 11:32:47 UTC
(In reply to Albert Astals Cid from comment #7)
> This may be related to the fonts you have installed.
> 
> Can you paste the output of fc-list ?

The fc-list stdout has been added to attachments list.
Comment 10 Keyu Tao 2023-12-28 11:49:56 UTC
(In reply to Keyu Tao from comment #6)
> (In reply to Albert Astals Cid from comment #5)
> > 3 minutes is obviously a lot, it works here in like 1 second.
> > 
> > Does this problem happen in every file or in one in particular?
> 
> I can reproduce this on all PDF files I could find. I'll try using a KDE
> Neon VM to see if I could reproduce after I finish my dinner (as I only
> tested this in GNOME environment)...

I could reproduce this with latest neon-developer-20231225-1605.iso livecd (KDE 5.91.90 with Okular 24.01.85, KDE Frameworks 5.248.0 and Qt 6.6.1).

Screen record and fc-list inside KDE Neon LiveCD would be attached later.
Comment 11 Keyu Tao 2023-12-28 11:53:16 UTC
Created attachment 164507 [details]
fc-list inside KDE Neon LiveCD
Comment 12 Keyu Tao 2023-12-28 11:55:55 UTC
(In reply to Keyu Tao from comment #10)
> (In reply to Keyu Tao from comment #6)
> > (In reply to Albert Astals Cid from comment #5)
> > > 3 minutes is obviously a lot, it works here in like 1 second.
> > > 
> > > Does this problem happen in every file or in one in particular?
> > 
> > I can reproduce this on all PDF files I could find. I'll try using a KDE
> > Neon VM to see if I could reproduce after I finish my dinner (as I only
> > tested this in GNOME environment)...
> 
> I could reproduce this with latest neon-developer-20231225-1605.iso livecd
> (KDE 5.91.90 with Okular 24.01.85, KDE Frameworks 5.248.0 and Qt 6.6.1).
> 
> Screen record and fc-list inside KDE Neon LiveCD would be attached later.

The screencast is too large to put here (more than 4M), 
and I didn't find a better place to store this so I just uploaded it to YouTube: https://www.youtube.com/watch?v=LUhQHRLHOnw.
Comment 13 Albert Astals Cid 2023-12-28 13:05:42 UTC
Still can't reproduce :/

Any change you could try install more debugging symbols so we get the place where it's potentially looping?

It seems it's here

#1  0x00007f2cbf52ba1c in  () at /usr/lib/libpoppler.so.133
#2  0x00007f2cbf50f5ba in  () at /usr/lib/libpoppler.so.133
#3  0x00007f2cbf513a22 in AnnotFreeText::generateFreeTextAppearance() () at /usr/lib/libpoppler.so.133

But not having function names nor line numbers makes it impossible to figure out what my be wrong.
Comment 14 Keyu Tao 2023-12-28 13:30:17 UTC
Created attachment 164516 [details]
thread apply all bt full from gcore dump (+ poppler debug symbols)
Comment 15 Albert Astals Cid 2023-12-28 17:26:26 UTC
Ok, i can reproduce, it happens if i draw a very small square for the inline annotation
Comment 16 Albert Astals Cid 2023-12-28 17:37:29 UTC
Not an okular bug, but a poppler bug.

https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/1482
Comment 17 Keyu Tao 2023-12-28 18:49:42 UTC
(In reply to Albert Astals Cid from comment #15)
> Ok, i can reproduce, it happens if i draw a very small square for the inline
> annotation

Oops, I didn't find the correct way to use inline note... I was just clicking instead of dragging a rectangle out...
Comment 18 Keyu Tao 2023-12-29 05:10:56 UTC
(In reply to Albert Astals Cid from comment #16)
> Not an okular bug, but a poppler bug.
> 
> https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/1482

https://gitlab.freedesktop.org/poppler/poppler/-/commit/8a9f1d3a84a9f3d66cd353b7fe1aef0b65a37c08 does not seem to have this fixed.
I compiled the latest master branch poppler locally, LD_PRELOADed when running okular, 
and I could still easily reproduce this bug.

Going to try debugging after lunch.
Comment 19 Keyu Tao 2023-12-29 05:52:51 UTC
(In reply to Keyu Tao from comment #18)
> (In reply to Albert Astals Cid from comment #16)
> > Not an okular bug, but a poppler bug.
> > 
> > https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/1482
> 
> https://gitlab.freedesktop.org/poppler/poppler/-/commit/
> 8a9f1d3a84a9f3d66cd353b7fe1aef0b65a37c08 does not seem to have this fixed.
> I compiled the latest master branch poppler locally, LD_PRELOADed when
> running okular, 
> and I could still easily reproduce this bug.
> 
> Going to try debugging after lunch.

Very weird: even when Okular is responding, I find that one thread is already in dead loop inside `drawMultiLineText()` (when it does not have enough space and you type in non-ASCII chars): 

In this case `textLayouter.consumedText` is 2 and `text.hasUnicodeMarker()` is true, thus it keeps `i += 0` and it could never leave the `while (i < text.getLength())`.
Comment 20 Keyu Tao 2023-12-29 06:59:03 UTC
(In reply to Keyu Tao from comment #19)
> (In reply to Keyu Tao from comment #18)
> > (In reply to Albert Astals Cid from comment #16)
> > > Not an okular bug, but a poppler bug.
> > > 
> > > https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/1482
> > 
> > https://gitlab.freedesktop.org/poppler/poppler/-/commit/
> > 8a9f1d3a84a9f3d66cd353b7fe1aef0b65a37c08 does not seem to have this fixed.
> > I compiled the latest master branch poppler locally, LD_PRELOADed when
> > running okular, 
> > and I could still easily reproduce this bug.
> > 
> > Going to try debugging after lunch.
> 
> Very weird: even when Okular is responding, I find that one thread is
> already in dead loop inside `drawMultiLineText()` (when it does not have
> enough space and you type in non-ASCII chars): 
> 
> In this case `textLayouter.consumedText` is 2 and `text.hasUnicodeMarker()`
> is true, thus it keeps `i += 0` and it could never leave the `while (i <
> text.getLength())`.

OK I have figured it out:

- `Annot::layoutText` first `*i = 2;` to skip UTF-16 BOM, then `*i += 2` in while as it is unicode (UTF-16), and then as it meets new font needed, it `*i -= unicode ? 2 : 1;`, thus finally `*i = 2`.
- In `HorizontalTextLayouter` constructor, `availableWidth` is a negative number (as it is too small), so `consumedText` would just be `2` -- consumes nothing but BOM.
- `drawMultiLineText` does not expect textLayouter eating no characters, thus resulting a dead loop.
Comment 21 Albert Astals Cid 2023-12-29 11:56:47 UTC
> In `HorizontalTextLayouter` constructor, `availableWidth` is a negative number

Ok, i can not reproduce HorizontalTextLayouter having negative  availableWidth.

How much do you know about programming? If i tell you do add some printf in the poppler code is that something you can do?

Also just to make sure, the problem doesn't happen if you draw a big rectangle, right?
Comment 22 Keyu Tao 2023-12-29 12:10:01 UTC
(In reply to Albert Astals Cid from comment #21)

> How much do you know about programming? If i tell you do add some printf in
> the poppler code is that something you can do?

Yes, as I got this conclusion just by adding `std::cout` inside poppler code :)

> Also just to make sure, the problem doesn't happen if you draw a big
> rectangle, right?

Yes.
Comment 23 Albert Astals Cid 2023-12-29 12:17:46 UTC
(In reply to Keyu Tao from comment #22)
> (In reply to Albert Astals Cid from comment #21)
> 
> > How much do you know about programming? If i tell you do add some printf in
> > the poppler code is that something you can do?
> 
> Yes, as I got this conclusion just by adding `std::cout` inside poppler code
> :)

Ok, can you please print
  rect->x2 
  rect->x1
  width
  borderWidth
  textwidth
  da.getFontPtSize()
before the call to
  const DrawMultiLineTextResult textCommands = drawMultiLineText(*contents, textwidth, form, *font, da.getFontName().getName(), da.getFontPtSize(), quadding, 0 /*borderWidth*/);
in
  generateFreeTextAppearance()


Also i'm guessing 

diff --git a/poppler/Annot.cc b/poppler/Annot.cc
index b4c4a771..8a50225f 100644
--- a/poppler/Annot.cc
+++ b/poppler/Annot.cc
@@ -3036,7 +3036,7 @@ public:
             *availableWidth -= blockWidth;
         }
 
-        while (newFontNeeded && (!availableWidth || *availableWidth > 0)) {
+        while (newFontNeeded && (!availableWidth || *availableWidth > 0|| (isUnicode && i == 2) || (!isUnicode && i == 0))) {
             if (!form) {
                 // There's no fonts to look for, so just skip the characters
                 i += isUnicode ? 2 : 1;


Will fix your problem too but i'd like to figure out how availableWidth gets to be negative
Comment 24 Keyu Tao 2023-12-29 12:30:29 UTC
(In reply to Albert Astals Cid from comment #23)
> (In reply to Keyu Tao from comment #22)
> > (In reply to Albert Astals Cid from comment #21)
> > 
> > > How much do you know about programming? If i tell you do add some printf in
> > > the poppler code is that something you can do?
> > 
> > Yes, as I got this conclusion just by adding `std::cout` inside poppler code
> > :)
> 
> Ok, can you please print
>   rect->x2 
>   rect->x1
>   width
>   borderWidth
>   textwidth
>   da.getFontPtSize()
> before the call to
>   const DrawMultiLineTextResult textCommands = drawMultiLineText(*contents,
> textwidth, form, *font, da.getFontName().getName(), da.getFontPtSize(),
> quadding, 0 /*borderWidth*/);
> in
>   generateFreeTextAppearance()
> 
> 
> Also i'm guessing 
> 
> diff --git a/poppler/Annot.cc b/poppler/Annot.cc
> index b4c4a771..8a50225f 100644
> --- a/poppler/Annot.cc
> +++ b/poppler/Annot.cc
> @@ -3036,7 +3036,7 @@ public:
>              *availableWidth -= blockWidth;
>          }
>  
> -        while (newFontNeeded && (!availableWidth || *availableWidth > 0)) {
> +        while (newFontNeeded && (!availableWidth || *availableWidth > 0||
> (isUnicode && i == 2) || (!isUnicode && i == 0))) {
>              if (!form) {
>                  // There's no fonts to look for, so just skip the characters
>                  i += isUnicode ? 2 : 1;
> 
> 
> Will fix your problem too but i'd like to figure out how availableWidth gets
> to be negative

Var outputs:

===
rect->x2: 25.7253
rect->x1: 23.9537
width: 1.77165
borderWidth: 1
textWidth: -2.22835
da.getFontPtSize(): 10
===

And this patch is working: I could not reproduce the hang with the patched while loop.
Comment 25 Albert Astals Cid 2023-12-29 16:32:35 UTC
Right either make it very small or have a wider border and it'll happen.

https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/1483