Bug 407127

Summary: po2xml produces broken XML in case: <para> text <itemizedlist/> text <xref/> </para>
Product: [Websites] docs.kde.org
Reporter: Eric Bischoff <bischoff>
Component: ksgmltools
Assignee: Documentation Editorial Team <kde-doc-english>
Status: REPORTED ---
Severity: normal
CC: bischoff
Priority: NOR
Version: unspecified   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Partially working patch
reproducer: full XML file
reproducer: translations
broken result

Description Eric Bischoff 2019-05-01 14:38:28 UTC
Created attachment 119767 [details]
Partially working patch

SUMMARY


STEPS TO REPRODUCE

1. Add the following text to some DocBook file:

   <para>
     This is a nice list:<itemizedlist>
      <listitem><para>One</para></listitem>
      <listitem><para>Two</para></listitem>
      <listitem><para>Three</para></listitem>
    </itemizedlist>that ends up here: <xref linkend="somewhere"/></para>

2. Extract the messages with xml2pot.
3. Translate the English messages into your language; the result is a po file.
4. Regenerate the XML file with po2xml.

OBSERVED RESULT

Invalid XML file, with the end of the paragraph remaining in English.

EXPECTED RESULT

Properly translated XML file.

SOFTWARE/OS VERSIONS

Linux/KDE Plasma: Kubuntu 19.04 disco dingo
Qt: 5.12.2
KDE Frameworks: 5.56.0
kf5-config: 1.0


ADDITIONAL INFORMATION

The problem lies in poxml's parser.cpp lines 507 and 598. With that XML text, start_line and end_line are identical for variables msg1 and msg2. The blocks: 

   "This is a nice list:"

and

   "<listitem><para>One</para></listitem>
      <listitem><para>Two</para></listitem>
      <listitem><para>Three</para></listitem>
    </itemizedlist>that ends up here: <xref linkend="somewhere"/>"

(variables msg1 and msg2) are at the same level inside the containing <para>. It seems that this case has been overlooked.

I have a solution, but it only works if the boundary between msg1 and msg2 is on the first line (see attachment). The problem I am unable to resolve is how to compute the line and column of that boundary (i.e. strindex characters after the beginning of msg1).
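
One possible way to compute that boundary is to walk the text of msg1 character by character, bumping the line counter at each newline. The following is only a minimal sketch, not code from poxml: the Position struct and the advance() function are hypothetical names, std::string stands in for whatever string type parser.cpp actually uses, and it assumes that the column restarts at 0 after a newline and that one character occupies one column. The conventions would need to be checked against the ones the parser reports.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type, not part of poxml.
struct Position {
    int line;
    int column;
};

// Returns the position reached after walking over the first `offset`
// characters of `text`, where `start` is the position of text[0].
// Assumption: the column restarts at 0 after each newline.
Position advance(const std::string &text, std::size_t offset, Position start)
{
    Position pos = start;
    for (std::size_t i = 0; i < offset && i < text.size(); ++i) {
        if (text[i] == '\n') {
            ++pos.line;       // boundary moves to the next line
            pos.column = 0;   // and back to the first column
        } else {
            ++pos.column;
        }
    }
    return pos;
}
```

With the start position of msg1 and strindex as the offset, the result would serve both as the end of msg1 and as the start of msg2, whichever line the boundary falls on.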
Comment 1 Eric Bischoff 2019-05-01 14:43:56 UTC
poxml version: 4:18.12.3-0ubuntu1, preferably with my latest patches applied (see Phabricator).

Adding full files to reproduce the problem.
Comment 2 Eric Bischoff 2019-05-01 14:45:15 UTC
Created attachment 119768 [details]
reproducer: full XML file
Comment 3 Eric Bischoff 2019-05-01 14:45:49 UTC
Created attachment 119769 [details]
reproducer: translations
Comment 4 Eric Bischoff 2019-05-01 14:46:15 UTC
Created attachment 119770 [details]
broken result