Bug 493709 - Alternate (previous) translation doesn’t handle diff of messages with msgctxt properly
Summary: Alternate (previous) translation doesn’t handle diff of messages with msgctxt...
Status: RESOLVED FIXED
Alias: None
Product: lokalize
Classification: Applications
Component: general (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Simon Depiets
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-09-26 19:30 UTC by Karl Ove Hufthammer
Modified: 2024-10-04 18:50 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In: 24.08.2
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Karl Ove Hufthammer 2024-09-26 19:30:24 UTC
SUMMARY
In the diff view showing the previous msgid (Alternate Translations pane), the msgctxt is merged with the msgid, resulting in a wrong diff. The first word in each string is shown to belong to the msgctxt, resulting in very long/confusing diffs.

This is easiest to show with an example:

#, fuzzy
#| msgctxt "@windowtitle"
#| msgid "that was a test"
msgctxt "@windowtitle"
msgid "this is a test"
msgstr "dette er ein test"

The diff view in the alternative translation pane says that the change is

@windowtitlethat was → @windowtitlethis is

But the correct diff is

that is → this is

The reason for the bug is the following lines in alttransview.cpp:

    prevMsgId.replace(QStringLiteral("\n"), QString());
    currentMsgId.replace(QStringLiteral("\n"), QString());

But in each of the two variables, the original string is stored as:

msgctxt + "\n" + msgid

And the string replacement removes all "\n" characters. The explanation is:

            // Messages have arbitrary word wrapping, which should
            // not affect the diff. So we remove any word wrapping
            // newlines. (Note that this does not remove manual \n
            // characters used in the messages.)

This makes sense, but unfortunately, it also removes the separator between the msgctxt and the msgid.

Now this is easy to fix for the currentMsgId variable. Just do the "\n" replacement *before* prefixing the msgctxt, not after. E.g., change:

QString currentMsgId = contextWithNewline + source.string;

to just

QString currentMsgId = source.string;

Then do the "\n" replacement on currentMsgId, and finally create a:

QString currentMsg = contextWithNewline + currentMsgId;

(And use currentMsg instead of currentMsgId in the userVisibleWordDiff() call below.)

But this doesn’t work for prevMsgId, which is defined as:

QString prevMsgId = entry.source.string;

It *already* includes the msgctxt (followed by "\n" followed by the msgid). The entry is an AltTrans object, and the entry.source is a CatalogString object. But I can’t seem to figure out where the code is that defines the content of the entry.source.string or how to instead extract the msgctxt and the msgid separately.

So I’m adding this bug report here. Hopefully someone more familiar with C++ / the Lokalize code base can figure this out.
Comment 1 Karl Ove Hufthammer 2024-10-04 16:44:35 UTC
Git commit 2eeec9cbbabf7f3e6a31d5b20a68104f39460aaa by Karl Ove Hufthammer, on behalf of Finley Watson.
Committed on 04/10/2024 at 16:37.
Pushed by huftis into branch 'master'.

Separate translation entry previous msgctxt into context string in AltTrans struct

AltTrans now has a "context" string containing the `#| msgctxt ` data
so the ATM entry diffs can be generated more correctly.

M  +49   -27   src/alttransview.cpp
M  +4    -2    src/catalog/alttrans.h
M  +47   -27   src/catalog/gettext/gettextstorage.cpp
M  +1    -1    src/catalog/ts/tsstorage.cpp
M  +1    -0    src/catalog/xliff/xliffstorage.cpp
M  +33   -37   src/common/diff.cpp

https://invent.kde.org/sdk/lokalize/-/commit/2eeec9cbbabf7f3e6a31d5b20a68104f39460aaa
Comment 2 Karl Ove Hufthammer 2024-10-04 18:50:58 UTC
Git commit 2647c6d49d32cec80075267efa519d8ab996d861 by Karl Ove Hufthammer, on behalf of Finley Watson.
Committed on 04/10/2024 at 18:50.
Pushed by huftis into branch 'release/24.08'.

Separate translation entry previous msgctxt into context string in AltTrans struct

AltTrans now has a "context" string containing the `#| msgctxt ` data
so the ATM entry diffs can be generated more correctly.
FIXED-IN: 24.08.2

(cherry picked from commit 2eeec9cbbabf7f3e6a31d5b20a68104f39460aaa)

M  +49   -38   src/alttransview.cpp
M  +4    -2    src/catalog/alttrans.h
M  +47   -27   src/catalog/gettext/gettextstorage.cpp
M  +1    -1    src/catalog/ts/tsstorage.cpp
M  +1    -0    src/catalog/xliff/xliffstorage.cpp
M  +33   -37   src/common/diff.cpp

https://invent.kde.org/sdk/lokalize/-/commit/2647c6d49d32cec80075267efa519d8ab996d861