Bug 492852 - Corrupt output file encoding since commit 19ca36b7fa135e5db107d63fe22197519be30441
Status: RESOLVED FIXED
Alias: None
Product: kdiff3
Classification: Applications
Component: application
Version: 1.11.3
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: michael
Reported: 2024-09-09 00:15 UTC by JATothrim
Modified: 2024-09-09 23:26 UTC (History)
Description JATothrim 2024-09-09 00:15:43 UTC
SUMMARY
```
19ca36b7fa135e5db107d63fe22197519be30441 is the first bad commit
commit 19ca36b7fa135e5db107d63fe22197519be30441
Author: Michael Reeves <reeves.87@gmail.com>
Date:   Fri Aug 23 18:51:29 2024 -0400

    Only set BOM for utf-16 or utf-32

 src/EncodedDataStream.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
```

@mreeves 

I bisected to this commit and it is bad: it corrupts the output file encoding on (3-way) merges. After this commit I consider the trustworthiness of kdiff3's output to be pretty much zero. In the worst case, kdiff3 fully auto-merges and the user never sees the silent corruption; in the best case, this is a source of numerous other bugs down the line.

I started the bisect process after testing commit 2209f1ebe56ebe4f3e5be9c447337a618ab5707e, which otherwise worked but corrupts the output file.

I would suggest reverting this commit, as the parent commit c53d6cda1efedbe8b3deb12f8efd077f03cd32d3 has no such issues.

STEPS TO REPRODUCE
1.  (Three-way) merge a file
2.  Save the file at any point in kdiff3

OBSERVED RESULT

Editors show the BOM as enabled, even if the original (unmerged) file was UTF-8 without a BOM.

The entire file is silently modified: it looks mostly "ok" at a glance, but the text encoding is wrong, and comparing before/after with 'diff' shows that the entire file was rewritten. For example, Kate, KDevelop and Geany cannot highlight or parse such a source file correctly.
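
To confirm the BOM observation without relying on an editor, the first bytes of the saved file can be inspected directly. The following is a minimal Python sketch, not part of kdiff3; the helper names `detect_bom` and `file_bom` are made up for illustration.

```python
import codecs

# Known byte-order marks. UTF-32 LE is listed before UTF-16 LE because
# its BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding name of a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def file_bom(path):
    """Hypothetical helper: read the first four bytes of a file and check them."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

Calling `file_bom` on the merge inputs and on the saved output should make it clear whether a BOM appeared only after saving.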

EXPECTED RESULT

Unmodified lines are not touched, and unsaved files are not modified. diff shows only the expected differences between before and after resolving conflicts.
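
One rough way to check this expectation is to count how many lines differ at the byte level between the file before and after saving. This is only a sketch using Python's standard `difflib`; the function name `changed_line_count` is hypothetical, and it is not how kdiff3 or 'diff' work internally.

```python
import difflib


def changed_line_count(before: bytes, after: bytes) -> int:
    """Count lines that differ between two byte strings, compared line by line."""
    a = before.splitlines(keepends=True)
    b = after.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # For a replaced, inserted or deleted block, count the larger side.
            changed += max(i2 - i1, j2 - j1)
    return changed
```

If the count is close to the total number of lines when only a few conflicts were resolved, the file was rewritten rather than merged in place.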
Comment 1 Bug Janitor Service 2024-09-09 17:57:23 UTC
A possibly relevant merge request was started @ https://invent.kde.org/sdk/kdiff3/-/merge_requests/61
Comment 2 JATothrim 2024-09-09 23:26:29 UTC
Git commit 000dd8cdf365098b11e9f733c0313bfb42b7d52a by Jarmo Tiitto.
Committed on 09/09/2024 at 17:46.
Pushed by mreeves into branch 'master'.

Revert "Only set BOM for utf-16 or utf-32"

This reverts commit 19ca36b7fa135e5db107d63fe22197519be30441.

M  +2    -2    src/EncodedDataStream.h

https://invent.kde.org/sdk/kdiff3/-/commit/000dd8cdf365098b11e9f733c0313bfb42b7d52a