Bug 406652

Summary: Spell checker and pop-up menu have different ideas of what a word is
Product: [Applications] lokalize Reporter: Alexander Shpilkin <ashpilkin>
Component: editor Assignee: Simon Depiets <sdepiets>
Status: REPORTED ---    
Severity: normal CC: shafff
Priority: NOR    
Version First Reported In: 18.12.3   
Target Milestone: ---   
Platform: Arch Linux   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description Alexander Shpilkin 2019-04-18 12:35:33 UTC
SUMMARY
In an en-US text, the spell checker correctly treats an em dash (U+2014) without any spaces around it as a word boundary, but the selection region that appears when right-clicking on a misspelled word does not.

STEPS TO REPRODUCE
1. Open an XLIFF file with a target language of en-US.
2. Input "Karamzin—are" (or any other sequence of <non-dictionary word>, U+2014, <dictionary word>, without any spaces in between) into the translation box.
3. Observe the red underline under the first part and right-click on it.

OBSERVED RESULT
A menu with possible corrections pops up. Both words and the em dash between them are selected.

EXPECTED RESULT
A menu with possible corrections pops up. Only the first word, up to the em dash, is selected.

SOFTWARE/OS VERSIONS
Linux kernel: 5.0.5-arch1-1-ARCH
KDE Frameworks Version: 5.57.0-1
Qt Version: 5.12.2-1

ADDITIONAL INFORMATION
An obvious approach here would be to select whatever has a red underline, but it seems to me that a generic word boundary routine of some sort is invoked instead, so there's no guarantee that it has the same idea of what a word is as the spell checker. (Fixing that routine to recognize em and en dashes as word boundaries would be nice, but the proper solution _here_ would seem to be to use the region provided by the spell checker.)
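
To illustrate the suspected mismatch, here is a minimal Python sketch. It is not Lokalize, Sonnet, or Qt code: both patterns and the span_at helper are hypothetical stand-ins, and the whitespace-only rule is just one assumed way a generic routine could end up selecting across the dash.

```python
import re

# Hypothetical stand-in for the spell checker's tokenizer: runs of letters
# only, so U+2014 (em dash) and U+2013 (en dash) end a word.
CHECKER_WORD = re.compile(r"[^\W\d_]+")

# Hypothetical stand-in for a generic selection routine that breaks only on
# whitespace (an assumption; the real routine may differ).
GENERIC_WORD = re.compile(r"\S+")


def span_at(pattern, text, pos):
    """Return (start, end) of the match of `pattern` that contains index `pos`, or None."""
    for match in pattern.finditer(text):
        if match.start() <= pos < match.end():
            return match.span()
    return None


text = "Karamzin\u2014are"
click = 3  # a position inside "Karamzin"

generic = span_at(GENERIC_WORD, text, click)  # covers both words and the dash
checker = span_at(CHECKER_WORD, text, click)  # covers only the underlined word

print(generic, repr(text[generic[0]:generic[1]]))
print(checker, repr(text[checker[0]:checker[1]]))
```

With the two rules side by side, selecting the checker's span for the clicked position gives the expected result regardless of what the generic routine considers a word.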