Bug 423976

Summary: When formatting IDs, non-word characters should be used as word separators
Product: [Applications] KBibTeX
Reporter: nobodyinperson <nobodyinperson>
Component: User interface
Assignee: Thomas Fischer <fischer>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: git (master)   
Target Milestone: ---   
Platform: Manjaro   
OS: Linux   

Description nobodyinperson 2020-07-07 19:17:32 UTC
SUMMARY

When formatting IDs, non-word characters (dashes, colons, etc.) should be treated as word separators. Currently, words are split only at whitespace, so non-word characters end up inside the words used in the ID. This can break autosuggestion in editors and generally doesn't look nice. IDs should be concise and free of special characters, since they are used for in-code referencing. If a user wants special characters between words in an ID, the ID suggestion editor already provides a configuration option for that.

STEPS TO REPRODUCE

1. Use the following example BibTeX entry:

@article{testarticle,
	author = {Doe, John},
	title = {{Long-Term Measurements: A Better Technique}},
	year = {2020}
}


2. Create an ID suggestion like "Alw00|Y|Tlw01" (taken from ~/.config/kbibtexrc), i.e. first author lowercased, 4-digit year, then all title words with small words removed, and set it as default.

3. Format the ID of the entry.

OBSERVED RESULT

The ID is formatted as "doe2020long-termmeasurements:bettertechnique"

EXPECTED RESULT

The ID should be formatted as "doe2020longtermmeasurementsbettertechnique"
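
To make the difference between the observed and the expected result concrete, here is a minimal Python sketch of the two splitting strategies. It is not KBibTeX's code: the helper names `title_part` and `make_id`, the `SMALL_WORDS` set, and the stripping of the outer braces are all assumptions made for illustration only.

```python
import re

# Hypothetical list of "small words"; KBibTeX's real list may differ.
SMALL_WORDS = {"a", "an", "and", "the", "of", "on", "in", "for"}


def title_part(title, separator_pattern):
    """Split the title at separator_pattern, drop small words, lowercase, join."""
    # Assumption: the outer BibTeX braces are removed before splitting.
    words = [w for w in re.split(separator_pattern, title.strip("{}")) if w]
    return "".join(w.lower() for w in words if w.lower() not in SMALL_WORDS)


def make_id(author_last_name, year, title, separator_pattern):
    """Rough stand-in for the pattern: lowercased author, year, title words."""
    return author_last_name.lower() + str(year) + title_part(title, separator_pattern)


TITLE = "{{Long-Term Measurements: A Better Technique}}"

# Splitting only at whitespace keeps "-" and ":" inside the words.
print(make_id("Doe", 2020, TITLE, r"\s+"))
# doe2020long-termmeasurements:bettertechnique

# Splitting at runs of non-word characters removes them.
print(make_id("Doe", 2020, TITLE, r"\W+"))
# doe2020longtermmeasurementsbettertechnique
```

The only thing that changes between the two calls is the separator pattern, which is the behaviour change requested here.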

SOFTWARE/OS VERSIONS

up-to-date Manjaro XFCE 

ADDITIONAL INFORMATION

kbibtex-git built from the AUR with this PKGBUILD fix: https://aur.archlinux.org/packages/kbibtex-git/#comment-754938
Comment 1 Thomas Fischer 2020-07-08 19:16:26 UTC
Patch is under way, see merge request 1 at invent.kde.org:
https://invent.kde.org/office/kbibtex/-/merge_requests/1
Comment 2 Thomas Fischer 2020-07-09 10:04:40 UTC
Git commit 6d6ba2fb63308b4f929a94741ea32d0f066b0925 by Thomas Fischer, on behalf of Yann Büchau.
Committed on 09/07/2020 at 10:04.
Pushed by thomasfischer into branch 'master'.

ID suggestions: separate words correctly, not only by whitespace

- Use \W+ as title/journal word separator instead of only whitespace
- Enable Unicode support for QRegularExpression

M  +6    -6    src/processing/idsuggestions.cpp

https://invent.kde.org/office/kbibtex/commit/6d6ba2fb63308b4f929a94741ea32d0f066b0925
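
As a note on the second bullet of the commit message: without Unicode support, \W typically treats every non-ASCII letter as a separator, which would cut words with umlauts or accents apart. The Python sketch below shows the analogous effect with the `re.ASCII` flag; it is an illustration only, not the code from the commit. The helper name `split_words` is hypothetical, and it is an assumption that the commit enables this via Qt's `QRegularExpression::UseUnicodePropertiesOption`.

```python
import re


def split_words(text, unicode_aware=True):
    """Split text at runs of non-word characters, dropping empty pieces."""
    # re.ASCII restricts \w and \W to ASCII, mimicking a regex engine
    # that has no Unicode support enabled.
    flags = 0 if unicode_aware else re.ASCII
    return [w for w in re.split(r"\W+", text, flags=flags) if w]


print(split_words("Über die Größe"))
# ['Über', 'die', 'Größe']

print(split_words("Über die Größe", unicode_aware=False))
# ['ber', 'die', 'Gr', 'e']
```

With ASCII-only matching, "Ü", "ö" and "ß" count as non-word characters, so the title words are split in the wrong places.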
Comment 3 Thomas Fischer 2020-07-09 12:43:42 UTC
Git commit 60584af6ff3dc623cb9b46d1d41460a939cfad9e by Thomas Fischer, on behalf of Yann Büchau.
Committed on 09/07/2020 at 12:29.
Pushed by thomasfischer into branch 'kbibtex/0.10'.

ID suggestions: separate words correctly, not only by whitespace

- Use \W+ as title/journal word separator instead of only whitespace
- Enable Unicode support for QRegularExpression

M  +6    -6    src/processing/idsuggestions.cpp

https://invent.kde.org/office/kbibtex/commit/60584af6ff3dc623cb9b46d1d41460a939cfad9e