Bug 169118 - terminology homogenization on source text and target text based on a glossary file.
Status: RESOLVED FIXED
Alias: None
Product: lokalize
Classification: Applications
Component: general (other bugs)
Version First Reported In: unspecified
Platform: Debian testing Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Simon Depiets
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-14 13:24 UTC by mvillarino
Modified: 2018-09-29 11:09 UTC

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Description mvillarino 2008-08-14 13:24:59 UTC
Version:            (using KDE 4.1.0)
Installed from:    Debian testing/unstable Packages

This is closely related to wishlist 65031 (by a Finnish translator).

The idea is to have a check that, once started, searches each source text (msgid) for words or expressions from the glossary and, for each match, searches the target text (msgstr) for the corresponding glossary translation. It then reports or highlights the words in the source text that have not been properly translated.
Here "properly translated" means that if a source word appears in the glossary, then a translation for it, as listed in the glossary, must appear in the target text.

This is more complicated than it may seem, given that the glossary is supposed to contain some kind of "canonical form" of words and expressions, such as infinitive verbs. So quite probably a pre-pre-parsing of the texts is needed to split out XML tags, shortcuts, and the like; then a pre-parsing, which most likely implies lemmatization of the texts and glossary entries; and then the parsing itself, with word/expression matching.
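
The three stages above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Lokalize code: the glossary contents, the function names, and the suffix-stripping `crude_stem` are all hypothetical, and `crude_stem` merely stands in for real lemmatization. It handles single-word terms only, and the target side is checked with a plain case-insensitive substring test.

```python
import re

# Hypothetical glossary: canonical source term -> accepted target translations.
GLOSSARY = {
    "file": ["ficheiro"],
    "save": ["gardar"],
}

TAG_RE = re.compile(r"<[^>]+>")    # XML/HTML-like markup
ACCEL_RE = re.compile(r"&(?=\w)")  # accelerator marker, as in "&Save"
WORD_RE = re.compile(r"\w+")


def normalize(text):
    """Pre-pre-parsing: drop tags and accelerator markers, then lowercase."""
    text = TAG_RE.sub(" ", text)
    text = ACCEL_RE.sub("", text)
    return text.lower()


def crude_stem(word):
    """Pre-parsing stand-in: strip a few English suffixes (not a real lemmatizer)."""
    for suffix in ("ing", "ed", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def untranslated_terms(msgid, msgstr, glossary):
    """Parsing: return glossary terms found in msgid whose translation is absent from msgstr."""
    source_stems = {crude_stem(w) for w in WORD_RE.findall(normalize(msgid))}
    target = normalize(msgstr)
    missing = []
    for term, translations in sorted(glossary.items()):
        if crude_stem(term) not in source_stems:
            continue  # term does not occur in the source text
        if not any(t.lower() in target for t in translations):
            missing.append(term)
    return missing


print(untranslated_terms("&Save the <b>files</b>", "Gardar os <b>ficheiros</b>", GLOSSARY))
print(untranslated_terms("Saving files", "Abrindo arquivos", GLOSSARY))
```

The first call reports nothing, because both glossary translations occur in the target; the second reports both terms. A real implementation would need language-aware lemmatization on the target side as well, plus matching of multi-word expressions.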
Comment 1 Nick Shaforostoff 2008-09-14 01:42:02 UTC
I bet that the glossary will almost always contain entries that you would like to exclude from this kind of search.

I'm going to implement saving TM searches (search options: shell-like expressions for source and target, inversion bools for source and target independently, filemasks) and the capability of running all of them, or a selection, at once.
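
A saved search with these options might look something like the sketch below. The `SavedSearch` class, its field names, and `run_searches` are hypothetical and only illustrate the idea; the shell-like expressions are approximated with Python's `fnmatch` wildcards, and matching is made case-insensitive by lowercasing both sides.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class SavedSearch:
    """Hypothetical record for one saved TM search."""
    source_pattern: str = "*"
    target_pattern: str = "*"
    invert_source: bool = False
    invert_target: bool = False
    filemask: str = "*"

    def matches(self, filename, msgid, msgstr):
        if not fnmatchcase(filename, self.filemask):
            return False
        src_hit = fnmatchcase(msgid.lower(), self.source_pattern.lower())
        tgt_hit = fnmatchcase(msgstr.lower(), self.target_pattern.lower())
        # An inversion flag flips the meaning of the corresponding pattern.
        return (src_hit != self.invert_source) and (tgt_hit != self.invert_target)


def run_searches(searches, entries):
    """Run every search over (filename, msgid, msgstr) entries; return the hits."""
    return [(s, e) for s in searches for e in entries if s.matches(*e)]


# Source mentions "file" but the target lacks "ficheiro": a terminology suspect.
check = SavedSearch("*file*", "*ficheiro*", invert_target=True, filemask="*.po")
entries = [
    ("kate.po", "Open file", "Abrir arquivo"),
    ("kate.po", "Open file", "Abrir ficheiro"),
]
for search, entry in run_searches([check], entries):
    print(entry)
```

Inverting the target pattern is what turns an ordinary search into a QA check: it selects entries where the source matches but the expected translation is missing.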

Then I'll add a feature to generate a list of searches based on a glossary (including falseFriends to-be-added-to-lokalize-tbx-editor), which can then be edited. As always, it will be shareable, and I'll encourage putting such QA lists into the lang dirs in the SVN repository.
Comment 2 Nick Shaforostoff 2008-09-14 01:49:15 UTC
Ah, and the search list generation stage would of course use the Snowball stemmer: http://snowball.tartarus.org/
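
As a rough illustration of generating such a search list from a glossary: the function and dictionary keys below are hypothetical, and `naive_stem` is only a placeholder (it truncates words to four characters) where a real Snowball stemmer for each language would be plugged in.

```python
def naive_stem(word):
    """Placeholder only: truncate to 4 characters. Swap in a Snowball stemmer."""
    word = word.lower()
    return word[:4] if len(word) > 4 else word


def searches_from_glossary(glossary, stem_source=naive_stem, stem_target=naive_stem):
    """Build one editable search per glossary entry (term -> translation)."""
    searches = []
    for term, translation in sorted(glossary.items()):
        searches.append({
            "source": "*" + stem_source(term) + "*",
            "target": "*" + stem_target(translation) + "*",
            "invert_source": False,
            "invert_target": True,  # hit when the expected translation is absent
        })
    return searches


for search in searches_from_glossary({"window": "xanela", "file": "ficheiro"}):
    print(search)
```

Because the result is plain data, entries that should be excluded from the check can simply be deleted from the generated list before it is shared.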
Comment 3 Adrián Chaves (Gallaecio) 2018-05-27 11:58:37 UTC
I feel that this would be better done through further integration of Pology’s check-rules with Lokalize.
Comment 4 Simon Depiets 2018-09-29 11:09:25 UTC
Resolved by Pology integration:
https://phabricator.kde.org/D15759