Summary: | When searching for words in msgstr there should be an option to ignore some characters | ||
---|---|---|---|
Product: | [Unmaintained] kbabel | Reporter: | A Al-Arfaj <aalarfaj> |
Component: | general | Assignee: | Stanislav Visnovsky <visnovsky> |
Status: | RESOLVED UNMAINTAINED | ||
Severity: | wishlist | CC: | cfeck, sanderkoning |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Mandrake RPMs | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: |
Description
A Al-Arfaj
2004-04-15 11:06:51 UTC
I might add that what I mean by inflections is composing characters. Arabic uses a lot of them which are not really essential (but they're nice when used). One example of this is the "fatha" (0x064E). Here is an example in UTF-8: صور صوَر Here these two are the same word, but the second one is more decorated, using the "fatha" on the middle character. It would be nice if the search tool would just ignore this composing character and still consider them the same word. You don't need to hardwire which characters should be ignored. It would be nice if the search tool would give us an open option where we could specify which composing characters to ignore. Thank you.

This could be very handy for other purposes as well. I've already had numerous occasions where I could have had a correct automatic translation, if there had not been an extra ':' at the end of one of the two items. An option to ignore the colon (in this case) would make life a lot easier for those messages where an exact translation does exist, apart from an extra (or missing) punctuation character.

Why not just use the regex feature? From Wikipedia: 'For example, the set containing the three strings "Handel", "Händel", and "Haendel" can be described by the pattern H(ä|ae?)ndel'

Regarding Comment #2: kaider's batch translation implementation fuzzy-translates such strings (those lacking or having an additional punctuation symbol). They get a 99% score.

Shaforostoff, can you please suggest a regex to ignore diacritics so that:
Input: ض Matches: ضَ ضُ ضٌ ضْ ضِ ضٍ ض
Input: a Matches: a à á â ã ä å
Input: u Matches: ù ú û ü ũ ū ŭ ů ű ų
Input: r Matches: r ŕ ŗ ř
I am not sure if this is possible with regex, but I know it won't be easy for a normal user to figure it out. In all MS applications, the Find and Replace dialog boxes have an option to ignore diacritics. It's like ignoring case: even if it is possible with regex, an easy way to access it is badly needed.
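For illustration only, here is a minimal Python sketch of the kind of matching requested above. It is not how KBabel or Lokalize search works. The function names `fold` and `loose_contains` and the `ignored` parameter are hypothetical. The sketch assumes that decomposing both strings to Unicode NFD and dropping combining marks (or a user-supplied set of characters) before comparing is an acceptable definition of "ignore".

```python
import unicodedata


def fold(text, ignored=None):
    """Return text with ignorable characters removed.

    The text is first decomposed (NFD) so that precomposed letters such as
    "ä" expose their combining mark. With ignored=None every character with
    a non-zero canonical combining class is dropped; otherwise only the
    characters listed in `ignored` are dropped.
    """
    decomposed = unicodedata.normalize("NFD", text)
    if ignored is None:
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(ch for ch in decomposed if ch not in ignored)


def loose_contains(needle, haystack, ignored=None):
    """True if needle occurs in haystack once both have been folded."""
    return fold(needle, ignored) in fold(haystack, ignored)


if __name__ == "__main__":
    plain = "\u0635\u0648\u0631"            # the undecorated word from the report
    decorated = "\u0635\u0648\u064e\u0631"  # same word with a fatha (U+064E)
    print(loose_contains(plain, decorated))                        # True
    print(fold("H\u00e4ndel"))                                     # Handel
    print(loose_contains("Open File", "Open File:", ignored=":"))  # True
```

Passing an explicit `ignored` string covers the "open option" asked for in the report: the user decides which composing or punctuation characters to skip, and nothing is hardwired. One caveat of the default mode is that it removes every combining mark, which may be too aggressive for scripts where such marks carry essential meaning.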
KBabel is no longer maintained. Please use the KDE 4 translator's tool called "Lokalize" instead. For more information, please visit http://userbase.kde.org/Lokalize

If this is a request for a feature which is also missing in Lokalize, please add a comment so that I can reassign the request to the Lokalize authors. You could also file a new request for Lokalize.