Bug 491011 - Language autodetection after switching to the Mozhi
Summary: Language autodetection after switching to the Mozhi
Status: RESOLVED UPSTREAM
Alias: None
Product: Crow Translate
Classification: Applications
Component: general
Version: unspecified
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Gena
URL:
Keywords: usability
Depends on:
Blocks:
 
Reported: 2024-07-30 07:01 UTC by Serge Roussak
Modified: 2024-09-12 07:57 UTC
CC List: 1 user

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Screenshot with error (27.77 KB, image/png)
2024-09-09 07:19 UTC, Serge Roussak

Description Serge Roussak 2024-07-30 07:01:30 UTC
SUMMARY

In general, I would personally like to be able to choose whether to use the new engine or not. But, as usual, someone made this choice for me... And, of course, in the name of my security... But it's not 100% clear whether Mozhi (by the way, the name sounds symbolic in Russian: "brain" ;) ) in turn collects what I send through it to Google or any other translator.

The second issue is that the new approach works noticeably slower.

Though, that is all "lyrics" (a digression)...

But the main regression is that the new feature doesn't detect the source language automatically for translator engines where this detection worked fine in the pre-Mozhi era, e.g. the Yandex translator. Furthermore, it doesn't work for the Google translator if I try to translate text that contains quotation marks. I'm not sure, but there are probably other special characters that prevent source language autodetection.

Finally, IMHO, switching the main engine of the software is a significant enough reason to change its version. Instead, I had to search for the commit that introduced this new feature in order to roll back to a version that doesn't cause a storm of emotions. :)

STEPS TO REPRODUCE
1. Install the latest Crow Translate version (Mozhi-driven).
2. Enable source language autodetection.
3. Select text containing quotation mark(s).
4. Try to translate it with Crow Translate.

OBSERVED RESULT
An error message saying that source language detection can't be performed. Even for the Google translator.

EXPECTED RESULT
Language autodetection should work.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Arch
KDE Plasma Version: 6.1.2
KDE Frameworks Version: 6.3.0
Qt Version: 6.7.2
Comment 1 Gena 2024-07-30 08:21:54 UTC
> But it's not 100% clear whether Mozhi (by the way, the name sounds symbolic in Russian: "brain" ;) ) in turn collects what I send through it to Google or any other translator.

The name is pronounced "moli" :) But Mozhi is much more secure. Instead of your data being sent to an engine from your IP, the engine sees only the IP of the instance. Think of it as a proxy. You can even host your own instance if you worry about the instance being compromised.

> The second issue is that the new approach works noticeably slower.

Try choosing an instance that is close to you. For me, the speed depends on the instance.

But the main reasons for the migration were maintainability, security and support for more engines. Online engines often break, and there is literally zero documentation on how to use their APIs for free. We usually reverse-engineer the API by inspecting what the browser does with the web version. We decided it was easier to delegate this to the Mozhi team.

> Yandex translator. 

That's unfortunate, yes. It should be fixed on the Mozhi side: https://codeberg.org/aryak/libmozhi/issues/3
I just don't know Go, so I can't contribute.
Regarding autodetection in Google, I can't reproduce it. It works just fine for me.

> Finally, IMHO, switching the main engine of the software is a significant enough reason to change its version.

It will be 3.0, I just haven't drafted a new release yet.
Ah, you probably installed it from https://flathub.org/apps/org.kde.CrowTranslate which is based on an unpublished version. I submitted the Flatpak manifest to prepare for the upcoming release, and I didn't expect it to be published. I will contact the team about it.
Comment 2 Serge Roussak 2024-09-09 07:19:21 UTC
Created attachment 173470
Screenshot with error

See the picture.

Hint: I suppose you use JSON or something like that and don't escape the quotation marks.
Comment 3 Gena 2024-09-09 08:27:54 UTC
The text is escaped properly; the issue is in response parsing.
Could you send me the output of the translation using the CLI with `--json`?
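
For readers following the escaping question, here is a minimal sketch of what "escaped properly" means for text containing quotation marks. It is illustrative only and is not taken from the Crow Translate or Mozhi code; the field and parameter name `text` is a hypothetical placeholder. It shows that both JSON escaping and URL percent-encoding round-trip the sample sentence unchanged.

```python
import json
from urllib.parse import urlencode, parse_qs

# Sample sentence with quotation marks, as used later in this report.
text = 'This is an example of the "quoted" text translation'

# JSON escaping: each quotation mark becomes \" inside the string literal.
# "text" is a hypothetical field name, not a documented API field.
payload = json.dumps({"text": text})
assert json.loads(payload)["text"] == text  # round-trips unchanged

# URL (percent) encoding: each quotation mark becomes %22 in a query string.
# "text" is again a hypothetical parameter name.
query = urlencode({"text": text})
assert parse_qs(query)["text"] == [text]  # round-trips unchanged

print(payload)
print(query)
```

If both round trips succeed, as they do here, the quotation marks themselves are not lost on the sending side, so any failure would have to occur further along the chain.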
Comment 4 Serge Roussak 2024-09-09 08:35:53 UTC
$ crow --json 'This is an example of the "quoted" text translation'
"Error: Error: Unable to parse autodetected language"
{
    "detected": "",
    "engine": "google",
    "source_antonyms": null,
    "source_equivalent_target_lang": {
    },
    "source_language": "auto",
    "source_synonyms": null,
    "source_transliteration": "",
    "target_antonyms": null,
    "target_equivalent_source_lang": {
    },
    "target_language": "en",
    "target_synonyms": null,
    "target_transliteration": "",
    "translated-text": "",
    "word_choices": null
}
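
The output above can be checked mechanically. The following is a rough sketch, assuming the `--json` output has the shape shown here: an optional error line followed by a single JSON object. The helper names `parse_crow_output` and `autodetection_failed` are hypothetical, and the sample is abbreviated to the fields that matter for the check.

```python
import json

def parse_crow_output(output: str) -> dict:
    """Return the first JSON object in the output, skipping any leading error line."""
    start = output.find("{")
    if start == -1:
        raise ValueError("no JSON object found in output")
    # raw_decode tolerates trailing text after the object.
    obj, _ = json.JSONDecoder().raw_decode(output, start)
    return obj

def autodetection_failed(result: dict) -> bool:
    """True when autodetection was requested but no language was detected."""
    return result.get("source_language") == "auto" and not result.get("detected")

# Abbreviated copy of the output from this comment.
sample = '''"Error: Error: Unable to parse autodetected language"
{
    "detected": "",
    "engine": "google",
    "source_language": "auto",
    "target_language": "en",
    "translated-text": ""
}'''

result = parse_crow_output(sample)
print(result["engine"], autodetection_failed(result))
```

For this sample the object parses without error, while `detected` is empty even though `source_language` is `auto`, which is the symptom described in this report.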
Comment 5 Gena 2024-09-09 09:11:44 UTC
Ah, I see, so the response is parsed correctly and the problem is with the quotes. But I escape them on the app side, which is why I suspected that the bug was in parsing.

I tried using the Mozhi web interface and it has the same issue. I reported it here: https://codeberg.org/aryak/libmozhi/issues/6, it's a bug on their side. They added autodetection for Yandex, BTW.
Comment 6 Serge Roussak 2024-09-09 10:12:41 UTC
OK, thank you.

Could you please drop a line here when Mozhi is fixed? I have no account on Codeberg (and I think I'll never need one).
Comment 7 Gena 2024-09-09 10:25:27 UTC
Sure :)
Comment 8 Gena 2024-09-12 07:57:21 UTC
Fixed, already available in the https://mozhi.aryak.me instance.