Bug 356452

Summary: Simon can't add Arabic or Hebrew words
Product: [Applications] simon
Reporter: Christopher Edward <christopher.edward>
Component: general
Assignee: Mario Fux <kde-ml>
Status: CONFIRMED ---
Severity: major
CC: spunkysos
Priority: NOR
Version: 0.4.1
Target Milestone: ---
Platform: Other
OS: All
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description Christopher Edward 2015-12-10 01:31:17 UTC
After I add Arabic words to the Active Vocabulary, Simon gets stuck while compiling the new model and then fails.
I can't compile a model with an Arabic word in the Vocabulary, such as السماء


Reproducible: Always

Steps to Reproduce:
1. Open Scenario / Word list / Active Vocabulary / Add Word.
2. Add an Arabic word such as ( السماء ) and train it.
3. Click Actions / Synchronize.

Actual Results:  
An error message appears:
Kmeans_init.exe has stopped working

Expected Results:  
The software should finish compiling the model without errors.

I also can't compile a model with symbols or punctuation marks in the vocabulary, such as . / ? , :
So I can't compile a new model if the Active Vocabulary contains anything other than English words.
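
The pattern reported above (only plain English letters compile) can be turned into a quick pre-check of a word list. This is a minimal sketch and not part of Simon: the function name `find_suspect_words` is hypothetical, and the rule that only ASCII letters are safe is an assumption drawn from this report, not from Simon's code.

```python
import string

# Assumption from this report: only plain English (ASCII) letters compile cleanly.
SAFE_CHARACTERS = set(string.ascii_letters)


def find_suspect_words(vocabulary):
    """Return the words that contain anything other than ASCII letters."""
    suspects = []
    for word in vocabulary:
        # An empty entry or any character outside the safe set is flagged.
        if not word or any(ch not in SAFE_CHARACTERS for ch in word):
            suspects.append(word)
    return suspects


if __name__ == "__main__":
    # Two ordinary English words, one Arabic word, one punctuation mark.
    print(find_suspect_words(["sky", "السماء", "?", "hello"]))
```

Running such a check before Actions / Synchronize would at least show which entries are likely to trigger the failure.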
Comment 1 Donald 2015-12-18 15:53:15 UTC
I CONFIRM this bug.
I can't synchronize, compile, or activate a module with Arabic words or letters.
Comment 2 Mario Fux 2016-01-07 12:08:32 UTC
Thanks Christopher and Edward for this bug report and confirmation. I hope to confirm it myself in the next week or two and then fix it for a final release of the Qt4-based version of Simon sometime in Q1/Q2 2016.
Comment 3 Christopher Edward 2016-01-19 05:22:47 UTC
Did you confirm the bug?
You can also reproduce it with symbols such as / . ? ! in the active dictionary; they produce the same error. Nothing other than English letters can be used.

How close are you to releasing a Qt4 version of Simon, and will it fix this?

Do you need financial support (donations) to speed up the development process?
Comment 4 Christopher Edward 2016-01-19 05:28:10 UTC
Can you confirm the bug?
You can also reproduce it with symbols such as / . ? ! in the active dictionary instead of Arabic or Hebrew; they produce the same error. Nothing other than English letters can be used.

How close are you to releasing a Qt4 version of Simon, and what will it fix?

Do you need financial support (donations) to speed up the development process?
Comment 5 Mario Fux 2017-02-02 12:58:55 UTC
So, finally coming back to this bug. There is some life in Simon again and we're working towards a last Qt4-based version. One goal, of course, is to fix all these bugs.

May I ask whether you can still trigger this bug with the current master version of Simon (from git)?

Thanks and thanks for your patience
Mario
Comment 6 Christopher Edward 2017-02-07 17:35:30 UTC
(In reply to Mario Fux from comment #5)
> So finally coming back to this bug. There is now again some life in Simon
> and we're working towards preparing a last Qt4-based version. A goal would
> of course also be to fix all these bugs.
> 
> Might I ask you if you are still able to trigger this bug with the current
> master version of Simon (from git)?
> 
> Thanks and thanks for your patience
> Mario

Wow, thanks for replying!
Sorry, I don't know how to build/compile from source, as I was using the pre-compiled Windows and Linux versions.
Can you provide an Alpha/beta version for me to test extensively?
Thank you
Comment 7 Mario Fux 2017-02-07 19:18:10 UTC
Morning Edward

I hope to have some tarballs in the coming weeks, and I'll try to remember to ping you about this then.

Mario
Comment 8 Christopher Edward 2017-02-08 12:09:26 UTC
(In reply to Mario Fux from comment #7)
> Morning Edward
> 
> Hope to have some tarballs in the coming weeks and then I'll try to keep in
> mind to ping you about this.
> 
> Mario

Please release an Alpha version for Windows.
Thanks Mario for your hard work
Comment 9 Mario Fux 2017-04-10 11:17:34 UTC
Morning Edward

We have since released the Alpha version, but it's available only in source form:
https://blogs.kde.org/2017/04/03/simon-0480-alpha-released

I talked briefly with some KDE Windows people, and we need to postpone a new Windows installer until after the KF5 port. But feel free to compile it yourself on Windows; we're open to feedback.

Sorry for the rather bad news for the moment.

Best regards
Mario
Comment 10 Mario Fux 2017-05-09 13:06:24 UTC
Ok, now back to this bug. Christopher, I can add the word () but get a sphinxtrain error as well. The one below:

You wrote this:
an error message appears:
Kmeans_init.exe has stopped working

Does this mean that "Kmeans_init.exe has stopped working" is your error message?

Thanks for quick feedback
Mario

/usr/bin/sphinxtrain -t default{201cb64d-4f61-49b4-9ed2-b3ddb536f6f0} setup
Sphinxtrain path: /usr/lib/sphinxtrain
Sphinxtrain binaries path: /usr/lib/sphinxtrain
Setting up the database default{201cb64d-4f61-49b4-9ed2-b3ddb536f6f0}

/usr/bin/sphinxtrain run
Sphinxtrain path: /usr/lib/sphinxtrain
Sphinxtrain binaries path: /usr/lib/sphinxtrain
Running the training

Configuration (e.g. etc/sphinx_train.cfg) not defined
Compilation failed in require at /usr/lib/sphinxtrain/scripts/000.comp_feat/slave_feat.pl line 51.
BEGIN failed--compilation aborted at /usr/lib/sphinxtrain/scripts/000.comp_feat/slave_feat.pl line 51.
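
The "Configuration (e.g. etc/sphinx_train.cfg) not defined" line suggests that `sphinxtrain run` did not find a configuration file relative to the directory it was started from. A small check like the following could rule that out before training. It is a sketch only: the helper name `has_sphinxtrain_config` is hypothetical, and the relative path is taken from the error message above.

```python
import os
import tempfile

# Relative path named in the sphinxtrain error message.
CONFIG_RELATIVE_PATH = os.path.join("etc", "sphinx_train.cfg")


def has_sphinxtrain_config(working_directory):
    """Return True if etc/sphinx_train.cfg exists below working_directory."""
    return os.path.isfile(os.path.join(working_directory, CONFIG_RELATIVE_PATH))


if __name__ == "__main__":
    # Demonstrate on a throwaway directory: missing first, present afterwards.
    with tempfile.TemporaryDirectory() as workdir:
        print(has_sphinxtrain_config(workdir))
        os.makedirs(os.path.join(workdir, "etc"))
        with open(os.path.join(workdir, CONFIG_RELATIVE_PATH), "w") as handle:
            handle.write("# placeholder\n")
        print(has_sphinxtrain_config(workdir))
```

If the check fails in the directory where training is started, the error above may be unrelated to the non-ASCII word itself.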
Comment 11 Christopher Edward 2017-05-14 12:35:07 UTC
(In reply to Mario Fux from comment #10)
> Ok now back to this bug. Christopher, I can add the word () bug get a
> sphinxtrain error as well. The one below:
> 
> You wrote this:
> an error message appears:
> Kmeans_init.exe has stopped working
> 
> Does this mean that "Kmeans_init.exe has stopped working" is your error
> message?
> 
> Thanks for a quick feedback
> Mario
> 
> /usr/bin/sphinxtrain -t default{201cb64d-4f61-49b4-9ed2-b3ddb536f6f0} setup
> Sphinxtrain path: /usr/lib/sphinxtrain
> Sphinxtrain binaries path: /usr/lib/sphinxtrain
> Setting up the database default{201cb64d-4f61-49b4-9ed2-b3ddb536f6f0}
> 
> /usr/bin/sphinxtrain run
> Sphinxtrain path: /usr/lib/sphinxtrain
> Sphinxtrain binaries path: /usr/lib/sphinxtrain
> Running the training
> 
> Configuration (e.g. etc/sphinx_train.cfg) not defined
> Compilation failed in require at
> /usr/lib/sphinxtrain/scripts/000.comp_feat/slave_feat.pl line 51.
> BEGIN failed--compilation aborted at
> /usr/lib/sphinxtrain/scripts/000.comp_feat/slave_feat.pl line 51.

The error message is: "Kmeans_init.exe has stopped working"

I have now retested on 0.4.1 and a different error showed up. Have a look:
https://youtu.be/aLSiF2y2FvQ
Comment 12 Mario Fux 2017-05-16 18:18:18 UTC
Morning Christopher

Thanks for trying again. I'll try to reproduce this on a Windows machine in the next few days. You're still using the 0.4.1 version of Simon, right?

And could you try to create a file named with the Arabic word in the folder the error message mentions? Either Windows can't create this kind of file or Simon is doing something wrong.

Thanks
Mario
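
The manual test suggested here (creating a file named with the Arabic word) could also be scripted. The following is a minimal sketch under stated assumptions: `can_create_file` is a hypothetical helper, the `.wav` suffix is only an example, and the target folder has to be replaced with the one from the error message.

```python
import os
import tempfile


def can_create_file(directory, name):
    """Return True if a file called name can be written, listed and read back."""
    path = os.path.join(directory, name)
    try:
        with open(path, "wb") as handle:
            handle.write(b"test")
        # The name must show up in the directory listing unchanged.
        listed = name in os.listdir(directory)
        with open(path, "rb") as handle:
            readable = handle.read() == b"test"
        return listed and readable
    except (OSError, UnicodeError):
        return False
    finally:
        # Clean up the probe file if it was created.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    # A temporary folder stands in for the folder named in the error message.
    with tempfile.TemporaryDirectory() as folder:
        for candidate in ("plain.wav", "السماء.wav"):
            print(candidate, can_create_file(folder, candidate))
```

If the Arabic name can be created this way but Simon still fails, the file system itself is unlikely to be the cause.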

Comment 13 Mario Fux 2017-05-16 18:39:43 UTC
Ok, just checked with Simon 0.4.1 on Windows. Can confirm it. It looks like a problem writing file names that contain characters other than ASCII ones. That's my current guess. I need to dig into the code to see where and how we create these wav files.
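
If non-ASCII file names do turn out to be the cause, one possible workaround is to derive an ASCII-only file name from each word. The sketch below is not Simon's implementation or its eventual fix; the helper names and the `w` prefix are hypothetical. It hex-encodes the UTF-8 bytes of the word, which is reversible and also covers symbols such as / . ? !

```python
def to_ascii_name(word):
    """Map any word to an ASCII-only name by hex-encoding its UTF-8 bytes."""
    # The "w" prefix is an arbitrary marker so the name never starts with a digit.
    return "w" + word.encode("utf-8").hex()


def from_ascii_name(name):
    """Recover the original word from a name produced by to_ascii_name."""
    return bytes.fromhex(name[1:]).decode("utf-8")


if __name__ == "__main__":
    for word in ("sky", "السماء", "?"):
        safe = to_ascii_name(word)
        # Print the safe name and confirm that the mapping round-trips.
        print(safe, from_ascii_name(safe) == word)
```

The price of this scheme is that the file names are no longer human-readable, so a lookup back to the original word is needed wherever the names are shown.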
Comment 14 Christopher Edward 2017-05-25 05:56:22 UTC
(In reply to Mario Fux from comment #13)
> Ok, just checked with Simon 0.4.1 on Windows. Can confirm it. Looks like a
> problem to write a file name with other chars then ASCII ones. That's my
> current guess. Need to dig into the code to see where and how we create this
> wav files.

Here is a UTF-8 text file with 3 Arabic words, along with a .docx file:
https://www.dropbox.com/s/x12xxn2dfqymznc/Some%20Arabic%20Words.txt?dl=0
https://www.dropbox.com/s/3tcshyptp2tz71z/Arabic%20words.docx?dl=0

As I said, using any characters other than English letters results in a failure.