457160 – Speech Recognition fails with vosk-model-en-us-0.21

Status:	RESOLVED NOT A BUG

Alias:	None

Product:	kdenlive
Classification:	Applications
Component:	Video Effects & Transitions (other bugs)
Version First Reported In:	21.04.3
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Vincent PINON

Reported:	2022-07-26 13:05 UTC by Ian
Modified:	2022-10-13 08:13 UTC
CC List:	1 user

Description Ian 2022-07-26 13:05:32 UTC

SUMMARY
Speech Recognition works OK with vosk-model-en-us-daanzu-20200905, although the accuracy leaves something to be desired. I thought I would try a bigger speech model to see if that would work better, but when I tried it, kdenlive just terminated like a 'Mission Impossible Cassette'.

STEPS TO REPRODUCE
1. Configure kdenlive to use either vosk-model-en-us-daanzu-20200905 or vosk-model-en-us-0.21
2. Load a video clip with English speech available
3. Set the 'selection bar' as required
4. Project->Subtitles->Speech Recognition

OBSERVED RESULT
If using vosk-model-en-us-daanzu-20200905, a set of subtitles is produced. OK!
If using vosk-model-en-us-0.21 the program terminates without giving any warning or obvious error message.

EXPECTED RESULT
I would expect either model to produce a set of subtitles. If there is a problem, I would expect a message to advise me of the problem.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma:  Linux Fedora 35 (x86_64)

Comment 1 erjiang 2022-07-28 03:45:21 UTC

Oof, got `Floating point exception (core dumped)` using Kdenlive 21.12.3 AppImage (Ubuntu) with vosk-model-en-us-0.22.

However, I wasn't able to reproduce it with 22.04.3. Can you test with a newer version to see if it's been fixed already?

Comment 2 Bug Janitor Service 2022-08-12 04:35:54 UTC

Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!

Comment 3 Ian 2022-08-14 16:36:46 UTC

I expected that a newer version of kdenlive would upgrade automatically - but discovered today that I have to initiate it manually.

I installed version 22.04.3 and tested it with vosk-model-en-us-daanzu-20200905, vosk-model-en-us-0.21 and vosk-model-en-us-0.22.

With vosk-model-en-us-daanzu-20200905 it behaved as before - an acceptable (although occasionally inaccurate) subtitle file.

With vosk-model-en-us-0.21 I got the same sudden termination as before, with absolutely no report of failure (although on restarting kdenlive it knew that there had been a failure).

With vosk-model-en-us-0.22 I got a message that the speech recognition had aborted (in the dialogue box where it would normally say 'Speech Recognition has started').

Comment 4 Ian 2022-10-08 10:38:16 UTC

Today I installed kdenlive 22.08.1 and tested Speech Recognition.

With vosk-model-en-us-daanzu-20200905, subtitles were produced OK.
With vosk-model-en-us-0.21, subtitles were produced OK (different/better than with the other model, but that was what I expected).
With vosk-model-en-us-0.22 kdenlive crashed without reporting any problem.

Comment 5 erjiang 2022-10-09 05:49:08 UTC

(In reply to Ian from comment #4)
> Today I installed kdenlive 22.08.1 and tested Speech Recognition.
> 
> With vosk-model-en-us-daanzu-20200905, subtitles were produced OK.
> With vosk-model-en-us-0.21, subtitles were produced OK (different/better
> than with the other model, but that was what I expected).
> With vosk-model-en-us-0.22 kdenlive crashed without reporting any problem.

I was able to use vosk-model-en-us-0.22 successfully with the kdenlive 22.08.1 AppImage on Ubuntu 22.04.1, so maybe we should look at the differences between your setup and mine. Could you let us know which kdenlive package (AppImage, Flatpak, PPA) you're using and which operating system you're on? Do you have enough memory for this model? (A casual test with vosk-model-en-us-0.22 showed it peaking at 6 GB of memory usage.)
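
For anyone who wants to check the memory question before loading a large model, here is a minimal sketch. It assumes Linux (it parses text in the `/proc/meminfo` format) and uses the 6 GB peak from the casual test above as a rough threshold, not an official requirement. The function names and the `ASSUMED_PEAK_KIB` constant are hypothetical and are not part of Kdenlive or Vosk.

```python
# Rough pre-flight check: is there enough available RAM for a large model?
# Assumptions: Linux /proc/meminfo format; the 6 GiB figure is only the peak
# seen in one casual test with vosk-model-en-us-0.22, not a documented limit.

ASSUMED_PEAK_KIB = 6 * 1024 * 1024  # 6 GiB expressed in KiB (hypothetical threshold)


def parse_mem_available_kib(meminfo_text):
    """Return the MemAvailable value in KiB, or None if the field is missing."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # The line looks like: "MemAvailable:    5123456 kB"
            return int(line.split()[1])
    return None


def has_enough_memory(meminfo_text, needed_kib=ASSUMED_PEAK_KIB):
    """True if MemAvailable is at least needed_kib; False otherwise or if unknown."""
    available = parse_mem_available_kib(meminfo_text)
    return available is not None and available >= needed_kib
```

On a Linux machine this could be called as `has_enough_memory(open("/proc/meminfo").read())` before starting recognition with the larger model.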

Comment 6 Ian 2022-10-10 09:50:15 UTC

My PC has 8 GB of memory (actually, it seems that half a GB is taken for the display).
I have just ordered 16 GB of memory and I will see what effect that has.

Comment 7 Ian 2022-10-13 08:13:40 UTC

Before I installed the new memory, I had one more try using vosk-model-en-us-0.22 and running System Monitor to watch the memory utilization.  Memory use went up to 98% and Swap went up to over 50% before everything froze.  I fixed that by powering the machine off.

I installed the new 16GB of memory and repeated the test (including System Monitor).  This time, memory use got to just over 50% and Swap was not used and everything worked perfectly.
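
The System Monitor observation above could also be captured from a script. This is only a sketch, assuming Linux (where `ru_maxrss` is reported in KiB) and assuming the recognition step can be started as a separate command; `peak_child_memory_kib` is a hypothetical helper, not something Kdenlive provides.

```python
import resource
import subprocess
import sys


def peak_child_memory_kib(command):
    """Run command to completion, then return the peak resident set size (KiB
    on Linux) of the largest child process this script has waited for so far."""
    subprocess.run(command, check=False)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Stand-in workload: a child Python process that allocates about 10 MB.
    demo = [sys.executable, "-c", "x = bytearray(10_000_000)"]
    print("Peak child memory (KiB):", peak_child_memory_kib(demo))
```

Note that the value covers all children waited for so far, so it is most meaningful when the script launches a single command.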

Problem Solved!  The only problem was that the program would crash without an explanation.
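
On the missing explanation: one general way for a front end to report this kind of failure is to run the heavy job in a separate process and inspect how it ended. The sketch below is an assumption about how such reporting could look, not a description of how Kdenlive handles it; `describe_exit` and `run_and_report` are hypothetical names. On POSIX systems a negative return code from `subprocess` means the child was killed by a signal, for example SIGFPE (shown by the shell as "Floating point exception") or SIGKILL (which the Linux out-of-memory killer uses).

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable message."""
    if returncode == 0:
        return "finished normally"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "terminated by " + name
    return "exited with error code %d" % returncode


def run_and_report(command):
    """Run command in a separate process and describe how it ended."""
    result = subprocess.run(command)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Stand-in for a recognition job that fails with exit code 2.
    print(run_and_report([sys.executable, "-c", "import sys; sys.exit(2)"]))
```

With something like this, a job killed for running out of memory would at least surface as "terminated by SIGKILL" instead of disappearing silently.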