Bug 467573 - [EXPERIMENTAL] Crash on 1660S when trying to use GPU for Whisper subtitles
Status: RESOLVED FIXED
Alias: None
Product: kdenlive
Classification: Applications
Component: Video Display & Export
Version: unspecified
Platform: Arch Linux Linux
Importance: NOR crash
Target Milestone: ---
Assignee: erjiang
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-03-19 16:57 UTC by calibre705
Modified: 2023-10-18 12:06 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
The log given by Kden upon trying to compute using Whisper on GPU. (1.93 KB, text/plain)
2023-03-19 16:57 UTC, calibre705

Description calibre705 2023-03-19 16:57:57 UTC
Created attachment 157420
The log given by Kden upon trying to compute using Whisper on GPU.

SUMMARY

When I try to use my GTX 1660S GPU for Whisper voice recognition (under subtitles), PyTorch crashes and no subtitles are produced.

STEPS TO REPRODUCE
(requires experimental build 526 of the appImage/Version 23.07.70 (rev. b7fd236cd))
1. Add speech to audio track
2. Go to Settings -> Configure Kdenlive -> Speech To Text and select Whisper as the required Speech Engine
3. Set the device to your GPU (GTX 1660S or another GTX 16XX series card); a restart may be required before it shows up
4. Apply and ok the window
5. Select audio track
6. Go to Project -> Subtitles -> Speech recognition
7. Use any combination of settings in the dialog
8. When the dialog reports that it has finished, click on "show log": the log shows that the process crashed, and nothing has been added to the timeline

OBSERVED RESULT

No subtitles added, voice recognition crashed.

EXPECTED RESULT

Subtitles added, no crash.

SOFTWARE/OS VERSIONS:
Linux/KDE Plasma: Arch Rolling
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: Version 5.104.0
Qt Version: Version 5.15.8 (built against 5.15.8)

ADDITIONAL INFORMATION

This is a common problem on GTX 16XX series cards. I believe the options --no-half --precision=full --use-cudnn are used as a workaround when passed through to PyTorch.
Comment 1 erjiang 2023-03-27 04:32:43 UTC
Searching the Web, there's a similar bug reported here against Whisper: https://github.com/openai/whisper/discussions/88

Seems like what you said about not using fp16 (half-precision) can work around the issue: maybe you can try modifying the whisper code to not use fp16 and see if that fixes it?
Comment 2 calibre705 2023-03-29 16:12:48 UTC
(In reply to erjiang from comment #1)
> Searching the Web, there's a similar bug reported here against Whisper:
> https://github.com/openai/whisper/discussions/88
> 
> Seems like what you said about not using fp16 (half-precision) can work
> around the issue: maybe you can try modifying the whisper code to not use
> fp16 and see if that fixes it?

Changing data/scripts/whispertosrt.py:
line 44: result = model.transcribe(source, task=sys.argv[5], language=sys.argv[6], verbose=False, fp16 = False)
line 46: result = model.transcribe(source, task=sys.argv[5], verbose=False, fp16 = False)

Changing data/scripts/whispertotext.py:
line 47: result = model.transcribe(source, task=sys.argv[4], language=sys.argv[5], verbose=False, fp16 = False)
line 49: result = model.transcribe(source, task=sys.argv[4], verbose=False, fp16 = False)

This seems to fix it: GPU usage looks good and it's very fast. It took a fair bit of figuring out, given the complete lack of documentation. However, this will be slower for non-16XX GPUs. A possible improvement would be to detect whether the GPU selected for use is a 16XX card and disable fp16 only in that case. If you could make this into a commit, that would be great!
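
A minimal sketch of such a detection step, not code from Kdenlive. The helper name should_use_fp16 and the name-matching regex are hypothetical, and matching on the device name is only an assumption about how 16XX cards could be recognised:

```python
import re

# Hypothetical pattern: matches device names such as
# "NVIDIA GeForce GTX 1660 SUPER" or "GTX 1650 Ti".
_GTX_16XX = re.compile(r"\bGTX\s*16\d{2}", re.IGNORECASE)


def should_use_fp16(device, gpu_name=None):
    """Return False for non-CUDA devices and for GTX 16XX cards, True otherwise."""
    if not device.startswith("cuda"):
        return False
    if gpu_name is None:
        # Imported lazily so the name check can be exercised without PyTorch.
        import torch
        gpu_name = torch.cuda.get_device_name(torch.device(device))
    return _GTX_16XX.search(gpu_name) is None
```

The result could then be passed to the existing calls, for example model.transcribe(source, task=sys.argv[5], verbose=False, fp16=should_use_fp16(device)), where device is whatever device string the script already receives.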
Comment 3 Bug Janitor Service 2023-04-01 19:52:54 UTC
A possibly relevant merge request was started @ https://invent.kde.org/multimedia/kdenlive/-/merge_requests/399
Comment 4 Jean-Baptiste Mardelle 2023-05-15 11:27:33 UTC
Git commit 856fdf59a631e53aa0ce94decd5d8f921c135f28 by Jean-Baptiste Mardelle.
Committed on 15/05/2023 at 11:27.
Pushed by mardelle into branch 'master'.

Add an option to manually disable FP16 on Whisper in settings page.

M  +8    -3    data/scripts/whispertotext.py
M  +0    -4    src/dialogs/kdenlivesettingsdialog.cpp
M  +6    -1    src/dialogs/speechdialog.cpp
M  +8    -3    src/dialogs/textbasededit.cpp
M  +4    -0    src/kdenlivesettings.kcfg
M  +19   -12   src/ui/configspeech_ui.ui

https://invent.kde.org/multimedia/kdenlive/commit/856fdf59a631e53aa0ce94decd5d8f921c135f28
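
The diff itself is not reproduced in this report. As a rough illustration of the general idea only, a manual "disable FP16" setting could be turned into keyword arguments for model.transcribe along these lines. The helper name build_transcribe_kwargs and its disable_fp16 parameter are hypothetical and not taken from the commit:

```python
def build_transcribe_kwargs(task, language=None, disable_fp16=False):
    """Assemble keyword arguments for whisper's model.transcribe().

    Hypothetical helper: disable_fp16 stands in for whatever value the
    settings page hands to the script.
    """
    kwargs = {"task": task, "verbose": False}
    if language:
        # Only force a language when one was selected; otherwise Whisper detects it.
        kwargs["language"] = language
    if disable_fp16:
        # Fall back to full precision, as in the workaround from comment 2.
        kwargs["fp16"] = False
    return kwargs
```

A call would then look like result = model.transcribe(source, **build_transcribe_kwargs(task, language, disable_fp16)), which also replaces the two near-identical transcribe calls per script with one.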