Bug 490661

Summary: Whisper Speech to Text Error: NumPy: _ARRAY_API not found
Product: [Applications] kdenlive
Reporter: candide <m8oxu4nv3>
Component: Video Effects & Transitions
Assignee: Jean-Baptiste Mardelle <jb>
Status: CONFIRMED ---
Severity: normal
CC: fritzibaby
Priority: NOR
Keywords: triaged
Version First Reported In: 24.05.2
Target Milestone: ---
Platform: Microsoft Windows
OS: Microsoft Windows
Latest Commit:
Version Fixed In: 24.12
Sentry Crash Report:

Description candide 2024-07-22 17:11:50 UTC
SUMMARY

I get this error when I try to use Whisper for Subtitles: 

Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:84.) t = torch.empty((0,), dtype=storage.dtype, device=storage._untyped_storage.device)


STEPS TO REPRODUCE
1. Import a video with sound (in English)
2. Use the Edit Subtitle Tool
3. Click on the Speech Recognition Magic Wand
4. Use the Base model, on the Full Project, and set "Autodetect" as the Language

Python is set to "local install" because the virtual environment option didn't work.

OBSERVED RESULT
It says "Subtitles imported", but no subtitles are there

EXPECTED RESULT
Produces subtitles, like VOSK

SOFTWARE/OS VERSIONS
Windows: 10
KDE Frameworks Version: 6.3.0
Qt Version: 6.7.1

ADDITIONAL INFORMATION
Comment 1 Jean-Baptiste Mardelle 2024-08-13 10:59:22 UTC
Git commit 98d90275df62dcae1e5cb4718482151ca8bcbab3 by Jean-Baptiste Mardelle.
Committed on 13/08/2024 at 10:58.
Pushed by mardelle into branch 'release/24.08'.

Fix possible crash on python install and enforce correct packages for Windows Whisper

M  +1    -1    data/scripts/checkpackages.py
M  +1    -0    data/scripts/requirements-whisper-windows.txt
M  +1    -1    src/pythoninterfaces/abstractpythoninterface.cpp

https://invent.kde.org/multimedia/kdenlive/-/commit/98d90275df62dcae1e5cb4718482151ca8bcbab3
Comment 2 Jean-Baptiste Mardelle 2024-08-13 11:14:40 UTC
Thanks for your report. This is caused by an incompatibility between torch's latest version (2.4.0) and numpy / numba. 
These 2 last dependencies have to be downgraded. This can be done in a terminal with the following commands:
python.exe -m pip install -I numpy==1.26.0
python.exe -m pip install -I numba==0.59

This should downgrade the two packages and make whisper work again. For Kdenlive 24.08, clicking on the "Check for upgrade" button in Whisper settings should automatically fix it.
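
To confirm that the downgrade took effect, the installed versions can be inspected from the same Python interpreter. The following is only a minimal sketch, not part of Kdenlive: the helper names (`installed_version`, `major_version`, `numpy_is_pre_2`) are made up for illustration, and it assumes the script is run with the same `python.exe` that Kdenlive uses.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Extract the leading integer of a version string such as '1.26.0'."""
    return int(version.split(".")[0])


def numpy_is_pre_2(version):
    """True if `version` is a NumPy 1.x release (hypothetical helper)."""
    return version is not None and major_version(version) < 2


if __name__ == "__main__":
    # Package names taken from the comment above; torch is listed for context.
    for name in ("numpy", "numba", "torch"):
        print(f"{name}: {installed_version(name)}")
    if not numpy_is_pre_2(installed_version("numpy")):
        print("NumPy 1.x is not what this interpreter sees.")
```

If the script still reports a NumPy 2.x version after running the pip commands, the downgrade most likely went to a different Python installation than the one being checked.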
Comment 3 Bug Janitor Service 2024-08-28 03:47:38 UTC
๐Ÿ›๐Ÿงน โš ๏ธ This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information, then set the bug status to REPORTED. If there is no change for at least 30 days, it will be automatically closed as RESOLVED WORKSFORME.

For more information about our bug triaging procedures, please read https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging.

Thank you for helping us make KDE software even better for everyone!
Comment 4 candide 2024-09-02 22:53:35 UTC
(In reply to Jean-Baptiste Mardelle from comment #2)
> Thanks for your report. This is caused by an incompatibility between torch's
> latest version (2.4.0) and numpy / numba. 
> These 2 last dependencies have to be downgraded. This can be done in a
> terminal with the following commands:
> python.exe -m pip install -I numpy==1.26.0
> python.exe -m pip install -I numba==0.59
> 
> This should downgrade the two packages and make whisper work again. For
> Kdenlive 24.08, clicking on the "Check for upgrade" button in Whisper
> settings should automatically fix it.

Where is the "Check for upgrade" button? I'm running 24.08 now, ran the listed python commands, and I'm still getting an error:

```
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "C:\Program Files\kdenlive\bin\data\kdenlive\scripts\whispertosrt.py", line 118, in <module>
    sys.exit(main(sys.argv[1], # source AV file
  File "C:\Program Files\kdenlive\bin\data\kdenlive\scripts\whispertosrt.py", line 65, in main
    result = whispertotext.run_whisper(source, model, device, task, args)
  File "C:\Program Files\kdenlive\bin\data\kdenlive\scripts\whispertotext.py", line 53, in run_whisper
    model = whisper.load_model(model, device)
  File "C:\Users\user\AppData\Roaming\Python\Python312\site-packages\whisper\__init__.py", line 146, in load_model
    checkpoint = torch.load(fp, map_location=device)
  File "C:\Users\user\AppData\Roaming\Python\Python312\site-packages\torch\serialization.py", line 1025, in load
    return _load(opened_zipfile,
  File "C:\Users\user\AppData\Roaming\Python\Python312\site-packages\torch\serialization.py", line 1446, in _load
    result = unpickler.load()
  File "C:\Users\user\AppData\Roaming\Python\Python312\site-packages\torch\_utils.py", line 202, in _rebuild_tensor_v2
    tensor = _rebuild_tensor(storage, storage_offset, size, stride)
  File "C:\Users\user\AppData\Roaming\Python\Python312\site-packages\torch\_utils.py", line 180, in _rebuild_tensor
    t = torch.empty((0,), dtype=storage.dtype, device=storage._untyped_storage.device)
C:\Users\user\AppData\Roaming\Python\Python312\site-packages\torch\_utils.py:180: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:84.)
  t = torch.empty((0,), dtype=storage.dtype, device=storage._untyped_storage.device)
Traceback (most recent call last):
  File "C:\Program Files\kdenlive\bin\data\kdenlive\scripts\whispertosrt.py", line 118, in <module>
    sys.exit(main(sys.argv[1], # source AV file
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\kdenlive\bin\data\kdenlive\scripts\whispertosrt.py", line 65, in main
    result = whispertotext.run_whisper(source, model, device, task, args)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\kdenlive\bin\data\kdenlive\scripts\whispertotext.py", line 53, in run_whisper
    model = whisper.load_model(model, device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\AppData\Roaming\Python\Python312\site-packages\whisper\__init__.py", line 154, in load_model
    model.set_alignment_heads(alignment_heads)
  File "C:\Users\user\AppData\Roaming\Python\Python312\site-packages\whisper\model.py", line 251, in set_alignment_heads
    mask = torch.from_numpy(array).reshape(
           ^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Numpy is not available
```
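
The traceback shows packages being loaded from the per-user `site-packages` directory, and NumPy 2.0.1 is still reported after the downgrade commands were run. One possible explanation (an assumption, not confirmed in this report) is that the pip commands targeted a different interpreter or install location than the one Kdenlive runs. A small diagnostic sketch along these lines, with hypothetical helper names, prints which interpreter is running and where each module would be imported from, without actually importing it:

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file path a top-level module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def describe_environment(names=("numpy", "numba", "torch", "whisper")):
    """Build one report line for the interpreter and one per module name."""
    lines = [f"interpreter: {sys.executable}"]
    for name in names:
        lines.append(f"{name}: {module_origin(name) or 'not found'}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe_environment()))
```

Running this with the `python.exe` used for the pip commands and comparing the paths with those in the traceback shows whether both refer to the same installation.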
Comment 5 emohr 2024-12-01 17:25:04 UTC
Please try with the upcoming version 24.12. The whole Speech-to-Text part was refactored and should be more stable.