Summary: | Kdenlive mask generation plugin crashes when using CUDA | ||
---|---|---|---|
Product: | [Applications] kdenlive | Reporter: | Paul Brown <paul.brown> |
Component: | Video Effects & Transitions | Assignee: | Jean-Baptiste Mardelle <jb> |
Status: | RESOLVED FIXED | ||
Severity: | crash | CC: | snd.noise |
Priority: | NOR | ||
Version First Reported In: | git-master | ||
Target Milestone: | --- | ||
Platform: | Flatpak | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: |
Description
Paul Brown
2025-02-05 20:41:44 UTC
I can reproduce the crash but can't generate any log, because Kdenlive freezes instead of crashing when run under gdb.

There is a memory issue with SAM2 when trying to process a video longer than a few seconds.
How long is the zone you are trying to apply the mask to? Does it work if you try to create a mask for something like 10-20 frames?

With a few frames it does work, but with something like 2 seconds I get a freeze, a crash or this error:

Resize Array, COLS: 1
NumPy Array: {0: array([[2239, 1293]])}
NumPy Array: {0: array([1])}
using device: cuda:0
Traceback (most recent call last):
  File "/usr/share/kdenlive/scripts/automask/sam-objectmask.py", line 104, in <module>
    sam2_model = build_sam2(model_cfg, sam2_checkpoint, device=device)
  File "/home/farid/.local/share/kdenlive/venv-sam/lib/python3.13/site-packages/sam2/build_sam.py", line 94, in build_sam2
    model = model.to(device)
  File "/home/farid/.local/share/kdenlive/venv-sam/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1343, in to
    return self._apply(convert)
           ~~~~~~~~~~~^^^^^^^^^
  File "/home/farid/.local/share/kdenlive/venv-sam/lib/python3.13/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
    ~~~~~~~~~~~~~^^^^
  File "/home/farid/.local/share/kdenlive/venv-sam/lib/python3.13/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
    ~~~~~~~~~~~~~^^^^
  File "/home/farid/.local/share/kdenlive/venv-sam/lib/python3.13/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
    ~~~~~~~~~~~~~^^^^
  [Previous line repeated 4 more times]
  File "/home/farid/.local/share/kdenlive/venv-sam/lib/python3.13/site-packages/torch/nn/modules/module.py", line 930, in _apply
    param_applied = fn(param)
  File "/home/farid/.local/share/kdenlive/venv-sam/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1329, in convert
    return t.to(
           ~~~~^
        device,
        ^^^^^^^
        dtype if t.is_floating_point() or t.is_complex() else None,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        non_blocking,
        ^^^^^^^^^^^^^
    )
    ^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 5.76 GiB of which 9.31 MiB is free. Including non-PyTorch memory, this process has 286.00 MiB memory in use. Process 32606 has 296.00 MiB memory in use. Process 32629 has 298.00 MiB memory in use. Process 32651 has 358.00 MiB memory in use. Process 32676 has 738.00 MiB memory in use. Process 32697 has 1020.00 MiB memory in use. Process 32715 has 1020.00 MiB memory in use. Process 32734 has 1.12 GiB memory in use. Process 32758 has 698.00 MiB memory in use. Of the allocated memory 188.77 MiB is allocated by PyTorch, and 5.23 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

(In reply to Jean-Baptiste Mardelle from comment #2)
> There is a memory issue with SAM2 when trying to process a video longer than
> a few seconds.
> How long is the zone you are trying to apply the mask to?

22 seconds.

> Does it work if you
> try to create a mask for something like 10-20 frames?

Yes. I cut the video down to 5 seconds (125 frames) and it renders the mask with no problem.

I just pushed a rather large update to the object segmentation module. In Kdenlive Settings > Plugins > Object Detection, there is now a checkbox "Offload video to CPU to save GPU Memory". This causes SAM2 to use RAM instead of VRAM, which should allow you to create longer masks. For me, on a GPU with 12 GB, I could create a mask with a maximum of about 300 frames in Full HD. With the offload option, I can go up to 700 frames (on a system with 32 GB of RAM). Please check and let me know if it improved things for you.

I also improved user feedback during the process, and if the process crashes, you should be able to see a log.

I don't get a crash anymore after the changes. Thanks JB

(In reply to Jean-Baptiste Mardelle from comment #5)
> I just pushed a rather large update to the object segmentation module. In
> Kdenlive Settings > Plugins > Object Detection, there is now a checkbox
> "Offload video to CPU to save GPU Memory". This causes SAM2 to use RAM
> instead of VRAM, which should allow you to create longer masks. For me,
> on a GPU with 12 GB, I could create a mask with a maximum of about 300 frames
> in Full HD. With the offload option, I can go up to 700 frames (on a
> system with 32 GB of RAM). Please check and let me know if it improved things for you.
>
> I also improved user feedback during the process, and if the process crashes,
> you should be able to see a log.

Works for me too. Thanks!
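
For anyone reading the out-of-memory message quoted in the traceback above, it can help to add up the per-process figures it reports. The following is only a minimal sketch, not part of Kdenlive or PyTorch: the helper names `to_mib` and `summarize_oom` are made up for illustration, and the regular expressions assume the message wording exactly as it appears in this report.

```python
import re

# Excerpt copied from the error message quoted in this report.
OOM_MESSAGE = (
    "torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. "
    "GPU 0 has a total capacity of 5.76 GiB of which 9.31 MiB is free. "
    "Including non-PyTorch memory, this process has 286.00 MiB memory in use. "
    "Process 32606 has 296.00 MiB memory in use. "
    "Process 32629 has 298.00 MiB memory in use. "
    "Process 32651 has 358.00 MiB memory in use. "
    "Process 32676 has 738.00 MiB memory in use. "
    "Process 32697 has 1020.00 MiB memory in use. "
    "Process 32715 has 1020.00 MiB memory in use. "
    "Process 32734 has 1.12 GiB memory in use. "
    "Process 32758 has 698.00 MiB memory in use."
)

# A size as printed in the message, e.g. "286.00 MiB" or "1.12 GiB".
SIZE = r"([\d.]+) (MiB|GiB)"
MIB_PER_UNIT = {"MiB": 1.0, "GiB": 1024.0}


def to_mib(value, unit):
    """Convert a (number, unit) pair from the message to MiB."""
    return float(value) * MIB_PER_UNIT[unit]


def summarize_oom(message):
    """Hypothetical helper: extract the memory figures from the message."""
    total = re.search(r"total capacity of " + SIZE, message)
    free = re.search(r"of which " + SIZE + r" is free", message)
    own = re.search(r"this process has " + SIZE + r" memory in use", message)
    # "Process <pid> has ..." (capital P) lists the other GPU users.
    others = {
        int(pid): to_mib(value, unit)
        for pid, value, unit in re.findall(
            r"Process (\d+) has " + SIZE + r" memory in use", message
        )
    }
    return {
        "total_mib": to_mib(*total.groups()),
        "free_mib": to_mib(*free.groups()),
        "own_mib": to_mib(*own.groups()),
        "others_mib": others,
    }


summary = summarize_oom(OOM_MESSAGE)
others_total = sum(summary["others_mib"].values())
print(f"GPU capacity:      {summary['total_mib']:.2f} MiB")
print(f"Failing process:   {summary['own_mib']:.2f} MiB")
print(f"Other processes:   {others_total:.2f} MiB in {len(summary['others_mib'])} processes")
print(f"Reported free:     {summary['free_mib']:.2f} MiB")
```

On the figures quoted here, the eight other processes together hold roughly 5.4 GiB of the 5.76 GiB card, while the process that failed holds only 286 MiB. The message itself does not say what those other processes are, so this is only a hint about where to look; checking the PIDs with a tool such as `nvidia-smi` would be the next step.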