Support Mistral AI’s new open-weight Voxtral models as an alternative STT engine for automatic subtitling.

Key Benefits:

* Higher Accuracy: Outperforms Whisper Large-v3 in speed and Word Error Rate (WER).
* Native Diarization: Built-in speaker identification to automatically label different voices in transcripts.
* Efficiency: Optimized for local hardware; the Mini-3B model provides high-quality results with low VRAM usage.
* Privacy/License: Apache 2.0 license, allowing for fully offline, private processing.

Proposed Integration: Add "Voxtral" to the STT engine list in Settings > Speech to Text, with model selection (Mini/Small) and a toggle for speaker diarization.
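To make the proposed settings concrete, here is a minimal sketch of how the three options (engine, model, diarization toggle) could be represented. It is only an illustration: every name in it (`SttEngine`, `VoxtralModel`, `SttSettings`, `format_cue`) is hypothetical and not taken from the application's code or from any Voxtral API, and it assumes diarization would only be offered when Voxtral is the selected engine.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SttEngine(Enum):
    WHISPER = "whisper"
    VOXTRAL = "voxtral"  # hypothetical new entry in the engine list


class VoxtralModel(Enum):
    MINI = "mini"
    SMALL = "small"


@dataclass
class SttSettings:
    """Hypothetical shape of the Settings > Speech to Text options."""
    engine: SttEngine = SttEngine.WHISPER
    voxtral_model: VoxtralModel = VoxtralModel.MINI
    diarization: bool = False

    def validate(self) -> None:
        # Assumption: the diarization toggle only applies to Voxtral.
        if self.diarization and self.engine is not SttEngine.VOXTRAL:
            raise ValueError("speaker diarization requires the Voxtral engine")


def format_cue(text: str, speaker: Optional[str], settings: SttSettings) -> str:
    """Prefix a subtitle line with its speaker label when diarization is on."""
    if settings.diarization and speaker:
        return f"[{speaker}] {text}"
    return text
```

With diarization enabled, `format_cue("Hello", "Speaker 1", settings)` yields a line prefixed with `[Speaker 1]`; with the toggle off, the text passes through unchanged.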
A demo is available here: https://huggingface.co/spaces/pandora-s/Voxtral-Subtitles