Bug 515492 - Add digiKam AI Face Recognition for Video with SRT Sidecar files option.
Status: REPORTED
Alias: None
Product: digikam
Classification: Applications
Component: Faces-Recognition (other bugs)
Version First Reported In: 8.8.0
Platform: Other Other
Importance: NOR wishlist
Target Milestone: ---
Assignee: Digikam Developers
 
Reported: 2026-02-04 10:22 UTC by Chris Hernandez
Modified: 2026-02-04 11:56 UTC

Description Chris Hernandez 2026-02-04 10:22:51 UTC
I would like to propose an extension of digiKam’s "People" Face Management engine to support video files. Currently, digiKam is a leader in image metadata, but video "People" tagging remains a manual process. This feature would leverage existing AI models (YOLO/OpenVINO) to scan video files and generate time-coded face data.

Core Functional Requirements:
Video Face Scanning: 
Use a configurable interval (default: 1s) or keyframe-based analysis to detect and recognize faces within video containers. Reusing libraries already present in Kdenlive (for frame extraction/tracking) could avoid duplicated development work.
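
To illustrate the interval-based option, here is a minimal Python sketch of how the sample timestamps could be chosen. The function name and parameters are hypothetical and not part of digiKam; actual frame extraction and face detection are out of scope here.

```python
def fixed_interval_times(duration_s: float, interval_s: float = 1.0) -> list[float]:
    """Return sample timestamps 0, interval, 2*interval, ... strictly below duration_s.

    Hypothetical helper: the 1 s default mirrors the default proposed in this report.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    times = []
    i = 0
    # Multiply by the step index instead of accumulating, to avoid float drift.
    while i * interval_s < duration_s:
        times.append(round(i * interval_s, 3))
        i += 1
    return times
```

Each returned timestamp would then be handed to whatever decoder seeks to that position and extracts one frame for detection.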

Probability Grouping:
Detected faces should be grouped in the "People" sidebar based on match certainty, similar to the current image workflow, allowing for bulk confirmation or rejection.

MWG Metadata Embedding:
Once confirmed, names should be written to the video's XMP metadata (Keywords/PersonInImage) using the ExifTool backend.
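
As a rough illustration of the metadata step, the sketch below builds an ExifTool command line that appends confirmed names to XMP-iptcExt:PersonInImage and XMP-dc:Subject. This is an assumption about how it could be done from the command line, not a description of digiKam's internal ExifTool integration; the function name is hypothetical, and face region tags are not covered.

```python
def exiftool_person_args(video_path: str, names: list[str]) -> list[str]:
    """Build an ExifTool argument list that appends each name to two XMP list tags.

    Hypothetical helper; only constructs the arguments, it does not run ExifTool.
    """
    args = ["exiftool", "-overwrite_original"]
    for name in names:
        # "+=" adds an item to a list-type tag instead of replacing the list.
        args.append(f"-XMP-iptcExt:PersonInImage+={name}")
        args.append(f"-XMP-dc:Subject+={name}")
    args.append(video_path)
    return args
```

The list can be passed to subprocess.run(args, check=True). Note that "+=" does not check for duplicates, so a real implementation would need to read the existing values first.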

SRT Face-Appearance Generation:
A distinctive feature: export appearance timestamps as SRT sidecar files named [filename]_([face tag]).srt. Standard video players (VLC, etc.) could then display "Face Subtitles", and users could search the sidecar files for specific appearances.
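
One way the per-sample detections could be turned into appearance ranges for the SRT export is sketched below in Python. It assumes one face tag at a time and a sorted or unsorted list of timestamps at which that face was recognized; the function name and the max_gap_s parameter are hypothetical.

```python
def merge_hits(hit_times, sample_interval_s, max_gap_s=None):
    """Merge per-sample hit timestamps into (start, end) appearance intervals.

    Hits no more than max_gap_s apart are treated as one continuous appearance.
    max_gap_s defaults to the sampling interval (an assumption of this sketch).
    """
    if max_gap_s is None:
        max_gap_s = sample_interval_s
    runs = []  # each run is [first_hit, last_hit]
    for t in sorted(hit_times):
        if runs and t - runs[-1][1] <= max_gap_s + 1e-9:
            runs[-1][1] = t  # extend the current appearance
        else:
            runs.append([t, t])  # start a new appearance
    # Each appearance is assumed to last until one sample interval after its last hit.
    return [(start, last + sample_interval_s) for start, last in runs]
```

With 1 s sampling, hits at 0, 1, 2, 7 and 8 seconds would become two subtitle entries rather than five.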


Use Case and Benefit:
This would make digiKam the first open-source DAM to offer "Face-Searchable" video. For users with large archives, this solves the problem of finding a specific person inside hours of video without having to watch the footage manually.

Technical Suggestions:
Provide a "Minimum interval between detections" setting to prevent SRT bloat.
Where keyframe spacing is unsuitable (too sparse, as in long-GOP encodes, or too dense, as in uncompressed or intra-only video), allow a fallback to a fixed temporal interval (e.g., scan every 1 second).
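
The two suggestions above could look roughly like this in Python. Both helpers and their parameters are hypothetical; in particular, "minimum interval between detections" is interpreted here as dropping detections that follow the previous kept one too closely, which is only one possible reading.

```python
def keyframes_too_sparse(keyframe_times, duration_s, max_gap_s):
    """Return True if any gap between consecutive keyframes, or between a
    keyframe and the start/end of the clip, exceeds max_gap_s."""
    points = [0.0] + sorted(keyframe_times) + [duration_s]
    return any(b - a > max_gap_s for a, b in zip(points, points[1:]))


def thin_times(times, min_gap_s):
    """Keep only timestamps at least min_gap_s after the previously kept one."""
    kept = []
    for t in sorted(times):
        if not kept or t - kept[-1] >= min_gap_s:
            kept.append(t)
    return kept
```

A scanner could call keyframes_too_sparse() once per file to decide whether to fall back to fixed-interval sampling, and thin_times() before writing the SRT entries.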


[video_file_name]_([face tag]).srt
<begin SRT file contents>
NOTE
This SRT file shows all instances of [face tag] found in [filename]
Minimum keyframe interval - [#] second(s)
Generated by [user] with digiKam Video AI

1
$[HH:MM:SS,mmm] --> [HH:MM:SS,mmm]
$[face tag] - [x, y, w, h]

2
$[HH:MM:SS,mmm] --> [HH:MM:SS,mmm]
$[face tag] - [x, y, w, h]
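
For reference, a minimal Python sketch of how the template above could be rendered. All function names are hypothetical. It assumes [video_file_name] means the file name without extension and treats the "$" prefixes as placeholder markers rather than literal characters. The NOTE header is left out because NOTE blocks come from WebVTT and SRT parsers may not accept them.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def sidecar_name(video_stem: str, face_tag: str) -> str:
    """Build the proposed sidecar file name [filename]_([face tag]).srt."""
    return f"{video_stem}_({face_tag}).srt"


def render_srt(face_tag, appearances):
    """Render numbered SRT cues.

    appearances: list of (start_s, end_s, (x, y, w, h)) tuples for one face tag.
    """
    cues = []
    for index, (start, end, (x, y, w, h)) in enumerate(appearances, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{face_tag} - [{x}, {y}, {w}, {h}]\n"
        )
    # Cues are separated by one blank line.
    return "\n".join(cues)
```

Writing the result is then a matter of opening sidecar_name(stem, tag) next to the video and writing render_srt(tag, appearances) as UTF-8 text.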