COM component

Scene Detector

Video scene detection for developers. Integrate it into your application via COM.

Scene Detector is a COM component built on HandySaw DS scene recognition technology. It processes a video file and returns a list of detected scenes with frame-accurate boundaries. The component can also generate thumbnails for each scene and report newly found scenes in real time via callback interfaces, so you do not have to wait for the full detection pass to finish.

Because it follows the standard COM architecture, the component is usable from virtually any Windows development environment. If you need a standalone desktop tool rather than a developer component, HandySaw DS is the right choice.

Broad format support

Processes any video file playable by Windows Media Player via the DirectShow API.

Real-time callbacks

Reports each new scene immediately as it is found. Your application receives results during processing.

Thumbnail generation

Optionally saves the first and last frames of each scene as BMP or JPEG files at any scale.

RGB and YUV color spaces

Detection can run in RGB24 or YUY2 color space; each has its own recommended threshold defaults.

White and black scene detection

Configurable luma thresholds identify fade-to-white and fade-to-black transitions as separate scenes.

Partial file processing

Specify start and stop positions to detect scenes only within a chosen segment of the media file.

Custom filter pipeline

Override the source, splitter, and decoder DirectShow filters used internally for maximum compatibility.

Short-scene merging

Scenes shorter than a configurable minimum are automatically merged with their neighbors or removed.

Contact me for a demo version and pricing information.

sdScenesMergeKind enum

Specifies how short scenes are handled after detection.

sdMergeDelete = 0 Short scene is removed.
sdMergeWithPrev = 1 Short scene is merged with the previous scene.
sdMergeWithNext = 2 Short scene is merged with the next scene.
sdMergeWithBoth = 3 Short scene, the previous scene, and the next scene are all merged into one.

DetectorParameters structure

Core scene detection engine parameters.

BSTR VideoFileName Full path to the video file to process.
int Threshold Detection sensitivity. Range 0–255. Recommended defaults: 19 (RGB) and 5 (YUV). A new scene begins when the difference between two consecutive frames exceeds this value. Lower values produce more scenes.
int UpLumaTresh Fade-to-white threshold. Range −1 to 255. A value of −1 disables this feature. When overall frame brightness exceeds this value, a new scene begins, grouping bright frames together.
int BottomLumaTresh Fade-to-black threshold. Range −1 to 255. A value of −1 disables this feature. When overall frame brightness falls below this value, a new scene begins, grouping dark frames together.
int MinSceneLength Minimum scene duration in frames. Scenes shorter than this value are modified according to the Merge field. Recommended default: 5.
int RegisterGraph Registration of the internal processing graph in the Running Object Table (ROT). 1: register. 0: do not register.
BSTR SourceFilterMoniker Display name of the desired source filter. NULL: use default.
BSTR SplitterFilterMoniker Display name of the desired splitter filter. NULL: use default.
BSTR DecoderFilterMoniker Display name of the desired decoder filter. NULL: use default.
sdScenesMergeKind Merge Merge strategy applied when a detected scene is shorter than MinSceneLength.
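
The interaction between MinSceneLength and Merge can be pictured with the portable C++ sketch below. It is only an illustration of the four documented strategies, not the component's actual code. The Scene type and the MergeShortScenes function are hypothetical names, stop frames are assumed to be inclusive, and a short scene with no neighbor on the requested side is assumed to be kept unchanged.

```cpp
#include <cstddef>
#include <vector>

// Mirrors the documented sdScenesMergeKind values.
enum sdScenesMergeKind {
    sdMergeDelete = 0,
    sdMergeWithPrev = 1,
    sdMergeWithNext = 2,
    sdMergeWithBoth = 3
};

// Hypothetical helper type: one scene as start/stop frame numbers.
// Assumption: stop is inclusive, so length = stop - start + 1.
struct Scene {
    long start;
    long stop;
};

// Illustrative post-processing pass over a list of detected scenes.
// The list is taken by value because merging with the next scene
// rewrites the start of a later element.
std::vector<Scene> MergeShortScenes(std::vector<Scene> work,
                                    long minLength,
                                    sdScenesMergeKind kind)
{
    std::vector<Scene> out;
    for (std::size_t i = 0; i < work.size(); ++i) {
        const Scene s = work[i];
        const bool isShort = (s.stop - s.start + 1) < minLength;
        const bool hasPrev = !out.empty();
        const bool hasNext = i + 1 < work.size();

        if (!isShort) {
            out.push_back(s);
        } else if (kind == sdMergeDelete) {
            // The short scene is dropped from the result.
        } else if (kind != sdMergeWithNext && hasPrev) {
            // sdMergeWithPrev or sdMergeWithBoth: extend the previous scene.
            out.back().stop = s.stop;
            if (kind == sdMergeWithBoth && hasNext) {
                // Also absorb the following scene and skip over it.
                out.back().stop = work[i + 1].stop;
                ++i;
            }
        } else if (kind != sdMergeWithPrev && hasNext) {
            // sdMergeWithNext (or sdMergeWithBoth without a previous scene):
            // the next scene now starts where the short one started.
            work[i + 1].start = s.start;
        } else {
            // Assumption: no neighbor on the requested side, keep as-is.
            out.push_back(s);
        }
    }
    return out;
}
```

With scenes 0..99, 100..102, and 103..200 and a minimum length of 5, this sketch yields two scenes for the first three strategies and a single scene 0..200 for sdMergeWithBoth.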

DetectorParameters2 structure

Extends DetectorParameters with color space selection and partial-file processing. Contains all fields of DetectorParameters plus the following.

int UseYUV Color space. 1: YUY2. 0: RGB24.
int Pad Reserved.
double StartPosition Processing start position in seconds. When non-zero, all returned scene times are relative to this offset.
double StopPosition Processing stop position in seconds. 0: process to end of file.
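
Since results are reported relative to StartPosition, an application that needs positions in the original file has to add the offset back. The sketch below shows one way to do that, assuming the frame rate is already known (for example from GetFrameRate). AbsoluteSeconds and EffectiveStop are hypothetical helper names, not part of the component.

```cpp
// Converts a frame number reported relative to StartPosition into an
// absolute position in seconds within the original file.
// Returns -1.0 when the frame rate is not usable.
double AbsoluteSeconds(long relativeFrame, double frameRate, double startPosition)
{
    if (frameRate <= 0.0) {
        return -1.0;
    }
    return startPosition + static_cast<double>(relativeFrame) / frameRate;
}

// Resolves the documented convention that StopPosition == 0 means
// "process to end of file". durationSeconds is the media duration,
// which the caller is assumed to know from elsewhere.
double EffectiveStop(double stopPosition, double durationSeconds)
{
    return stopPosition == 0.0 ? durationSeconds : stopPosition;
}
```

For example, frame 250 of a 25 fps file processed with StartPosition set to 60.0 corresponds to second 70 of the original file.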

ThumbnailsParameters structure

Controls thumbnail generation. Pass a pointer to this structure to enable thumbnails, or NULL to disable.

int JpegFormat Output format. 0: BMP. 1: JPEG.
int JpegQuality JPEG compression quality, 0–100.
int ImagesPerScene 1: start frame only. 2: start and end frames per scene.
double Scale Thumbnail scale factor. 1.0: full size.
BSTR FileName Filename template using C printf syntax with one integer field, e.g. d:\dir\frame%05d.jpg. The frame number replaces the %d placeholder. Requires full path with correct extension.
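
To preview which file names a given FileName template produces, the same printf expansion can be reproduced in application code. This is a small sketch using the standard snprintf function. ExpandThumbnailName is a hypothetical helper, and the template is assumed to contain exactly one integer field, as the description above requires.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Expands a printf-style template with a single integer field.
// The caller is responsible for passing a template with exactly one
// %d-style placeholder; anything else is undefined behavior in printf.
std::string ExpandThumbnailName(const std::string& fileNameTemplate, int frameNumber)
{
    // First pass: ask snprintf how many characters are needed.
    const int needed = std::snprintf(nullptr, 0, fileNameTemplate.c_str(), frameNumber);
    if (needed < 0) {
        return std::string();
    }
    // Second pass: format into a buffer with room for the terminator.
    std::string result(static_cast<std::size_t>(needed) + 1, '\0');
    std::snprintf(&result[0], result.size(), fileNameTemplate.c_str(), frameNumber);
    result.resize(static_cast<std::size_t>(needed));
    return result;
}
```

With the template d:\dir\frame%05d.jpg and frame number 42, the result is d:\dir\frame00042.jpg, because %05d pads the number with zeros to five digits.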

ISceneDetector interface

Core scene detection interface.

DetectScenesInFile

HRESULT _stdcall DetectScenesInFile( [in] DetectorParameters *Params, [in] ThumbnailsParameters *ThumbnailsParams, [out] SAFEARRAY(long) *Scenes );

Main detection method. The caller does not need to create the SafeArray beforehand - pass a pointer to a SAFEARRAY variable. If *Scenes is non-NULL after the call, the caller must destroy it. The result is a two-dimensional array. For each scene, two long values give the start and stop frame numbers relative to the processing start position. Pass non-NULL ThumbnailsParams to enable thumbnail generation.

HRESULT Meaning
0 Success
0x80040601 Unspecified error
0x80040602 DirectShow not installed or version too old
0x80040603 Cannot obtain video duration - may be a still image
0x80040604 Bad argument
0x80040605 Cannot create device context
0x80040606 Object is busy with another task
0x80040607 Frame rate equals zero - may be a still image
0x80040608 Cannot build graph
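
For logging or user-facing error messages, the result codes listed above can be mapped to text. The sketch below uses a plain 32-bit integer in place of the Windows HRESULT type so that it compiles anywhere. DescribeSceneDetectorResult is a hypothetical helper name, and any code not in the table is reported as unknown.

```cpp
#include <cstdint>
#include <string>

// Maps the documented result codes to their descriptions.
std::string DescribeSceneDetectorResult(std::uint32_t hr)
{
    switch (hr) {
    case 0x00000000u: return "Success";
    case 0x80040601u: return "Unspecified error";
    case 0x80040602u: return "DirectShow not installed or version too old";
    case 0x80040603u: return "Cannot obtain video duration - may be a still image";
    case 0x80040604u: return "Bad argument";
    case 0x80040605u: return "Cannot create device context";
    case 0x80040606u: return "Object is busy with another task";
    case 0x80040607u: return "Frame rate equals zero - may be a still image";
    case 0x80040608u: return "Cannot build graph";
    default:          return "Unknown result code";
    }
}
```

In a Windows build, the value returned by DetectScenesInFile would be cast to an unsigned 32-bit integer before being passed to this function.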

ISceneDetector2 interface

Extends ISceneDetector. Inherits all its methods and adds the following.

DetectScenesInFile2

HRESULT _stdcall DetectScenesInFile2( [in] DetectorParameters2 *Params, [in] ThumbnailsParameters *ThumbnailsParams, [out] SAFEARRAY(long) *Scenes );

Same as DetectScenesInFile, but uses DetectorParameters2, enabling color space selection and partial-file processing. Returned scene start and stop values are relative to Params.StartPosition.

GetFrameRate

HRESULT _stdcall GetFrameRate( [out, retval] double *pFrameRate );

Returns the frame rate of the last processed file. Useful for converting frame numbers to timestamps. Returns S_OK, or E_INVALIDARG if pFrameRate is NULL.
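
As an illustration of the frame-to-timestamp conversion, the sketch below formats a frame number as HH:MM:SS.mmm using the frame rate obtained from GetFrameRate. FrameToTimecode is a hypothetical helper, and rounding to the nearest millisecond is a choice made here, not something the component prescribes.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Formats a zero-based frame number as HH:MM:SS.mmm.
// Returns an empty string for a negative frame or a non-positive frame rate.
std::string FrameToTimecode(long frame, double frameRate)
{
    if (frame < 0 || frameRate <= 0.0) {
        return std::string();
    }
    // Position of the frame in whole milliseconds, rounded to nearest.
    const long long totalMs = std::llround(static_cast<double>(frame) * 1000.0 / frameRate);
    const long long ms = totalMs % 1000;
    const long long totalSeconds = totalMs / 1000;
    const long long seconds = totalSeconds % 60;
    const long long minutes = (totalSeconds / 60) % 60;
    const long long hours = totalSeconds / 3600;

    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%02lld:%02lld:%02lld.%03lld",
                  hours, minutes, seconds, ms);
    return std::string(buffer);
}
```

For a 25 fps file, frame 1500 is formatted as 00:01:00.000.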

ISceneDetectorEvents interface

Callback interface for real-time detection events. Implement in your application to receive results as they are found.

NewScene

HRESULT NewScene( [in] long SceneIndex, [in] long Start, [in] long Stop );

Fired when a new scene is finalized. SceneIndex starts from 0. Start and Stop are frame numbers starting from 0. When processing was started via DetectScenesInFile2, both values are relative to Params.StartPosition.

Status

HRESULT Status( [in] long ScenesFound, [in] long CurrentFrame, [in] long TotalFrames, [out] long *AbortProcess );

Fired every 50 ms with current processing progress. Set *AbortProcess to 1 to cancel the ongoing detection.
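
The handler logic behind these two callbacks can be sketched in portable C++ as follows. This is not a COM implementation: a real event sink would implement ISceneDetectorEvents and be connected to the component, which is omitted here. DetectionListener and its members are hypothetical names, and the return value 0 stands in for S_OK.

```cpp
#include <atomic>
#include <vector>

// Hypothetical stand-in for an event sink. The method parameters follow
// the documented NewScene and Status signatures.
struct DetectionListener {
    struct Scene {
        long index;
        long start;
        long stop;
    };

    std::vector<Scene> scenes;               // scenes received so far
    std::atomic<bool> cancelRequested{false}; // set from the UI thread to cancel
    long scenesFound = 0;
    int percentDone = 0;

    // Called for each finalized scene: store it for later use.
    long NewScene(long sceneIndex, long start, long stop)
    {
        scenes.push_back(Scene{sceneIndex, start, stop});
        return 0;
    }

    // Called periodically: record progress and report the cancel request.
    long Status(long found, long currentFrame, long totalFrames, long* abortProcess)
    {
        scenesFound = found;
        if (totalFrames > 0) {
            percentDone = static_cast<int>(100.0 * currentFrame / totalFrames);
        }
        if (abortProcess != nullptr) {
            // 1 asks the component to cancel; 0 lets it continue.
            *abortProcess = cancelRequested.load() ? 1 : 0;
        }
        return 0;
    }
};
```

Keeping the cancel flag atomic lets another thread, such as a UI thread handling a Cancel button, request an abort that the next Status call passes on.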