Scene Detector – Video Scene Detection COM Object


Scene Detector is a COM component for scene recognition and scene processing. It incorporates the award-winning HandySaw DS technologies for accurate and fast optical scene recognition. It is aimed at software companies and developers who want to extend their products. A COM component is a reusable software module that follows the Component Object Model (COM) architecture, so it can be used from various programming languages and environments. If you are looking for a standalone application for scene detection, use HandySaw DS instead.

The Scene Detector object processes a video file and returns a list of the scenes it finds. It can also save thumbnails for each scene and notify the calling application about each newly found scene through callback interfaces while processing is still running.


Main features

  • Detects scenes in almost any video file that can be played by Windows Media Player (because it uses the DirectShow API)
  • Fast and accurate algorithm
  • Easy integration
  • Reports each new scene as soon as it is detected, during processing, so there is no need to wait for detection to finish
  • Can generate thumbnails (images of the first and last frames of each scene) and save them as BMP or JPG files
  • Lets you specify the source, splitter, and video decoder filters to use during processing
  • Can detect "white" and "black" scenes
  • Can work in RGB and YUV color spaces
  • Can process part of a media file: the user may specify the positions at which processing starts and stops


Demo version and price

Contact me.




sdScenesMergeKind enum

Specifies what to do with short scenes.

sdMergeDelete = 0, The short scene is removed
sdMergeWithPrev = 1, The short scene is merged with the previous scene
sdMergeWithNext = 2, The short scene is merged with the next scene
sdMergeWithBoth = 3 The short (middle), previous, and next scenes are merged into a single scene
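
To make the four merge kinds concrete, the following Python sketch applies each one to a list of (start, stop) frame pairs. It is only one plausible reading of the descriptions above, not the component's actual logic. The names MergeKind and merge_short_scene, the inclusive frame ranges, and the choice to leave the list unchanged when a required neighbour is missing are all assumptions made for illustration.

```python
from enum import IntEnum


class MergeKind(IntEnum):
    # Values mirror the sdScenesMergeKind enum; the Python names are hypothetical.
    DELETE = 0
    WITH_PREV = 1
    WITH_NEXT = 2
    WITH_BOTH = 3


def merge_short_scene(scenes, i, kind):
    """Return a new scene list with the short scene at index i handled per kind.

    scenes is a list of (start, stop) frame pairs (assumed inclusive).
    If the neighbour needed for a merge does not exist, the list is
    returned unchanged (an assumption, not documented behaviour).
    """
    scenes = list(scenes)
    start, stop = scenes[i]
    has_prev = i > 0
    has_next = i + 1 < len(scenes)
    if kind == MergeKind.DELETE:
        del scenes[i]
    elif kind == MergeKind.WITH_PREV and has_prev:
        scenes[i - 1:i + 1] = [(scenes[i - 1][0], stop)]
    elif kind == MergeKind.WITH_NEXT and has_next:
        scenes[i:i + 2] = [(start, scenes[i + 1][1])]
    elif kind == MergeKind.WITH_BOTH and has_prev and has_next:
        scenes[i - 1:i + 2] = [(scenes[i - 1][0], scenes[i + 1][1])]
    return scenes


example = [(0, 99), (100, 102), (103, 200)]  # the middle scene is the short one
for kind in MergeKind:
    print(kind.name, merge_short_scene(example, 1, kind))
```

With sdMergeWithBoth, the three example scenes collapse into a single scene covering frames 0 to 200.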


DetectorParameters structure

This structure contains scene detection engine parameters.

BSTR VideoFileName; Full file name (including path) of the video to process.
int Threshold; Detection threshold value. This is the main parameter.
Recommended defaults are 19 for the RGB color space and 5 for YUV.
Can be from 0 to 255.
When the difference between two frames is larger than this value, a new scene begins.
Thus, a lower Threshold yields more scenes.
int UpLumaTresh; Used in "white fadeouts" detection.
Recommended default -1.
Can be from -1 to 255.
"-1" disables this feature.
When the overall frame brightness is larger than this value, a new scene begins.
With this feature, a sequence of such bright frames can be defined as a separate scene.
int BottomLumaTresh; Used in "black fadeouts" detection.
Recommended default -1.
Can be from -1 to 255.
"-1" disables this feature.
When the overall frame brightness is less than this value, a new scene begins.
With this feature, a sequence of such dark frames can be defined as a separate scene.
int MinSceneLength; Minimum duration of a detected scene, in frames.
Recommended default 5.
Can be 0 or any positive value.
If a detected scene is shorter than this value, it is removed or merged with the previous scene, the next scene, or both, according to the value of the Merge field (see below).
int RegisterGraph; Controls registration of the internal processing graph in the Running Object Table (ROT).
Can be 1 (register in the ROT) or 0 (do not register).
BSTR SourceFilterMoniker; Display name of the moniker of the source filter to use in the internal processing graph.
If NULL, the default filter is used.
BSTR SplitterFilterMoniker; Display name of the moniker of the splitter filter to use in the internal processing graph.
If NULL, the default filter is used.
BSTR DecoderFilterMoniker; Display name of the moniker of the decoder filter to use in the internal processing graph.
If NULL, the default filter is used.
sdScenesMergeKind Merge; When a scene is shorter than MinSceneLength, it is modified according to the value of this parameter.
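
The role of Threshold can be illustrated with a toy model in Python. The real frame-difference metric used by the component is not documented here, so the sketch below simply assumes the mean absolute difference between corresponding sample values of consecutive frames. The functions mean_abs_diff and split_scenes are hypothetical, and the luma thresholds and MinSceneLength are deliberately ignored.

```python
def mean_abs_diff(frame_a, frame_b):
    """Assumed difference metric: mean absolute difference of sample values."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def split_scenes(frames, threshold):
    """Split frames into (start, stop) scenes wherever the difference
    between two consecutive frames exceeds threshold."""
    if not frames:
        return []
    scenes = []
    start = 0
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            scenes.append((start, i - 1))  # close the current scene
            start = i                      # frame i opens a new one
    scenes.append((start, len(frames) - 1))
    return scenes


# Four tiny two-sample "frames": a small change, a hard cut, a small change.
frames = [[10, 10], [12, 11], [200, 210], [201, 209]]
print(split_scenes(frames, 19))  # only the hard cut exceeds the threshold
print(split_scenes(frames, 1))   # a lower threshold also splits on the small change
```

In this toy data, lowering the threshold from 19 to 1 raises the scene count from two to three, which matches the rule that a lower Threshold yields more scenes.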


DetectorParameters2 structure

This structure extends the set of scene detection engine parameters defined in the DetectorParameters structure. In addition to the members of DetectorParameters, it contains:

int UseYUV; Selects the color space used for scene detection.
Can be 1 (YUY2) or 0 (RGB24).
int Pad; Reserved.
double StartPosition; Media position, in seconds, at which scene detection starts.
If this value is nonzero, remember that all scene times are relative to this position.
double StopPosition; Media position, in seconds, at which scene detection stops.
Set to zero to ignore.


ThumbnailsParameters structure

This structure contains thumbnails generation parameters.

int JpegFormat; Selects the file format. 0 - generate BMP files, 1 - generate JPEG files.
int JpegQuality; JPEG compression quality. Integer from 0 to 100.
int ImagesPerScene; Number of images per scene. 1 - only the start frame, 2 - the start and end frames of each scene.
double Scale; Thumbnail scaling factor. Floating point value. 1 - full-size picture.
BSTR FileName; Template for thumbnail file names: a format string for the C printf function with one integer field.
For example: "d:\filedir\filename%05d.jpg"
You must specify a full path name with the proper extension. The frame number is substituted for the %d field to obtain the file name of a specific thumbnail.
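
As a quick check of how such a template expands, the Python sketch below formats the example template with a frame number. Python's % operator follows the printf conventions for an integer field like %05d, so it serves as a stand-in for the C call. The helper name thumbnail_name is hypothetical, and the frame number 42 is arbitrary.

```python
def thumbnail_name(template, frame_number):
    """Expand a printf-style template containing one integer field."""
    return template % frame_number


# Raw string so the backslashes in the Windows path are kept literally.
template = r"d:\filedir\filename%05d.jpg"
print(thumbnail_name(template, 42))  # d:\filedir\filename00042.jpg
```

The 05 in the field pads the frame number with zeros to five digits, which keeps the generated files in frame order when sorted by name.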


ISceneDetector interface

This interface provides methods for scene detection:

DetectScenesInFile method

HRESULT _stdcall DetectScenesInFile([in] DetectorParameters *Params, [in] ThumbnailsParameters *ThumbnailsParams, [out] SAFEARRAY(long) *Scenes );

This is the main scene detection method.
Pass the detection parameters in Params and a pointer to a SAFEARRAY in Scenes.
If Scenes is NULL, the method does not return the list of detected scenes.
The caller does not need to create the SafeArray before calling this method.
If Scenes is not NULL and *Scenes is also not NULL after the call, the caller must destroy the *Scenes SafeArray.
*Scenes is a two-dimensional array. For each scene found it holds two long values: the start and stop frame numbers of the scene. These values are relative to the media position at which processing started.
Pass a non-NULL pointer in ThumbnailsParams to turn on thumbnail generation and specify its parameters. Pass NULL to disable thumbnail generation.

Returns an HRESULT value. Possible values include the following:

0 Success
0x80040601 Unspecified error
0x80040602 DirectShow is not installed or version is too old
0x80040603 Cannot obtain video duration. The file may be a still image
0x80040604 Bad argument
0x80040605 Cannot create device context
0x80040606 Object is busy with another task
0x80040607 Frame rate equals zero. The file may be a still image
0x80040608 Cannot build graph
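
For client code that needs readable diagnostics, the table above can be turned into a lookup. The Python sketch below is one way to do it. The names SCENE_DETECTOR_ERRORS and describe_hresult are hypothetical, and the masking step assumes the HRESULT may arrive as a signed 32-bit integer, as some COM wrappers report it.

```python
# Codes and messages copied from the table above.
SCENE_DETECTOR_ERRORS = {
    0x00000000: "Success",
    0x80040601: "Unspecified error",
    0x80040602: "DirectShow is not installed or version is too old",
    0x80040603: "Cannot obtain video duration. May be still image",
    0x80040604: "Bad argument",
    0x80040605: "Cannot create device context",
    0x80040606: "Object is busy with another task",
    0x80040607: "Framerate equals zero. May be still image",
    0x80040608: "Cannot build graph",
}


def describe_hresult(hr):
    """Map an HRESULT (signed or unsigned 32-bit) to a message."""
    hr &= 0xFFFFFFFF  # normalise negative values to their unsigned form
    return SCENE_DETECTOR_ERRORS.get(hr, "Unknown HRESULT 0x%08X" % hr)


print(describe_hresult(0x80040606))
print(describe_hresult(0x80040606 - 2**32))  # same code as a signed value
```

Codes outside the table, such as standard COM errors, fall through to the generic "Unknown HRESULT" message rather than raising an exception.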


ISceneDetector2 interface

This interface derives from the ISceneDetector interface and provides extended methods for scene detection. In addition to the methods inherited from ISceneDetector, the ISceneDetector2 interface exposes the following methods:

DetectScenesInFile2 method

HRESULT _stdcall DetectScenesInFile2([in] DetectorParameters2 *Params, [in] ThumbnailsParameters *ThumbnailsParams, [out] SAFEARRAY(long) *Scenes );

Unlike DetectScenesInFile, this method takes the DetectorParameters2 set of parameters, which makes it possible to specify the color space and the start and stop processing positions.
Note that the scene start and stop values returned by this method are relative to the Params.StartPosition value.
See the DetectScenesInFile description for other details.

GetFrameRate method

HRESULT _stdcall GetFrameRate([out, retval] double *pFrameRate );

Returns the video frame rate of the last processed file. Useful for translating between times and frame numbers.
Returns S_OK on success, or E_INVALIDARG if pFrameRate is NULL.
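
The translation between frame numbers and times might look like the Python sketch below. It assumes a constant frame rate and that relative frame 0 corresponds exactly to StartPosition; neither assumption is confirmed by this documentation. The helpers frame_to_seconds and seconds_to_frame are hypothetical names.

```python
def frame_to_seconds(frame, frame_rate, start_position=0.0):
    """Convert a frame number relative to start_position into an
    absolute media time in seconds."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return start_position + frame / frame_rate


def seconds_to_frame(seconds, frame_rate, start_position=0.0):
    """Convert an absolute media time in seconds into the nearest frame
    number relative to start_position."""
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return int(round((seconds - start_position) * frame_rate))


# Example: 25 fps video, detection started 10 seconds into the file.
print(frame_to_seconds(250, 25.0, start_position=10.0))   # 20.0
print(seconds_to_frame(20.0, 25.0, start_position=10.0))  # 250
```

With the default start_position of 0, the same helpers apply to results from DetectScenesInFile, where processing starts at the beginning of the file.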


ISceneDetectorEvents events interface

This events interface provides methods for receiving scene detection information in real time.

NewScene method

HRESULT NewScene([in] long SceneIndex, [in] long Start, [in] long Stop );

Event fired by the detector when a new scene is defined.
SceneIndex starts from 0.
Start and Stop are frame numbers and also start from 0.
When processing is started via the DetectScenesInFile2 method, the Start and Stop values are relative to the Params.StartPosition value.

Status method

HRESULT Status([in] long ScenesFound, [in] long CurrentFrame, [in] long TotalFrames, [out] long *AbortProcess );

Event fired every 50 ms to inform the client about the processing status.
AbortProcess is a pointer to a variable, so the client can set *AbortProcess to 1 to cancel the current scene detection.
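
The interplay of the two events can be sketched without COM at all. In the Python simulation below, ProgressSink stands in for a client event sink and simulate_detection stands in for the detector; both are hypothetical. The return value of status plays the role of *AbortProcess, and calling status once per scene (rather than every 50 ms) is a simplification made for the sketch.

```python
class ProgressSink:
    """Hypothetical client-side sink mirroring the NewScene and Status events."""

    def __init__(self, max_scenes):
        self.max_scenes = max_scenes
        self.scenes = []

    def new_scene(self, scene_index, start, stop):
        # Record each scene as soon as it is reported.
        self.scenes.append((scene_index, start, stop))

    def status(self, scenes_found, current_frame, total_frames):
        # Stand-in for *AbortProcess: return 1 to cancel, 0 to continue.
        return 1 if scenes_found >= self.max_scenes else 0


def simulate_detection(sink, scenes, total_frames):
    """Stand-in for the detector: fire events until the sink asks to abort.

    Returns True if all scenes were reported, False if aborted.
    """
    for index, (start, stop) in enumerate(scenes):
        sink.new_scene(index, start, stop)
        if sink.status(index + 1, stop, total_frames):
            return False
    return True


sink = ProgressSink(max_scenes=2)
completed = simulate_detection(sink, [(0, 9), (10, 19), (20, 29)], 30)
print(completed, sink.scenes)
```

Here the sink cancels after two scenes, so the third scene is never reported and simulate_detection returns False.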