Two-Phase Command Architecture #407

Open
Breakthrough opened this issue Jul 22, 2024 · 1 comment
Breakthrough commented Jul 22, 2024

Problem/Use Case

With the addition of more detectors and filters, it would be ideal to improve algorithm reuse and interoperability. As identified in #402, it should be possible to remove AdaptiveDetector and the flash suppression filter options by allowing users to specify two commands when detecting scenes: a scoring phase (how the difference between consecutive frames is calculated, i.e. how "different" they are), and a trigger phase (how we decide from that score whether the next frame starts a new scene).

Solutions

Add a new type of command, prefixed with filter-, which can be used as follows. First, the equivalent of today's default behaviour with detect-content becomes:

 scenedetect -i video.mp4 detect-content filter-flash

detect-adaptive will also be replaced by a filter called filter-adapt, which must be combined with another fast-cut detector. The default for that becomes:

 scenedetect -i video.mp4 detect-content filter-adapt  

Proposed Implementation:

Remove:

  • detect-adaptive command
  • --filter-mode option from detect-content

Add:

  • filter-adapt command to perform adaptive filtering on whatever fast cut detector is specified (e.g. should work with both detect-content and detect-histogram)
  • filter-flash command to perform --filter-mode=suppress with whatever fast cut detector is specified

Default values for filters might need to be tuned depending on what detector is being used, but this is a tractable problem.
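
Since the filters would be decoupled from any particular detector, the same filter commands should also combine with other fast cut detectors; for example, a hypothetical invocation with detect-histogram might look like:

 scenedetect -i video.mp4 detect-histogram filter-adapt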

Open Questions

What API changes are required to support this?

Right now, detectors provide the locations of cuts rather than scores directly, which makes filtering more difficult. In v0.6.4 a new filter type was added that can be integrated with detectors individually; this does not scale, but it can be used to ship something for the CLI earlier while working out how the API should reflect this change.

Today, detectors produce the frame numbers where cuts are found. Instead, they should produce a type (fast cut, fade) and a score from 0.0 to 1.0 for each frame, indicating the confidence that the given frame is a cut. Filters could then operate on that result.
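
As a rough illustration of the shape such per-frame output could take (the names below are hypothetical, not an existing API):

from dataclasses import dataclass
from enum import Enum, auto

class CutType(Enum):
    FAST_CUT = auto()
    FADE = auto()

@dataclass
class FrameScore:
    frame_num: int
    cut_type: CutType
    score: float  # Confidence from 0.0 to 1.0 that this frame is a cut.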

TODO: Make API examples.


Breakthrough commented Sep 16, 2024

Perhaps this should be an API-only change that does not affect the CLI.

API Sketch

Ideally we could have a concept of data sources (detectors) and filters (roughly what SceneManager accomplishes today). The result of applying filters would be a set of events. I'll try to get a PR up for this eventually that demonstrates it better; in the meantime, a rough sketch:

from enum import Enum, auto
from scenedetect import FrameTimecode
import typing as ty

import numpy as np


class Source:
    pass

##
## Sources
##

class Similarity(Source):

    # Similarity of current frame from previous. Normalized between 0.0 and 1.0.
    @property
    def amount(self) -> float:
        pass

    # Confidence of measurement.
    @property
    def confidence(self) -> ty.Optional[float]:
        return None

class Foreground(Source):

    # Map of foreground and background pixels in source image.
    #
    # Should be usable as a mask by setting foreground to 255 and background to 0.
    @property
    def mask(self) -> np.ndarray:
        pass

class Brightness(Source):
    # Estimated brightness for the frame normalized from 0.0 to 1.0.
    @property
    def amount(self) -> float:
        pass


##
## Events
##

class EventType(Enum):
    MOTION_START = auto()
    MOTION_END = auto()
    FADE_IN = auto()
    FADE_OUT = auto()
    CUT = auto()
    DISSOLVE = auto()


class Event:

    @property
    def type(self) -> EventType:
        pass

    @property
    def timecode(self) -> FrameTimecode:
        pass

##
## Filters
##


class Filter:
    pass


class Motion(Filter):
    def filter(self, fg: Foreground) -> Event:
        pass

    def post_process(self) -> ty.Iterable[Event]:
        pass


class Cuts(Filter):
    def filter(self, similarity: Similarity) -> Event:
        pass

    def post_process(self) -> ty.Iterable[Event]:
        pass


class Fades(Filter):
    def filter(self, brightness: Brightness) -> Event:
        pass

    def post_process(self) -> ty.Iterable[Event]:
        pass
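

# Hypothetical addition, not part of the original sketch: a filter
# corresponding to the proposed filter-flash command, which would suppress
# CUT events that occur within a minimum scene length (as
# --filter-mode=suppress does today for detect-content).
class FlashSuppression(Filter):
    def __init__(self, min_scene_len: int = 15):
        # Hypothetical parameter: minimum number of frames between cuts.
        self._min_scene_len = min_scene_len

    def filter(self, similarity: Similarity) -> Event:
        pass

    def post_process(self) -> ty.Iterable[Event]:
        pass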


##
## Workflow Result
##

class Result:

    @property
    def events(self) -> ty.Iterable[Event]:
        pass

    def to_scenes(self) -> ty.Iterable[FrameTimecode]:
        pass


##
## Dispatcher
##

class Dispatcher:

    def __init__(self, pipelines: ty.Iterable[ty.Tuple[Source, Filter]]):
        self._pipelines = pipelines

    def run(self, video) -> Result:
        events = []
        for frame in video:
            for (source, filter) in self._pipelines:
                # TODO: Feed `frame` into the source here; the filter then
                # decides whether the updated source produces an event.
                event = filter.filter(source)
                if event is not None:
                    events.append(event)
        # TODO: Construct a Result from the accumulated events.
        return Result()


###
### Stubs
###

class HSL(Similarity):
    # Similarity source based on differences in the HSL colour space,
    # similar to what detect-content computes today.
    pass




##
## Usage
##

from scenedetect import VideoStream, open_video, split_video_ffmpeg

video = open_video("test.mp4")

dispatcher = Dispatcher([(HSL(), Cuts())])
result = dispatcher.run(video)

# Helper functions for commonly used combinations:

def detect_shot_boundaries(
    video: VideoStream,
    methods: ty.Iterable[ty.Tuple[Source, Filter]],
    ...)
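
# Hypothetical usage of the sketch above (these names come from the sketch,
# not the current API): iterate the Result and print each cut that was found.
for event in result.events:
    if event.type == EventType.CUT:
        print(f"Cut at {event.timecode.get_timecode()}")

# The boundaries from to_scenes() could then be paired into (start, end)
# ranges for existing helpers such as split_video_ffmpeg.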
