data_juicer.ops.filter.video_resolution_filter module

class data_juicer.ops.filter.video_resolution_filter.VideoResolutionFilter(min_width: int = 1, max_width: int = 9223372036854775807, min_height: int = 1, max_height: int = 9223372036854775807, any_or_all: str = 'any', *args, **kwargs)

Bases: Filter

Keep data samples whose videos’ resolutions are within a specified range.

__init__(min_width: int = 1, max_width: int = 9223372036854775807, min_height: int = 1, max_height: int = 9223372036854775807, any_or_all: str = 'any', *args, **kwargs)

Initialization method.

Parameters:
  • min_width – The minimum horizontal resolution (width), in pixels.

  • max_width – The maximum horizontal resolution (width), in pixels.

  • min_height – The minimum vertical resolution (height), in pixels.

  • max_height – The maximum vertical resolution (height), in pixels.

  • any_or_all – Strategy for combining the results of all videos in a sample. ‘any’: keep the sample if any video meets the condition. ‘all’: keep the sample only if all videos meet the condition.

  • args – extra positional args

  • kwargs – extra keyword args
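The range check and the ‘any’/‘all’ strategy described above can be sketched in plain Python. This is a minimal illustration, not the operator's actual implementation: `keep_sample` and `INT64_MAX` are hypothetical names, the bounds are assumed to be inclusive, and keeping a sample that contains no videos is likewise an assumption.

```python
from typing import List, Tuple

# Same value as the documented default upper bound, 9223372036854775807.
INT64_MAX = 2**63 - 1


def keep_sample(
    resolutions: List[Tuple[int, int]],
    min_width: int = 1,
    max_width: int = INT64_MAX,
    min_height: int = 1,
    max_height: int = INT64_MAX,
    any_or_all: str = "any",
) -> bool:
    """Hypothetical helper: decide from (width, height) pairs whether to keep a sample."""
    if any_or_all not in ("any", "all"):
        raise ValueError(f"any_or_all must be 'any' or 'all', got {any_or_all!r}")
    # One flag per video: True when width and height both fall inside the range.
    # Assumption: both bounds are inclusive.
    in_range = [
        min_width <= w <= max_width and min_height <= h <= max_height
        for w, h in resolutions
    ]
    # Assumption: a sample without videos has nothing to reject, so it is kept.
    if not in_range:
        return True
    return any(in_range) if any_or_all == "any" else all(in_range)


videos = [(1920, 1080), (320, 240)]
keep_with_any = keep_sample(videos, min_width=640, min_height=480, any_or_all="any")
keep_with_all = keep_sample(videos, min_width=640, min_height=480, any_or_all="all")
```

With one 1920x1080 video and one 320x240 video, the ‘any’ strategy keeps the sample because the first video is in range, while ‘all’ drops it because the second is not.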

compute_stats_single(sample, context=False)

Compute stats for the sample, which are used as metrics to decide whether to filter this sample.

Parameters:
  • sample – input sample.

  • context – whether to store context information of intermediate vars in the sample temporarily.

Returns:

sample with computed stats

process_single(sample)

Sample-level decision: sample -> Boolean.

Parameters:

sample – sample to decide whether to filter

Returns:

True to keep the sample, False to filter it out
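The two methods split the work into a stats step and a decision step. The sketch below mimics that flow without using the library. Everything in it is an assumption made for illustration: the function names, the `probe` callable, and the `"videos"`, `"stats"`, `"video_width"` and `"video_height"` keys are hypothetical and may not match the operator's real sample layout. Only lower bounds are checked, to keep the example short.

```python
from typing import Callable, Dict, List, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def compute_stats_sketch(sample: Dict, probe: Callable[[str], Resolution]) -> Dict:
    """Record each video's width and height in the sample (hypothetical layout)."""
    stats = sample.setdefault("stats", {})
    # Skip probing if the stats were already computed for this sample.
    if "video_width" in stats and "video_height" in stats:
        return sample
    sizes: List[Resolution] = [probe(path) for path in sample.get("videos", [])]
    stats["video_width"] = [w for w, _ in sizes]
    stats["video_height"] = [h for _, h in sizes]
    return sample


def process_sketch(sample: Dict, min_width: int, min_height: int,
                   any_or_all: str = "any") -> bool:
    """Return True to keep the sample, False to filter it out."""
    stats = sample["stats"]
    flags = [
        w >= min_width and h >= min_height
        for w, h in zip(stats["video_width"], stats["video_height"])
    ]
    return any(flags) if any_or_all == "any" else all(flags)


# A fake probe, so the example runs without any video files.
fake_sizes = {"a.mp4": (1920, 1080), "b.mp4": (320, 240)}
sample = compute_stats_sketch({"videos": ["a.mp4", "b.mp4"]}, lambda p: fake_sizes[p])
keep_any = process_sketch(sample, 640, 480, "any")
keep_all = process_sketch(sample, 640, 480, "all")
```

Separating the two steps means the stats can be computed once and then reused with different thresholds, since the decision step only reads values already stored in the sample.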