data_juicer.ops.deduplicator.ray_video_deduplicator module
- class data_juicer.ops.deduplicator.ray_video_deduplicator.RayVideoDeduplicator(backend: str = 'ray_actor', redis_address: str = 'redis://localhost:6379', *args, **kwargs)
Bases: RayBasicDeduplicator
Deduplicates samples at the document level by exact matching of the videos they contain: a document whose videos exactly match those of another document is treated as a duplicate.
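
To illustrate what exact matching of videos can look like, here is a minimal standalone sketch. It is not the library's implementation. The helper names `video_fingerprint` and `dedup_by_video`, the sample layout (a dict with a `"videos"` list of raw bytes), the choice of SHA-256, and the rule that samples without videos are always kept are all assumptions made for this example. The `backend` and `redis_address` parameters suggest that the real operator tracks seen hashes in a shared store; a local Python set stands in for that store here.

```python
import hashlib
from typing import Dict, Iterable, List


def video_fingerprint(video_blobs: Iterable[bytes]) -> str:
    """Return one hex digest covering all videos of a sample, in order.

    Hypothetical helper: each blob is length-prefixed so that different
    splits of the same byte stream do not produce the same digest.
    """
    hasher = hashlib.sha256()
    for blob in video_blobs:
        hasher.update(len(blob).to_bytes(8, "big"))
        hasher.update(blob)
    return hasher.hexdigest()


def dedup_by_video(samples: List[Dict]) -> List[Dict]:
    """Keep the first sample for each distinct video fingerprint."""
    seen = set()  # stand-in for a shared store of already-seen digests
    kept = []
    for sample in samples:
        videos = sample.get("videos", [])
        if not videos:
            # Assumption: samples without videos are never duplicates.
            kept.append(sample)
            continue
        digest = video_fingerprint(videos)
        if digest not in seen:
            seen.add(digest)
            kept.append(sample)
    return kept


samples = [
    {"text": "a", "videos": [b"\x00\x01video-bytes"]},
    {"text": "b", "videos": [b"\x00\x01video-bytes"]},  # same bytes as "a"
    {"text": "c", "videos": [b"different-bytes"]},
]
unique = dedup_by_video(samples)
print([s["text"] for s in unique])  # ['a', 'c']
```

Because the comparison is on a hash of the raw bytes, two videos that look identical but differ in encoding or container metadata would not be treated as duplicates in this sketch.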