Paper Detail

OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer

Boyang Wang, Guangyi Xu, Zhipeng Tang, Jiahui Zhang, Zezhou Cheng

arXiv · Score 6.3

Published 2026-04-27 · First seen 2026-04-28

General AI

Abstract

Shot Boundary Detection (SBD) aims to automatically identify shot changes and divide a video into coherent shots. While SBD has been widely studied in the literature, existing state-of-the-art methods often produce non-interpretable boundaries on transitions, miss subtle yet harmful discontinuities, and rely on noisy, low-diversity annotations and outdated benchmarks. To alleviate these limitations, we propose OmniShotCut, which formulates SBD as structured relational prediction, jointly estimating shot ranges together with intra-shot and inter-shot relations using a shot query-based dense video Transformer. To avoid imprecise manual labeling, we adopt a fully synthetic transition synthesis pipeline that automatically reproduces major transition families with precise boundaries and parameterized variants. We also introduce OmniShotCutBench, a modern wide-domain benchmark enabling holistic and diagnostic evaluation.

Workflow Status

Review status: pending
Role: unreviewed
Read priority: soon
Vote: Not set.
Saved: no
Collections: Not filed yet.
Next action: Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.

BibTeX

@article{wang2026omnishotcut,
  title = {OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer},
  author = {Boyang Wang and Guangyi Xu and Zhipeng Tang and Jiahui Zhang and Zezhou Cheng},
  year = {2026},
  abstract = {Shot Boundary Detection (SBD) aims to automatically identify shot changes and divide a video into coherent shots. While SBD has been widely studied in the literature, existing state-of-the-art methods often produce non-interpretable boundaries on transitions, miss subtle yet harmful discontinuities, and rely on noisy, low-diversity annotations and outdated benchmarks. To alleviate these limitations, we propose OmniShotCut to formulate SBD as structured relational prediction, jointly estimating shot r},
  url = {https://arxiv.org/abs/2604.24762},
  keywords = {cs.CV, shot boundary detection, structured relational prediction, intra-shot relations, inter-shot relations, shot query-based dense video Transformer, synthetic transition synthesis, OmniShotCutBench, wide-domain benchmark, code available, huggingface daily},
  eprint = {2604.24762},
  archiveprefix = {arXiv},
}
