LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing

Xinyu Wang, Chongbo Zhao, Fangneng Zhan, Yue Ma

huggingface Score 10.4

Published 2026-06-25 · First seen 2026-06-30

General AI

Abstract

Streaming video editing has made rapid progress, yet practical deployment is still limited by two core issues: maintaining stable backgrounds and non-edited regions over time, and achieving the low latency required for real-time interactive scenarios. Meanwhile, recent streaming video generation methods are mostly developed for synthesis and cannot be directly applied to editing due to the strict preservation requirement and region-specific control. In this work, we present a novel streaming video editing framework that performs causal, frame-by-frame editing with strong content preservation and real-time responsiveness. Our key design is a three-stage distillation pipeline that progressively transfers editing capability from a powerful bidirectional foundation model to an efficient unidirectional streaming editor, enabling stable long-horizon edits without sacrificing visual fidelity. To further support real-time deployment, we introduce an AR-oriented mask cache that reuses region-related computation across frames, substantially reducing redundant processing and accelerating inference. Finally, we establish a dedicated benchmark for streaming video editing. Extensive evaluations demonstrate that our method achieves state-of-the-art visual quality among streaming baselines while drastically boosting inference speed to 12.66 FPS, making it suitable for interactive and augmented reality applications.
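
The abstract names a mask cache that reuses region-related computation across frames but does not describe how it is built. The sketch below is only an illustration of that general idea and is not the authors' implementation: it assumes a fixed binary edit mask per frame, memoizes the list of edited pixel coordinates, and applies a placeholder "edit" (overwriting with a constant) one frame at a time. The names `MaskCache`, `edited_coords`, `edit_frame`, and `edit_stream` are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical types: a frame is a 2D grid of ints, a mask is an immutable
# 2D grid where 1 marks a pixel inside the edited region.
Frame = List[List[int]]
Mask = Tuple[Tuple[int, ...], ...]
Coords = List[Tuple[int, int]]


class MaskCache:
    """Illustrative cache: mask-derived data is recomputed only for unseen masks."""

    def __init__(self, compute: Callable[[Mask], Coords]) -> None:
        self._compute = compute
        self._store: Dict[Mask, Coords] = {}
        self.misses = 0  # number of times the mask-derived data was recomputed

    def get(self, mask: Mask) -> Coords:
        if mask not in self._store:
            self.misses += 1
            self._store[mask] = self._compute(mask)
        return self._store[mask]


def edited_coords(mask: Mask) -> Coords:
    """Stand-in for region-related computation: list the (row, col) of masked pixels."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def edit_frame(frame: Frame, coords: Coords, value: int) -> Frame:
    """Placeholder edit: overwrite masked pixels, copy everything else unchanged."""
    out = [list(row) for row in frame]
    for r, c in coords:
        out[r][c] = value
    return out


def edit_stream(
    frames: Iterable[Frame], masks: Iterable[Mask], value: int, cache: MaskCache
) -> Iterator[Frame]:
    """Causal loop: each output frame depends only on the current frame and the cache."""
    for frame, mask in zip(frames, masks):
        yield edit_frame(frame, cache.get(mask), value)


if __name__ == "__main__":
    frames = [[[0, 0], [0, 0]] for _ in range(3)]
    mask: Mask = ((1, 0), (0, 0))
    cache = MaskCache(edited_coords)
    edited = list(edit_stream(frames, [mask] * 3, 9, cache))
    print(edited[0], "recomputations:", cache.misses)
```

With an unchanged mask, the region computation runs once and is reused for every later frame, while pixels outside the mask are copied through untouched. For scale, the reported 12.66 FPS corresponds to a budget of roughly 79 ms per frame (1000 / 12.66).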

BibTeX

@misc{wang2026liveedit,
  title = {LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing},
  author = {Xinyu Wang and Chongbo Zhao and Fangneng Zhan and Yue Ma},
  year = {2026},
  url = {https://huggingface.co/papers/2606.26740},
  keywords = {streaming video editing, causal editing, frame-by-frame editing, content preservation, real-time responsiveness, three-stage distillation pipeline, bidirectional foundation model, unidirectional streaming editor, long-horizon edits, AR-oriented mask cache, inference speed, interactive applications, augmented reality, code available, huggingface daily},
  eprint = {2606.26740},
  archiveprefix = {arXiv},
}
