Paper Detail

Seeing Fast and Slow: Learning the Flow of Time in Videos

Yen-Siang Wu, Rundong Luo, Jingsen Zhu, Tao Tu, Ali Farhadi, Matthew Wallingford, Yu-Chiang Frank Wang, Steve Marschner, Wei-Chiu Ma

arxiv Score 10.3

Published 2026-04-23 · First seen 2026-04-24

General AI

Abstract

How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time. In this paper, we study time as a learnable visual concept and develop models for reasoning about and manipulating the flow of time in videos. We first exploit the multimodal cues and temporal structure naturally present in videos to learn, in a self-supervised manner, to detect speed changes and estimate playback speed. We then show that these learned temporal reasoning models enable us to curate the largest slow-motion video dataset to date from noisy in-the-wild sources. Such slow-motion footage, typically filmed by high-speed cameras, contains substantially richer temporal detail than standard videos. Using this data, we further develop models capable of temporal control, including speed-conditioned video generation, which produces motion at a specified playback speed, and temporal super-resolution, which transforms low-FPS, blurry videos into high-FPS sequences with fine-grained temporal details. Our findings highlight time as a manipulable, perceptual dimension in video learning, opening doors to temporally controllable video generation, temporal forensics detection, and potentially richer world-models that understand how events unfold over time.
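
The self-supervised setup mentioned in the abstract can be pictured with a generic pretext task: resample a video at a known rate and use that rate as a free label. The sketch below is only an illustration of that general idea, not the authors' pipeline. The function names, the speed set, and the nearest-frame resampling are all assumptions, and the multimodal cues the abstract refers to are not modeled.

```python
import numpy as np

def resample_clip(frames, speed, clip_len):
    """Pick clip_len frames, advancing `speed` source frames per output frame.

    speed > 1 skips frames (sped up); speed < 1 repeats frames (a crude
    stand-in for slow motion, with no new temporal detail).
    """
    idx = np.floor(np.arange(clip_len) * speed).astype(int)
    if idx[-1] >= len(frames):
        raise ValueError("source video too short for this speed and clip length")
    return frames[idx]

def make_training_pair(frames, speeds, clip_len, rng):
    """Sample a speed class at random and return (clip, class index)."""
    label = int(rng.integers(len(speeds)))
    clip = resample_clip(frames, speeds[label], clip_len)
    return clip, label

# Hypothetical usage: a dummy "video" whose pixel value equals its frame index.
frames = np.arange(100).reshape(100, 1, 1)
rng = np.random.default_rng(0)
clip, label = make_training_pair(frames, [0.5, 1.0, 2.0, 4.0], 8, rng)
```

A classifier trained on such pairs would only ever see synthetic speed changes. Repeating frames for `speed < 1` adds no new information, which is consistent with the abstract's point that real slow-motion footage from high-speed cameras carries richer temporal detail.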

Workflow Status

Review status
pending
Role
unreviewed
Read priority
now
Vote
Not set.
Saved
no
Collections
Not filed yet.
Next action
Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.
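
One way to add those fields is to rewrite the matching line of `papers.jsonl`. The sketch below is a guess at how that could look: only the four field names come from the message above, while the `arxiv_id` lookup key, the value types, and the note text are placeholders, since the actual record schema is not shown here.

```python
import json

def enrich_record(jsonl_text, match_key, match_value, notes):
    """Return jsonl_text with `notes` merged into every record whose
    match_key equals match_value; other records pass through unchanged."""
    out = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if record.get(match_key) == match_value:
            record.update(notes)
        out.append(json.dumps(record, ensure_ascii=False))
    return "".join(entry + "\n" for entry in out)

# Placeholder notes; the real tool may expect different value types.
notes = {
    "summary_sections": ["speed estimation", "slow-motion dataset", "temporal control"],
    "why_relevant": "placeholder: reason this paper matters to the reader",
    "claim_impact": "placeholder: which claims would change current work",
    "next_action": "placeholder: read the method section",
}

# "arxiv_id" is a hypothetical key name used only for this illustration.
sample = '{"arxiv_id": "2604.21931"}\n{"arxiv_id": "0000.00000"}\n'
updated = enrich_record(sample, "arxiv_id", "2604.21931", notes)
```

Working on the file's text rather than the file itself keeps the function easy to check before anything on disk is overwritten.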

Why It Surfaced

No ranking explanation is available yet.

Tags

No tags.

BibTeX

@article{wu2026seeing,
  title = {Seeing Fast and Slow: Learning the Flow of Time in Videos},
  author = {Yen-Siang Wu and Rundong Luo and Jingsen Zhu and Tao Tu and Ali Farhadi and Matthew Wallingford and Yu-Chiang Frank Wang and Steve Marschner and Wei-Chiu Ma},
  year = {2026},
  abstract = {How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time. In this paper, we study time as a learnable visual concept and develop models for reasoning about and manipulating the flow of time in videos. We first exploit the multimodal cues and temporal structure naturally present in videos to l},
  url = {https://arxiv.org/abs/2604.21931},
  keywords = {cs.CV, cs.AI, cs.GR, temporal reasoning, self-supervised learning, speed detection, playback speed estimation, temporal control, video generation, temporal super-resolution, slow-motion video dataset, high-speed cameras, temporal forensics, huggingface daily, Computer science, Artificial intelligence, Exploit, Computer vision, Perception},
  eprint = {2604.21931},
  archiveprefix = {arXiv},
}

Metadata

{}