Solving Physics Olympiad via Reinforcement Learning on Physics Simulators

Mihir Prabhudesai, Aryan Satpathy, Yangmin Li, Zheyang Qin, Nikash Bhardwaj, Amir Zadeh, Chuan Li, Katerina Fragkiadaki, Deepak Pathak

arxiv Score 17.3

Published 2026-04-13 · First seen 2026-04-14

General AI

Abstract

We have witnessed remarkable advances in LLM reasoning capabilities with the advent of DeepSeek-R1. However, much of this progress has been fueled by the abundance of internet question-answer (QA) pairs, a major bottleneck going forward, since such data is limited in scale and concentrated mainly in domains like mathematics. In contrast, other sciences such as physics lack large-scale QA datasets to effectively train reasoning-capable models. In this work, we show that physics simulators can serve as a powerful alternative source of supervision for training LLMs for physical reasoning. We generate random scenes in physics engines, create synthetic question-answer pairs from simulated interactions, and train LLMs using reinforcement learning on this synthetic data. Our models exhibit zero-shot sim-to-real transfer to real-world physics benchmarks: for example, training solely on synthetic simulated data improves performance on IPhO (International Physics Olympiad) problems by 5-10 percentage points across model sizes. These results demonstrate that physics simulators can act as scalable data generators, enabling LLMs to acquire deep physical reasoning skills beyond the limitations of internet-scale QA data. Code available at: https://sim2reason.github.io/.
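The pipeline the abstract describes (random scene, simulated interaction, synthetic question-answer pair, reward for reinforcement learning) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the projectile scene, the hand-written integrator standing in for a physics engine, the function names `simulate_projectile`, `make_qa_pair`, and `reward`, and the 2% tolerance.

```python
import math
import random

G = 9.81  # gravitational acceleration in m/s^2 (assumed constant for this toy scene)


def simulate_projectile(speed, angle_deg, dt=1e-4):
    """Step a point mass launched from flat ground until it lands.

    Hypothetical stand-in for a physics engine: semi-implicit Euler steps,
    with linear interpolation inside the final step to locate the landing.
    Returns (flight_time_s, horizontal_range_m).
    """
    angle = math.radians(angle_deg)
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    x = y = t = 0.0
    while True:
        vy_next = vy - G * dt
        y_next = y + vy_next * dt
        if y_next < 0.0:
            frac = y / (y - y_next)  # fraction of the step spent above ground
            return t + frac * dt, x + frac * vx * dt
        x, y, vy, t = x + vx * dt, y_next, vy_next, t + dt


def make_qa_pair(rng):
    """Sample a random scene and turn its simulated outcome into a QA pair."""
    speed = round(rng.uniform(5.0, 30.0), 1)
    angle = round(rng.uniform(20.0, 70.0), 1)
    _, distance = simulate_projectile(speed, angle)
    question = (
        f"A ball is launched from level ground at {speed} m/s, {angle} degrees "
        f"above the horizontal. Ignoring air resistance and taking g = {G} m/s^2, "
        "how far from the launch point does it land, in metres?"
    )
    return {"question": question, "answer": round(distance, 2)}


def reward(predicted_text, answer, rel_tol=0.02):
    """Binary reward: 1.0 if the predicted number is within rel_tol of the answer."""
    try:
        predicted = float(predicted_text.strip())
    except ValueError:
        return 0.0  # unparseable output earns no reward
    return 1.0 if abs(predicted - answer) <= rel_tol * abs(answer) else 0.0


if __name__ == "__main__":
    qa = make_qa_pair(random.Random(0))
    print(qa["question"])
    print("reference answer:", qa["answer"])
```

In a setup like this, the simulator rather than a human-written answer key supplies the reference value, so new question-answer pairs can be sampled without limit; a reinforcement learning loop would score each model response with a function such as `reward`.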

BibTeX

@article{prabhudesai2026solving,
  title = {Solving Physics Olympiad via Reinforcement Learning on Physics Simulators},
  author = {Mihir Prabhudesai and Aryan Satpathy and Yangmin Li and Zheyang Qin and Nikash Bhardwaj and Amir Zadeh and Chuan Li and Katerina Fragkiadaki and Deepak Pathak},
  year = {2026},
  abstract = {We have witnessed remarkable advances in LLM reasoning capabilities with the advent of DeepSeek-R1. However, much of this progress has been fueled by the abundance of internet question-answer (QA) pairs, a major bottleneck going forward, since such data is limited in scale and concentrated mainly in domains like mathematics. In contrast, other sciences such as physics lack large-scale QA datasets to effectively train reasoning-capable models. In this work, we show that physics simulators can ser},
  url = {https://arxiv.org/abs/2604.11805},
  keywords = {cs.LG, cs.AI, cs.CV, cs.RO, large language models, physics simulators, synthetic data, reinforcement learning, zero-shot transfer, IPhO, code available, huggingface daily},
  eprint = {2604.11805},
  archiveprefix = {arXiv},
}