VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing

Juan Rodriguez, Haotian Zhang, Abhay Puri, Tianyang Zhang, Rishav Pramanik, Meng Lin, Xiaoqing Xie, Marco Terral, Darsh Kaushik, Aly Shariff, Perouz Taslakian, Spandana Gella, Sai Rajeswar, David Vazquez, Christopher Pal, Marco Pedersoli

huggingface Score 9.0

Published 2026-02-22 · First seen 2026-04-01

General AI

Abstract

We introduce VectorGym, a comprehensive benchmark suite for Scalable Vector Graphics (SVG) that spans generation from text and sketches, complex editing, and visual understanding. VectorGym addresses the lack of realistic, challenging benchmarks aligned with professional design workflows. Our benchmark comprises four tasks with expert human-authored annotations: the novel Sketch2SVG task (VG-Sketch); a new SVG editing dataset (VG-Edit) featuring complex, multi-step edits with higher-order primitives; Text2SVG generation (VG-Text); and SVG captioning (VG-Cap). Unlike prior benchmarks that rely on synthetic edits, VectorGym provides gold-standard human annotations that require semantic understanding and design intent. We also propose a multi-task reinforcement learning approach that jointly optimizes across all four tasks using rendering-based rewards. Our method, built on GRPO with curriculum learning, trains a Qwen3-VL 8B model that achieves state-of-the-art performance among open-source models, surpassing much larger models including Qwen3-VL 235B and matching GPT-4o. We also introduce a VLM-as-a-Judge metric for SVG generation, validated through human correlation studies. Our evaluation of frontier VLMs reveals significant performance gaps, positioning VectorGym as a rigorous framework for advancing visual code generation. VectorGym is publicly available at huggingface.co/datasets/ServiceNow/VectorGym.
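
The abstract names GRPO with rendering-based rewards but does not spell out the reward or the update. As a rough illustration only, the sketch below shows the group-relative advantage step that GRPO is generally known for: several candidate outputs are sampled for one prompt, each is scored, and the scores are standardized within the group. The `pixel_reward` function, the tiny 2x2 rasters, and all names here are hypothetical stand-ins; the paper's actual reward, renderer, and curriculum are not described on this page.

```python
from statistics import mean, pstdev


def pixel_reward(rendered, target):
    """Hypothetical rendering-based reward: 1 minus mean squared pixel error.

    Both inputs are equal-sized grayscale rasters (lists of rows), values in [0, 1].
    This is an assumed stand-in, not the reward used in the paper.
    """
    diffs = [
        (a - b) ** 2
        for row_r, row_t in zip(rendered, target)
        for a, b in zip(row_r, row_t)
    ]
    return 1.0 - mean(diffs)


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward within its sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group: three candidate renderings for one prompt, compared to a target raster.
target = [[0.0, 1.0], [1.0, 0.0]]
candidates = [
    [[0.0, 1.0], [1.0, 0.0]],  # exact match
    [[0.0, 1.0], [1.0, 1.0]],  # one wrong pixel
    [[1.0, 0.0], [0.0, 1.0]],  # fully inverted
]

rewards = [pixel_reward(c, target) for c in candidates]
advantages = group_relative_advantages(rewards)
```

In this toy group the exact match receives a positive advantage and the inverted raster a negative one, so a policy-gradient step would push probability toward the better rendering without needing a learned value function.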

BibTeX

@misc{rodriguez2026vectorgym,
  title = {VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing},
  author = {Juan Rodriguez and Haotian Zhang and Abhay Puri and Tianyang Zhang and Rishav Pramanik and Meng Lin and Xiaoqing Xie and Marco Terral and Darsh Kaushik and Aly Shariff and Perouz Taslakian and Spandana Gella and Sai Rajeswar and David Vazquez and Christopher Pal and Marco Pedersoli},
  year = {2026},
  url = {https://huggingface.co/papers/2603.29852},
  keywords = {Scalable Vector Graphics, Sketch2SVG, SVG editing, Text2SVG, SVG captioning, multi-task reinforcement learning, GRPO, curriculum learning, VLM-as-a-Judge, visual code generation, huggingface daily},
  eprint = {2603.29852},
  archiveprefix = {arXiv},
}
