Paper Detail

SIGMA-ASL: Sensor-Integrated Multimodal Dataset for Sign Language Recognition

Xiaofang Xiao, Guangchao Li, Guangrong Zhao, Qi Lin, Wen Ma, Hongkai Wen, Yanxiang Wang, Yiran Shen

arXiv Score 8.8

Published 2026-05-07 · First seen 2026-05-09

General AI

Abstract

Automatic sign language recognition (SLR) has become a key enabler of inclusive human-computer interaction, fostering seamless communication between deaf individuals and hearing communities. Despite significant advances in multimodal learning, existing SLR research remains dominated by vision-based datasets, which are limited by sensitivity to lighting and occlusion, privacy concerns, and a lack of cross-modal diversity. To address these challenges, we introduce SIGMA-ASL, a large-scale multimodal dataset for SLR. The dataset integrates an Azure Kinect RGB-D camera, a millimeter-wave (mmWave) radar, and two wrist-worn inertial measurement units (IMUs) to capture complementary visual, radio-reflection, and kinematic information. Collected in a controlled studio environment with 20 participants performing 160 common American Sign Language (ASL) signs, SIGMA-ASL provides 93,545 temporally synchronized word-level multimodal clips. A unified sensing framework achieves millisecond-level alignment across modalities, enabling reliable sensor fusion and cross-modal learning. We further design standardized preprocessing pipelines and benchmarking protocols under both user-dependent and user-independent settings, offering a comprehensive foundation for evaluating single-modal and multimodal SLR. Extensive experiments validate the dataset's quality and demonstrate its potential as a valuable resource for developing robust, privacy-preserving, and ubiquitous sign language recognition systems.
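
The user-dependent and user-independent settings named in the abstract differ in whether a signer may appear in both the training and the test set. The following is a minimal sketch of that distinction, not the paper's actual protocol: the clip fields `user`, `sign`, and `rep`, the 20% test fraction, and both function names are hypothetical, and the toy metadata stands in for the real clips.

```python
import random
from collections import defaultdict


def user_independent_split(clips, test_users):
    """Hold out whole participants: no test signer appears in training."""
    test_users = set(test_users)
    train = [c for c in clips if c["user"] not in test_users]
    test = [c for c in clips if c["user"] in test_users]
    return train, test


def user_dependent_split(clips, test_fraction=0.2, seed=0):
    """Split each participant's clips, so every signer is on both sides."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for clip in clips:
        by_user[clip["user"]].append(clip)

    train, test = [], []
    for user in sorted(by_user):
        items = list(by_user[user])
        rng.shuffle(items)
        # Keep at least one test clip per participant.
        n_test = max(1, round(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Toy metadata (hypothetical): 4 users x 3 signs x 5 repetitions = 60 clips.
clips = [
    {"user": u, "sign": s, "rep": r}
    for u in range(4)
    for s in range(3)
    for r in range(5)
]
```

Under the user-independent split, accuracy reflects generalization to unseen signers. Under the user-dependent split, it reflects performance on signers the model has already observed.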

BibTeX

@article{xiao2026sigma,
  title = {SIGMA-ASL: Sensor-Integrated Multimodal Dataset for Sign Language Recognition},
  author = {Xiaofang Xiao and Guangchao Li and Guangrong Zhao and Qi Lin and Wen Ma and Hongkai Wen and Yanxiang Wang and Yiran Shen},
  year = {2026},
  abstract = {Automatic sign language recognition (SLR) has become a key enabler of inclusive human-computer interaction, fostering seamless communication between deaf individuals and hearing communities. Despite significant advances in multimodal learning, existing SLR research remains dominated by vision-based datasets, which are limited by sensitivity to lighting and occlusion, privacy concerns, and a lack of cross-modal diversity. To address these challenges, we introduce SIGMA-ASL, a large-scale multimod},
  url = {https://arxiv.org/abs/2605.06351},
  keywords = {cs.HC},
  eprint = {2605.06351},
  archiveprefix = {arXiv},
}
