Paper Detail

Implicit Preference Alignment for Human Image Animation

Yuanzhi Wang, Xuhua Ren, Jiaxiang Cheng, Bing Ma, Kai Yu, Tianxiang Zheng, Qinglin Lu, Zhen Cui

huggingface Score 6.0

Published 2026-05-08 · First seen 2026-05-13

General AI

Abstract

Human image animation has witnessed significant advancements, yet generating high-fidelity hand motions remains a persistent challenge due to their high degrees of freedom and motion complexity. While reinforcement learning from human feedback, particularly direct preference optimization, offers a potential solution, it necessitates the construction of strict preference pairs. However, curating such pairs for dynamic hand regions is prohibitively expensive and often impractical due to frame-wise inconsistencies. In this paper, we propose Implicit Preference Alignment (IPA), a data-efficient post-training framework that eliminates the need for paired preference data. Theoretically grounded in implicit reward maximization, IPA aligns the model by maximizing the likelihood of self-generated high-quality samples while penalizing deviations from the pretrained prior. Furthermore, we introduce a Hand-Aware Local Optimization mechanism to explicitly steer the alignment process toward hand regions. Experiments demonstrate that our method achieves effective preference optimization to enhance hand generation quality, while significantly lowering the barrier for constructing preference data. Codes are released at https://github.com/mdswyz/IPA

Workflow Status

Review status
pending
Role
unreviewed
Read priority
later
Vote
Not set.
Saved
no
Collections
Not filed yet.
Next action
Not filled yet.

Reading Brief

No structured notes yet. Add `summary_sections`, `why_relevant`, `claim_impact`, or `next_action` in `papers.jsonl` to enrich this view.

Why It Surfaced

No ranking explanation is available yet.

Tags

No tags.

BibTeX

@misc{wang2026implicit,
  title = {Implicit Preference Alignment for Human Image Animation},
  author = {Yuanzhi Wang and Xuhua Ren and Jiaxiang Cheng and Bing Ma and Kai Yu and Tianxiang Zheng and Qinglin Lu and Zhen Cui},
  year = {2026},
  abstract = {Human image animation has witnessed significant advancements, yet generating high-fidelity hand motions remains a persistent challenge due to their high degrees of freedom and motion complexity. While reinforcement learning from human feedback, particularly direct preference optimization, offers a potential solution, it necessitates the construction of strict preference pairs. However, curating such pairs for dynamic hand regions is prohibitively expensive and often impractical due to frame-wise},
  url = {https://huggingface.co/papers/2605.07545},
  keywords = {reinforcement learning from human feedback, direct preference optimization, implicit reward maximization, preference pairs, hand motion generation, post-training framework, hand-aware local optimization, implicit preference alignment, code available, huggingface daily},
  eprint = {2605.07545},
  archiveprefix = {arXiv},
}

Metadata

{}