Selective Control under Noisy Perception: Governance Failures Hidden by Aggregate Metrics in Modular Networks

Igor Itkin

huggingface Score 5.5

Published 2026-06-12 · First seen 2026-06-16

General AI

Abstract

A content-moderation system can score well on every standard accuracy metric and still cause real harm, if its mistakes fall on the few users who connect otherwise separate communities. We show this in an agent-based model where N=240 learning agents on a community-structured network each post harmless, productive, or dangerous content, and a regulator removes or penalizes whatever a noisy classifier flags. Overall usefulness barely moves as the noise changes (one-way ANOVA, p=0.96): by aggregate measures, nothing looks wrong. The damage instead concentrates on these bridge users, whose useful posts are wrongly suppressed and whose dangerous posts are wrongly spared. A governance loss (L_gov) that prices these two mistakes separately from the cost of enforcement more than doubles under false-positive-heavy noise. Aggregate accuracy hides who is harmed, and the cheap quantity to audit is how many connections a user has (degree), a near-perfect proxy for the betweenness that defines a bridge (r=0.96).
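The audit suggested in the abstract, using degree as a cheap stand-in for betweenness, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's code: the toy graph (two cliques joined by one edge), the helper names `two_cliques_with_bridge`, `betweenness`, and `pearson`, and the use of Brandes' algorithm for unweighted graphs. The toy graph is far simpler than the N=240 modular network in the paper, so it is not expected to reproduce the reported r=0.96.

```python
from collections import deque
import math


def two_cliques_with_bridge(k=4):
    """Hypothetical toy network: two k-cliques joined by a single edge."""
    adj = {v: set() for v in range(2 * k)}

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for base in (0, k):
        for u in range(base, base + k):
            for v in range(u + 1, base + k):
                link(u, v)
    link(k - 1, k)  # the only inter-community edge; its endpoints are the bridge users
    return adj


def betweenness(adj):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency of s on each node
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair (s, t) was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    adj = two_cliques_with_bridge(4)
    nodes = sorted(adj)
    degree = [len(adj[v]) for v in nodes]
    bc = betweenness(adj)
    print("degree:     ", degree)
    print("betweenness:", [bc[v] for v in nodes])
    print("pearson r:  ", pearson(degree, [bc[v] for v in nodes]))
```

In this toy graph only the two endpoints of the inter-community edge carry any betweenness, and they are also the only nodes with an extra connection, so ranking users by degree picks out the same users as ranking by betweenness. On a real network the agreement would need to be measured rather than assumed.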

BibTeX

@misc{itkin2026selective,
  title = {Selective Control under Noisy Perception: Governance Failures Hidden by Aggregate Metrics in Modular Networks},
  author = {Igor Itkin},
  year = {2026},
  abstract = {A content-moderation system can score well on every standard accuracy metric and still cause real harm, if its mistakes fall on the few users who connect otherwise separate communities. We show this in an agent-based model where N=240 learning agents on a community-structured network each post harmless, productive, or dangerous content, and a regulator removes or penalizes whatever a noisy classifier flags. Overall usefulness barely moves as the noise changes (one-way ANOVA, p=0.96): by aggregat},
  url = {https://huggingface.co/papers/2606.14819},
  keywords = {agent-based model, community-structured network, noisy classifier, regulator, bridge users, governance loss, false-positive-heavy noise, betweenness, degree, code available, huggingface daily},
  eprint = {2606.14819},
  archiveprefix = {arXiv},
}
