Daily - 2026-06-17

arxiv Score 22.5

Beyond Domains: Reusing Web Skills via Transferable Interaction Patterns

2026-06-16 · Shiqi He, Yue Cui, Feijie Wu, Xinyu Ma, Jiaheng Lu, Yaliang Li, Bolin Ding, Mosharaf Chowdhury

Research Track B · General AI

Large language model (LLM) web agents are usually deployed as tool callers: each turn, the model reads a fresh page observation and emits one structured tool action. When every action is a low-level primitive, horizons grow quickly and so do policy-facing LLM completions, dominating latency and cost on benchmarks such …

Review: pending
Role: unreviewed
Read: now

huggingface Score 17.5

LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

2026-06-16 · Jian Yang, Shawn Guo, Wei Zhang, Tianyu Zheng, Yaxin Du, Haau-Sing Li, Jiajun Wu, Yue Song, Yan Xing, Qingsong Cai, Zelong Huang, Chuan Hao, Ran Tao, Xianglong Liu, Wayne Xin Zhao, Mingjie Tang, Weifeng Lv, Ming Zhou, Bryan Dai

General AI

Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) alleviate this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window attention, making loop co…

Review: pending
Role: unreviewed
Read: now

huggingface Score 15.5

OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation

2026-06-16 · Guibin Zhang, Xun Xu, Yanwei Yue, Zikun Su, Wangchunshu Zhou, Xiaobin Hu, Shuicheng Yan

Research Track A · General AI

Memory has become a standard substrate for self-evolving agents, yet retaining experience is not the same as learning how to evolve through it. Existing memory agents can store trajectories, retrieve reflections, or accumulate skills, but often lack the holistic competence to select useful experience, act on it, write …

Review: pending
Role: unreviewed
Read: now

arxiv Score 15.3

Learning from the Self-future: On-policy Self-distillation for dLLMs

2026-06-16 · Yifu Luo, Zeyu Chen, Haoyu Wang, Xinhao Hu, Yuxuan Zhang, Zhizhou Sha, Shiwei Liu

General AI

On-policy self-distillation (OPSD) has proven effective for post-training large language models (LLMs), yet its application to diffusion LLMs (dLLMs) remains unexplored. Existing OPSD methods are inherently autoregressive-centric. They inject privileged information via left-to-right prefix conditioning with token-level…

Review: pending
Role: unreviewed
Read: now

arxiv Score 15.3

The Measurement Gap in the Automation of EU Law: Benchmarking Doctrinal Legal Reasoning under the EU AI Act

2026-06-16 · Michèle Finck

General AI

Large language models now produce legal text of at least median quality, yet no existing benchmark can evaluate whether they perform doctrinal legal reasoning, which forms the interpretive core of legal work, rather than the ancillary, paralegal tasks that most current legal-AI evaluations measure. This measurement gap…

Review: pending
Role: unreviewed
Read: now

huggingface Score 14.5

ChLogic: Evaluating Robustness of Logical Reasoning in Chinese Expressions

2026-06-16 · Peixian Zhou, Yuxu Chen, Chaorui Zhang, Wei Han, Bo Bai, Xueyan Niu

General AI

Large language models perform increasingly well on standardized logical reasoning benchmarks, but whether this ability remains robust beyond English is unclear. We introduce ChLogic, an English–Chinese aligned benchmark that tests whether models preserve logical reasoning performance when the same latent logical struc…

Review: pending
Role: unreviewed
Read: now

huggingface Score 14.5

GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine?

2026-06-16 · Tongxu Luo, Rongsheng Wang, Jiaxi Bi, Chenming Xu, Zhengyang Tang, Jianlong Chen, Juhao Liang, Ke Ji, Shuqi Guo, Yuhao Du, Fan Bu, Wenyu Du, Xiaotong Zhang, Kyle Li, Shaobo Wang, Linfeng Zhang, Yuxuan Liu, Xin Lai, Chenxin Li, Yiduo Guo, Zhexin Zhang, Xinyuan Wang, Tianyi Bai, Ziniu Li, Benyou Wang

General AI

Game generation is an emerging application of coding agents, requiring models to transform natural-language specifications into playable interactive systems. Unlike traditional coding tasks, game generation takes place within a game engine, where scripts, scenes, assets, rendering, and runtime interactions must jointly…

Review: pending
Role: unreviewed
Read: now

arxiv Score 14.3

Evaluating Open-Source LLMs for Multi-Label ATT&CK Technique Classification on CTI Reports

2026-06-16 · Ahmed Ryan, Saad Sakib Noor, Md Erfan, Shaswata Mitra, Sudip Mittal, Md Rayhanur Rahman

General AI

Classifying Cyber Threat Intelligence (CTI) using MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) is essential for proactive defense, but historically required extensive human effort. Pre-Large Language Model (LLM) automation sped up this process, but could not resolve the complex language and mult…

Review: pending
Role: unreviewed
Read: now

arxiv Score 14.3

The Stanford EDGAR Filings Dataset: Reconstructing U.S. Corporate and Financial Disclosures into Layout-Faithful and Token-Efficient Pretraining Data

2026-06-16 · Nick Bettencourt, Xiaowei Ding, Kay Giesecke

General AI

As high-quality public web corpora become increasingly exhausted, clean long-context documents have become a scarce and expensive source of training data for large language models (LLMs). Existing long-context corpora are often proprietary and costly to acquire, synthetically generated, or concentrated in narrow domain…

Review: pending
Role: unreviewed
Read: now

arxiv Score 14.3

Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification

2026-06-16 · Wujian Peng, Lingchen Meng, Yuxuan Cai, Xianwei Zhuang, Yuhuan Yang, Rongyao Fang, Chenfei Wu, Junyang Lin, Zuxuan Wu, Shuai Bai

General AI

Unified Multimodal Modeling aims to integrate visual understanding and generation within a single system. However, existing approaches typically rely on two disparate visual tokenizers, which splits the representation space and hinders truly unified modeling. We propose UniAR, a unified autoregressive framework where a…

Review: pending
Role: unreviewed
Read: now

huggingface Score 13.5

Beyond Monolingual Deep Research: Evaluating Agents and Retrievers with Cross-Lingual BrowseComp-Plus

2026-06-13 · Yuheng Lu, Qingcheng Zeng, Heli Qi, Puxuan Yu, Fuheng Zhao, Rui Yang, Hitomi Yanaka, Naoto Yokoya, Weihao Xuan

General AI

Deep research agents are increasingly evaluated on their ability to search for evidence, reason over retrieved sources, and produce grounded answers. Existing browsing benchmarks, however, largely assume that the user's query and the supporting evidence are written in the same language, leaving open whether agentic sea…

Review: pending
Role: unreviewed
Read: now

arxiv Score 13.3

EventDrive: Event Cameras for Vision-Language Driving Intelligence

2026-06-16 · Dongyue Lu, Rong Li, Ao Liang, Lingdong Kong, Wei Yin, Lai Xing Ng, Benoit R. Cottereau, Camille Simon Chane, Wei Tsang Ooi

General AI

Event cameras sense the world through asynchronous brightness changes with microsecond latency and high dynamic range, offering motion fidelity far beyond frame-based sensors and capturing temporal structure that conventional exposures often miss. These properties make events a powerful complement to RGB in autonomous …

Review: pending
Role: unreviewed
Read: now

arxiv Score 13.3

EvolveNav: Proactive Preflection and Self-Evolving Memory for Zero-Shot Object Goal Navigation

2026-06-16 · Qi Chai, Wenhao Shen, Nanjie Yao, Yue Xia, Kaiyong Zhao, Jie Ma, Guosheng Lin, Hao Wang

General AI

Zero-Shot Object-Goal Navigation (ZS-OGN) requires embodied agents to explore and locate target objects without any prior training. To this end, recent methods leverage foundation models, but they typically rely on static priors and lack adaptation, which leads to repeated errors and costly trial and error. In this pap…

Review: pending
Role: unreviewed
Read: now

arxiv Score 13.3

Learning Cardiac Electrophysiology Digital Twins Through Agentic Discovery of Hybrid Structure

2026-06-16 · Ziqi Zhou, Yubo Ye, Sumeet Atul Vadhavka, Linwei Wang, Zhiqiang Tao

General AI

Building personalized cardiac electrophysiology (EP) digital twins requires identifying the appropriate model structure for each patient, not merely fitting parameters. Traditional methods rely on experts to manually prescribe hybrid physics-neural architectures, which requires deep domain expertise and does not transf…

Review: pending
Role: unreviewed
Read: now

arxiv Score 12.3

A Red-Team Study of Anthropic Fable 5 & Opus 4.8 Models

2026-06-16 · Nicola Franco

General AI

We evaluate the adversarial robustness of two frontier large language models (LLMs) developed by Anthropic, Fable 5 and Opus 4.8, against four families of automated jailbreak attacks across 7,826 harmful intents spanning a ten-category harm taxonomy. Using the HackAgent red-teaming framework, hundreds of thousands of ad…

Review: pending
Role: unreviewed
Read: now

arxiv Score 12.3

MLLMs Get It Right, Then Get It Wrong: Tracing and Correcting Late-Layer Textual Bias

2026-06-16 · Xingming Li, Ao Cheng, Qiyao Sun, Xixiang He, Xuanyu Ji, Runke Huang, Qingyong Hu

General AI

When vision contradicts text, multimodal large language models (MLLMs) consistently favor text, even when images provide clear evidence otherwise. This bias poses risks for applications requiring visual grounding, yet its cause remains unclear. In this paper, we uncover a surprising finding: models often get it right i…

Review: pending
Role: unreviewed
Read: now

arxiv Score 12.3

ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues

2026-06-16 · Shanda Li, Qiuhong Anna Wei, Jingwu Tang, Valerie Chen, Nihar B Shah, Tim Dettmers, Yiming Yang, Ameet Talwalkar

General AI

Reproducing research results from papers and released code is central to scientific progress. Existing works have introduced benchmarks to evaluate whether LLM agents can assist with reproducibility, but they are difficult to scale due to their reliance on substantial manual effort for data curation and evaluation. We …

Review: pending
Role: unreviewed
Read: now

huggingface Score 11.5

A Gradient Perspective on RLVR Stability and Winner Advantage Policy Optimization

2026-06-15 · Prasanth YSS, Zhichen Ren, Rasa Hosseinzadeh, Ilan Gofman, Yuqi Chen, Zhaoyan Liu, Guangwei Yu, Jesse C. Cresswell, Satya Krishna Gorti

General AI

Reinforcement learning with verifiable rewards (RLVR) improves language-model reasoning, but GRPO-style optimization remains prone to collapse. We analyse this instability through token-level gradient dynamics, deriving a taxonomy that predicts how updates affect next-token probabilities and entropy. The taxonomy shows…

Review: pending
Role: unreviewed
Read: now

arxiv Score 11.5

Dimensionality Controls When Modularity Helps in Continual Learning

2026-06-16 · Kathrin Korte, Christian Medeiros Adriano, Joachim Winther Pedersen, Eleni Nisioti, Sebastian Risi

Research Track A

Compositional learning systems must balance plasticity, the ability to acquire new knowledge, with stability, the preservation of previously learned components, especially when tasks share structure and risk interference. We study how modular architecture, task similarity, and representational dimensionality jointly sh…

Review: pending
Role: unreviewed
Read: now

arxiv Score 11.3

Learning Red Agent Policy from Observations for Neurosymbolic Autonomous Cyber Agents

2026-06-16 · Ankita Samaddar, Sandeep Neema, Daniel Balasubramanian, Xenofon Koutsoukos

General AI

With sophisticated cyber-attacks becoming increasingly prevalent, modern networks require intelligent autonomous cyber-defense agents trained via Reinforcement Learning (RL). These agents employ neurosymbolic approaches such as behavior trees with learning-enabled components (LECs) to learn, reason, adapt, and implemen…

Review: pending
Role: unreviewed
Read: now

arxiv Score 11.3

Plug-and-Adapt: Multimodal Coreference Resolution at First Sight with a Pretrained Alignment Model

2026-06-16 · Jinghan Wu, Jing Li, Ivor W. Tsang, Xuetao Zhang

General AI

Visual information helps resolve ambiguity in coreference resolution, leading to notable performance gains. However, existing Multi-modal Coreference Resolution (MCR) methods require training with (partially) annotated data from the target dataset before they can be applied, preventing their direct usability and raisin…

Review: pending
Role: unreviewed
Read: now

arxiv Score 11.3

RubricsTree: Scalable and Evolving Open-Ended Evaluation of Personal Health Agents across Health Memory and Medical Skills

2026-06-16 · Weizhi Zhang, Zechen Li, Hamid Palangi, Ben Graef, A. Ali Heydari, Simon A. Lee, Salman Rahman, Ray Luo, Zeinab Esmaeilpour, Erik Schenck, Chloe Zhang, Yamin Li, Menglian Zhou, Philip S. Yu, Daniel McDuff, Lindsey Sunden, Mark Malhotra, Shwetak Patel, Ahmed A. Metwally

General AI

LLM-empowered personal health agents that draw on user health (sensor) metrics offer a promising pathway to alleviate global disparities in healthcare access. However, large-scale clinical deployment remains constrained by an open-ended evaluation bottleneck: physician annotation is reliable but costly and unscalabl…

Review: pending
Role: unreviewed
Read: now

arxiv Score 10.5

When Robots Sleep: Offline Skill Consolidation for Shared-Policy Robot Learning

2026-06-16 · Nethmi Jayasinghe, Diana Gontero, Amit Ranjan Trivedi

Research Track A · General AI

Robots that learn over long deployments must add new skills without losing the shared policy structure that makes earlier skills reusable. We study sequential robot skill learning, where previous trajectories and task losses may be unavailable, and the deployed policy must remain a single shared controller without task…

Review: pending
Role: unreviewed
Read: now

arxiv Score 10.3

DRFLOW: A Deep Research Benchmark for Personalized Workflow Prediction

2026-06-16 · Md Tawkat Islam Khondaker, Raymond Li, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Issam H. Laradji

General AI

Deep research (DR) systems are increasingly used for complex information-seeking tasks, but existing works mainly focus on generating reports and summaries. In contrast, many enterprise tasks require an agent to identify concrete workflows, that is, sequences of action steps. For example, rather than summarizin…

Review: pending
Role: unreviewed
Read: now

huggingface Score 9.5

ActWorld: From Explorable to Interactive World Model via Action-Aware Memory

2026-06-16 · Zhexiao Xiong, Yizhi Song, Hao Kang, Qing Yan, Liming Jiang, Jenson Yang, Zhoujie Fu, Stathi Fotiadis, Angtian Wang, Zichuan Liu, Bo Liu, Yiding Yang, Xin Lu, Nathan Jacobs

Research Track A · General AI

Interactive world models aim to simulate environment dynamics under real-time user actions. However, their action vocabulary is largely confined to navigation: most actions correspond to motion (e.g., walk, turn, look around), while interaction with objects in the scene (e.g., pick up plates, open doors, or trigger phy…

Review: pending
Role: unreviewed
Read: now

huggingface Score 9.5

Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients

2026-06-16 · Byung-Kwan Lee, Ximing Lu, Shizhe Diao, Minki Kang, Saurav Muralidharan, Karan Sapra, Andrew Tao, Pavlo Molchanov, Yejin Choi, Yu-Chiang Frank Wang, Ryo Hachiuma

General AI

Knowledge distillation transfers a teacher's competence to a small student but is brittle in the small-student regime: forcing the student to imitate logits from a much larger teacher concentrates it on the teacher's sharpest modes, hurting generalization on benchmark families beyond the training corpus. Reinforcement …

Review: pending
Role: unreviewed
Read: now

arxiv Score 9.3

Seeing Is Not Screening: Multimodal Hidden Instruction Attacks on Agent Skill Scanners

2026-06-16 · Xiaojun Jia, Jie Liao, Simeng Qin, Ke Ma, Wenbo Guo, Yebo Feng, Aishan Liu, Yang Liu

General AI

Agent skills are emerging as an important attack surface in LLM-based systems. Through an empirical study of existing skill scanners, we find that current defenses primarily rely on textual descriptions, manifests, and source code as the main signals for security analysis, which can leave visually conveyed malicious in…

Review: pending
Role: unreviewed
Read: now

arxiv Score 8.3

Fixed-Point Reasoners: Stable and Adaptive Deep Looped Transformers

2026-06-16 · Sajad Movahedi, Vera Milovanović, Shlomo Libo Feigin, Alexander Theus, Thomas Hofmann, Valentina Boeva, T. Konstantin Rusch, Antonio Orvieto

General AI

Looped architectures provide an inductive bias toward learning step-by-step procedures for tasks that require compositional reasoning. The number of effective layers reached by looping determines the quality of the solution these models find. Like deep architectures, looped architectures are prone to a signal propagati…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 8.3

Predicting Immune Biomarkers with MultiModal Mixture-of-Expert Pathology Foundation Models Empowers Precision Oncology

2026-06-16 · Tianyu Liu, Ziqing Wang, Zhaokang Liang, Tong Ding, Peter Humphrey, Lorraine Colón-Cartagena, Emily Ling-Lin Pai, Kenneth Tou En Chang, Mohamed Kahila, Jonathan Chong Kai Liew, Tinglin Huang, Rex Ying, Kaize Ding, Faisal Mahmood, Wengong Jin

General AI

Predicting immune biomarkers associated with the tumor immune microenvironment (TIME) is critical for advancing precision oncology, yet existing approaches are largely limited to single image modalities and suffer from insufficient resolution and incomplete utilization of complementary clinical and biological informati…

Review: pending
Role: unreviewed
Read: soon

huggingface Score 7.5

RepSelect: Robust LLM Unlearning via Representation Selectivity

2026-06-15 · Filip Sondej, Yushi Yang, Adam Mahdi

General AI

Making large language models (LLMs) deeply forget specific knowledge and values without sacrificing general capabilities remains a central challenge in unlearning. However, current methods are easily reversed by fine-tuning or few-shot prompting, suggesting their forgetting is only shallow. We identify the root cause. …

Review: pending
Role: unreviewed
Read: soon

arxiv Score 7.5

UoU: A Universal Fingerprint Foundation Model Based on Large-Scale Unsupervised Learning

2026-06-16 · Xiongjun Guan, Jianjiang Feng, Jie Zhou

Research Track A

Fingerprint recognition is still dominated by task-specific pipelines, where enhancement, structural parsing, alignment, and matching are optimized in isolation. Although effective in narrow settings, this design limits representation reuse across sensors, qualities, and downstream applications. We therefore present Uo…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 7.3

Multi-Source Cybersecurity Logs: An ATT&CK-Labeled Dataset and SLM Evaluation

2026-06-16 · Abir Ashab Niloy, Ahmed Ryan, Imamul Hossain Rafi, Md Erfan, Md Rayhanur Rahman

General AI

Multi-stage cyberattacks span system, network, and browser logs. Detecting them requires correlating events across all three sources. Machine learning methods can learn these cross-source patterns, but they need labeled multi-source data. Existing public datasets fall short. Network-only datasets such as CICIDS and UNS…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 7.3

One-Step Token-to-Waveform Generation with MeanFlow in Latent Space

2026-06-16 · Zheqi Dai, Guangyan Zhang, Zhen Ye, Jingyu Li, Haolin He, Chunyat Wu, Yiwen Guo, Qiuqiang Kong

General AI

Neural audio codecs are central to modern LLM-based Text-to-Speech (TTS) and multimodal systems. As low-bitrate semantic codecs gain prominence, the Token-to-Waveform (Token2Wav) decoder becomes a bottleneck determining both perceptual quality and system efficiency. Conventional multi-step flow-matching decoders offer …

Review: pending
Role: unreviewed
Read: soon

huggingface Score 7.2

ProCUA-SFT Technical Report

2026-06-15 · Jaehun Jung, Ximing Lu, Brandon Cui, Muhammad Khalifa, Shaokun Zhang, Hao Zhang, Jin Xu, Amala Sanjay Deshmukh, Karan Sapra, Andrew Tao, Yejin Choi, Jan Kautz, Mingjie Liu, Yi Dong

Research Track B · General AI

Training computer-use agents (CUAs), models that interact with graphical desktops through screenshots and keyboard/mouse actions, requires large-scale, diverse trajectory data collected in full desktop environments. The largest public resource, AgentNet (22.5K human trajectories), leads to negative transfer when us…

Review: pending
Role: unreviewed
Read: soon

huggingface Score 6.5

Rethinking the Role of Efficient Attention in Hybrid Architectures

2026-06-13 · Ziqing Qiao, Yinuo Xu, Chaojun Xiao, Zhou Su, Zihan Zhou, Yingfa Chen, Xiaoyue Xu, Xu Han, Zhiyuan Liu

General AI

Modern language models increasingly adopt hybrid architectures that combine full attention with efficient attention modules, such as sliding-window attention (SWA) and recurrent sequence mixers. However, how these efficient modules shape model capabilities remains poorly understood. To address this gap, we conduct a sy…

Review: pending
Role: unreviewed
Read: soon

huggingface Score 6.5

Text-Vision Co-Instructed Image Editing

2026-06-15 · Chenxi Xie, Yuhui Wu, Qiaosi Yi, Lei Zhang

General AI

Existing image editing methods can be generally categorized into textual instruction-based and visual prompt-based ones. Textual instructions are semantically expressive, but are limited by the coarse granularity of spatial control of the editing results. In contrast, visual prompts such as drag and point can provide p…

Review: pending
Role: unreviewed
Read: soon

huggingface Score 6.5

Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion

2026-06-16 · Weichen Fan, Haiwen Diao, Penghao Wu, Ziwei Liu

General AI

Pixel-space diffusion models are trained on full-bandwidth noisy images, yet the useful signal available to the denoiser is strongly frequency dependent. Under rectified-flow diffusion and natural-image power-law spectra, the per-band data-to-noise contour k^{*}(t) = (1-t)^{-2/α} separates a signal-bearing low-frequenc…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 6.3

Adaptive Volumetric Mechanical Property Fields Invariant to Resolution

2026-06-16 · Rishit Dagli, Donglai Xiang, Vismay Modi, Xuning Yang, Gavriel State, David I. W. Levin, Maria Shugrina

General AI

Accurate mechanical properties (or materials), namely Young's modulus ($E$), Poisson's ratio ($ν$), and density ($ρ$), are essential for reliable physics simulation of digital worlds, but most 3D assets lack this information. We propose AdaVoMP, a method for predicting accurate dense spatially-varying ($E$, $ν$, $ρ$) for input 3…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 6.3

EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies

2026-06-16 · Ning Gao, Jinliang Zheng, Xing Gao, Haoxiang Ma, Hanqing Wang, Yukai Wang, Jiantong Chen, Zanxin Chen, Shujie Zhang, Mingda Jia, Xuekun Jiang, Zihou Zhu, Xinyu Li, Shuai Wang, Hao Li, Wenzhe Cai, Yuqiang Yang, Xudong Xu, Zhaoyang Lyu, Yao Mu, Tai Wang, Jiangmiao Pang, Jia Zeng, Weinan Zhang, Chunhua Shen

General AI

We present EBench, a simulation benchmark that diagnoses generalist mobile manipulation policies beyond a single success-rate scalar. EBench comprises 26 diverse and challenging manipulation tasks annotated along 5 capability dimensions and 4 generalization dimensions. We evaluate state-of-the-art generalist manipulati…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 6.3

MAJIC: Leveraging Articulatory Motion for Speech-based Emotion Recognition

2026-06-16 · Tanmay Srivastava, Paras Bhavnani, Benjir Alvee Islam, Shubham Jain

General AI

We introduce MAJIC, a multimodal emotion recognition system that leverages articulatory motion of the jaw and facial muscles for speech-based emotion recognition (SER). While most SER systems perform well on datasets with strongly expressed emotional speech of trained actors, their performance often degrades when emoti…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 5.3

Darshana Graph: A Parallel Commentary Corpus for Comparative Indian Philosophy, with Stylometric and Exploratory Graph Analyses

2026-06-16 · Joy Bose

General AI

We introduce Darshana Graph, a corpus of over 125,000 text records spanning classical Hindu, Buddhist, and Jain philosophical traditions, drawn from public-domain and openly licensed translations of sources including the Bhagavad Gita, Brahma Sutras, principal Upanishads, the Pali Canon, and core Jain texts. Its distin…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 5.3

Future Dynamic 3D Reconstruction: A 3D World Model with Disentangled Ego-Motion

2026-06-16 · Nils Morbitzer, Jonathan Evers, Artem Savkin, Thomas Stauner, Nassir Navab, Federico Tombari, Stefano Gasperini

General AI

Forecasting the evolution of dynamic environments is crucial for autonomous agents. While generative world models have recently achieved high photorealism in 2D video synthesis by mixing ego-motion and environmental dynamics within the image plane, they exhibit physical inconsistencies, such as morphing or vanishing ob…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 5.3

IUU+DB: Tracking Illegal, Unreported, and Unregulated Fishing, Seafood Fraud, and Labor Abuse through LLM-driven Information Extraction

2026-06-16 · Henry Bodwell, Hong Yang, John C. Simeone, Kelvin Gorospe, Bella Sullivan, Lana Huang, Jessica Gephart, Sandy Aylesworth, Molly Masterton, Naren Ramakrishnan

General AI

Illegal, unreported, and unregulated fishing (IUU) traditionally refers to fishing activities that violate applicable laws or occur in areas that lack applicable laws. We propose the term IUU+ to capture a broader suite of fisheries sector environmental and associated supply chain trade-related crimes and behaviors. Al…

Review: pending
Role: unreviewed
Read: soon

arxiv Score 5.3

MOCHI: Motion Enhancement of Collaborative Human-object Interactions

2026-06-16 · Jiye Lee, Yonghun Choi, Jungdam Won

General AI

Collaborative human-object interaction shows dynamic and complex movements that require mutual anticipation and continuous adjustment between participants and the shared object. Modeling such collaborative multi-human object interaction (MHOI) scenarios requires high-quality data acquisition as a foundational step; how…

Review: pending
Role: unreviewed
Read: soon

huggingface Score 4.5

RefGC-SR^2: Reference-guided Generated Content Super-Resolution and Refinement

2026-06-13 · Jeahun Sung, Dahyeon Kye, Soo Ye Kim, Jihyong Oh

General AI

Reference-guided generation (e.g., object compositing, customization) has progressed rapidly, yet current pipelines share a fundamental limitation: the object-centric high-resolution reference image (HRRI) provided by users is downsampled to a fixed low-resolution (LR) before being fed into the model, so the fine-grain…

Review: pending
Role: unreviewed
Read: later

arxiv Score 4.3

Ergodic Deviation-Robust Equilibrium under Mirror Descent Learning in Finite Games

2026-06-16 · Joshua Steier

General AI

We introduce Ergodic Deviation-Robust Equilibrium (EDRE), a dynamics-relative equilibrium concept for repeated finite games in which agents learn via entropic mirror descent (EMD). EDRE requires three properties to hold simultaneously for the same profile and learning run: (E1) the limit profile is an $\varepsilon$-Nas…

Review: pending
Role: unreviewed
Read: later

arxiv Score 4.3

Thermodynamic description of wealth inequality in the world

2026-06-16 · Klaus M. Frahm, Leonardo Ermann, Dima L. Shepelyansky

General AI

According to the recent Wealth Thermalization Hypothesis (WTH), wealth inequality in the world is described by the Rayleigh-Jeans (RJ) thermal distribution of interacting agents in a society with social stratification. In this concept, the wealth layers of society are associated with energy levels from a nonlinear d…

Review: pending
Role: unreviewed
Read: later

arxiv Score 4.3

Treatment Response Optimized Clinical Decision Support AI System via Digital Twin Simulation

2026-06-16 · Xinyu Qin, Anil K. Sood, Ruiheng Yu, Sara Corvigno, Elaine Stur, Lu Wang

General AI

Clinical decision support AI systems (CDSASs) must adapt to evolving patient conditions in real-time while adhering to strict safety constraints. We present an online adaptive framework that integrates Treatment Effect (TE) estimation to quantify clinical benefits, a patient Digital Twin (DT) to simulate treatment traj…

Review: pending
Role: unreviewed
Read: later

arxiv Score 4.3

Variable-Width Transformers

2026-06-16 · Zhaofeng Wu, Oliver Sieberling, Shawn Tan, Rameswar Panda, Yury Polyanskiy, Yoon Kim

General AI

Scaling model size, specifically depth and width, has driven significant progress in transformer-based language models. However, most architectures maintain a constant width across all layers, allocating a fixed parameter and computation budget evenly despite different layers potentially playing distinct computational …

Review: pending
Role: unreviewed
Read: later

arxiv Score 4.3

Visual Verification Enables Inference-time Steering and Autonomous Policy Improvement

2026-06-16 · Mingtong Zhang, Dhruv Shah

General AI

Robots deployed in the real world should learn from their experience and improve over time. This requires a mechanism of practicing and learning from feedback. In this paper, we propose VERITAS, a generator-verifier framework for generalist robot policies for inference-time policy steering and self-improvement. We use …

Review: pending
Role: unreviewed
Read: later
