arxiv
Score 26.0
2026-06-12 · Sina Hajimiri, Masih Aminbeidokhti, Jose Dolz, Ismail Ben Ayed, Issam H. Laradji, Spandana Gella, Nicolas Gontier
Research Track B · General AI
Online web agents often augment a base actor with memory, workflow, or skill modules. These modules can improve performance, but they also consume test-time tokens, a cost rarely reported alongside the actor's inference cost. We study online augmentation, where this overhead is paid on every task, and re-evaluate its b…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 24.3
2026-06-15 · Peiyang Xu, Bangzheng Li, Sijia Liu, Karthik R. Narasimhan, Pramod Viswanath, Prateek Mittal, Xingyu Fu
General AI
Large language models (LLMs) often fail when answering requires identifying a small but decisive piece of evidence within a long or complex context, such as a single line in a tool trace or a subtle detail in an image. We propose ContextRL, a context-aware reinforcement learning (RL) method that improves long-horizon r…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 20.3
2026-06-15 · Zhiqiang Zhou, Junliang Dai, Xu ling
General AI
Multimodal large language models (MLLMs) excel at visual reasoning but rely on text-based chain-of-thought (CoT), lacking interpretable visual intermediates. Existing methods use opaque tokens or external tools, missing key properties. We propose Gen-VCoT, a framework using expert vision models to generate RGB images a…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 19.0
2026-06-13 · Xinze Zhang
Research Track A · General AI
Visual perception of urban streetscapes underpins evidence-based decisions in landscape planning, public health, and place-making. Yet models trained on a few well-photographed metropolises systematically misjudge underrepresented districts, propagating geographic bias into downstream policy. We address this gap with H…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 19.0
2026-06-15 · Mao-Lin Luo, Yi-Lin Zhang, Zi-Hao Zhou, Yankun Hong, Xialiang Tong, Mingxuan Yuan, Tong Wei, Min-Ling Zhang
Research Track A · General AI
Continual learning for pre-trained vision-language models requires balancing three competing objectives: retaining pre-trained knowledge, preserving knowledge from a sequence of learned tasks, and maintaining the plasticity to acquire new knowledge. This paper presents KeepLoRA++, balancing these objectives through a u…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 18.5
2026-06-14 · Shuaike Zhang, Shaokun Wang, Haoyu Tang, Jianlong Wu, Liqiang Nie
Research Track A · General AI
Embodied Continual Learning (ECL) aims to enable robots to continually acquire new manipulation tasks while retaining previously learned behaviors under closed-loop control. Compared with conventional continual learning, ECL suffers from more severe catastrophic forgetting. Feature drift accumulated under closed-loop c…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 18.3
2026-06-15 · Truong Thanh Hung Nguyen, Khanh Van Quynh Nguyen, Hoang-Loc Cao, Tri Duong, Phuc Ho, Van Pham, Loc Nguyen, Hung Cao
General AI
Accurate Harmonized Tariff Schedule (HTS) code classification is essential for customs clearance, duty assessment, trade statistics, and regulatory compliance in maritime logistics. However, exact HTS classification remains challenging because product descriptions are often short, incomplete, or ambiguous, while correc…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 18.0
2026-06-15 · Anqi Zou, Han Deng, Chengyu Zhang, Junquan Hu, Yu Wang, Yuxiang Xing, Aokai Zhang, Hanling Zhang, Zhaoyang Liu, Ben Fei, Zhihui Wang, Wanli Ouyang
Research Track B · General AI
Current computer-use benchmarks primarily focus on software operation tasks in virtualized systems, whereas scientific instrumentation scenarios require coordinated control over complex interfaces and feedback-driven parameter adjustment. However, directly evaluating agents on physical high-precision instruments is im…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 17.3
2026-06-15 · Anzhe Xie, Weihang Su, Yujia Zhou, Yiqun Liu, Qingyao Ai
General AI
Meta-analysis is a demanding form of evidence synthesis that combines literature retrieval, PI/ECO-guided study selection, and statistical aggregation. Its structured, verifiable workflow makes it an ideal substrate for evaluating systematic scientific reasoning, yet existing benchmarks lack ground truth across the ful…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 17.3
2026-06-15 · Minghang Zhu, Chuyang Wei, Junhao Xu, Yilin Cheng, Zhumin Chen, Jiyan He
General AI
Deep research agents synthesize long-form reports by searching and reasoning over retrieved evidence. Reinforcement learning with rubric-based rewards improves these agents by optimizing them against checkable criteria that translate report quality into reward signals, but its efficiency depends on whether those criter…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 17.3
2026-06-15 · Haonan Ge, Yiwei Wang, Hang Wu, Yujun Cai
Research Track A · General AI
Streaming video understanding models must answer queries at any moment during an ongoing stream, using only what they have observed so far and under fixed memory and computation budgets. Existing methods address this by adding memory banks, retrieval modules, or visual token compression to preserve long-range history. …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 17.0
2026-06-12 · Oxana Salish, Kuniyilh S
Research Track A
Internet of Things (IoT) and Cyber-physical systems (CPS) increasingly rely on continual learning (CL) to adapt to evolving environments, device heterogeneity, and concept drift, thereby improving overall utility. While continual adaptation is essential for long-lived IoT deployments where data patterns evolve, it also…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 16.5
2026-06-15 · Wei Xu, Ke Yang, Gang Luo, Keli Zheng, Lingyan Hu, Jing Wang, Kefeng Li
Research Track A · General AI
Predictive modeling for clinical tabular data is central to clinical decision support and therefore requires not only strong predictive performance but also transparent decision logic. Although deep learning and tree-based ensemble methods can achieve high accuracy, their black-box nature remains a major obstacle to cl…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.3
2026-06-14 · Ali Sarabadani, Mahtab Tajvidiyan
Research Track A · General AI
Large Language Models (LLMs) struggle to incorporate new knowledge without forgetting or costly retraining. We propose DYNA, a lightweight framework that augments a frozen LLM with a temporal knowledge graph where events are nodes and temporal relations are directed, timestamped edges. The graph serves as an external, …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.3
2026-06-15 · Sanjay Basu
General AI
Aggregate accuracy benchmarks conceal a systematic structure in how large language models fail at electronic health record (EHR) question answering: questions requiring more inferential steps produce disproportionately more errors. Motivated by theoretical results on transformer compositionality limits, we introduce a …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.3
2026-06-15 · Patomporn Payoungkhamdee, Napat Laosaengpha, Jenta Wonglertsakul, Pittawat Taveekitworachai, Pume Tuchinda, Panjapong Poobanchuen, Ekapol Chuangsuwanich, Can Udomcharoenchaikit, Samuel Cahyawijaya, Peerat Limkonchotiwat, Sarana Nutanong
General AI
Reasoning with a Code Interpreter (CI) has emerged as an effective paradigm for enhancing the reasoning capabilities of large language models (LLMs) through executable computation and iterative verification. Despite its growing adoption, the behavioral properties underlying effective code reasoning remain largely under…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.0
2026-06-12 · Salimeh Sekeh, Mary Wisell
Research Track A · General AI
Continual vision-language models are commonly addressed through sequential fine-tuning; however, although this paradigm enables adaptation to new environments (tasks), it inherently emphasizes the contribution of previously learned environments (tasks) at the expense of the stability required to preserve previously acq…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 14.3
2026-06-13 · Jing Jin, Robert Chu, Ning Yan, Masood S. Mortazavi
General AI
Large language models (LLMs) have facilitated impressive progress in software engineering, code generation, tooling, and systems. Concurrently, a significant body of research has developed which explores a growing variety of methods and systems for applying LLMs to hardware and chip design (e.g., systems for RTL code g…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 14.3
2026-06-15 · Dylan Banarse, Stephen Todd, William Latham, Frederic Fol Leymarie
General AI
This paper investigates the creative process of automated design and artistic evaluation using an evolutionary system. We consider how a multimodal artificial intelligence (AI) model can communicate and guide a combined generative and evolutionary computational system. This creates a framework for the evolution of aest…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 14.3
2026-06-15 · Y. H. Zhou, Z. M. Ma, Y. J. Zhou, Y. T. Li, H. X. Xiang, Y. M. Cheng, T. L. Chen, K. J. Zhang, Z. H. Nan, J. H. Ni, Z. Wu, Q. Y. Pan, S. Zhang, S. Cheng, M. Y. Luo
Research Track B · General AI
SMS fraud is increasingly cross-channel: a message directs the user to a webpage, and the final risk depends on how the SMS claim aligns with the page content and requested user action. However, existing evaluations either focus on message-only smishing classification or expose URL and domain cues that allow models to …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 13.3
2026-06-15 · Amr Mohamed, Guokan Shang, Michalis Vazirgiannis
General AI
Diffusion large language models (dLLMs) offer a promising alternative to autoregressive decoding by iteratively refining masked sequences, enabling parallel token updates and bidirectional conditioning. Their practical efficiency, however, is limited by sampling procedures that execute a fixed number of reverse denoisi…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 12.5
2026-06-15 · DreamX Team, Yancheng Bai, Rui Chen, Xiangxiang Chu, Rujing Dang, Hao Dou, Bingjie Gao, Qiwen Gu, Siyu Hong, Jiachen Lei, Geng Li, Jifan Li, Ruimin Lin, Qingfeng Shi, Bingze Song, Lei Sun, Jing Tang, Ruitian Tian, Jun Wang, Jiahong Wu, Pengfei Zhang, Shen Zhang, Jiashu Zhu
General AI
DreamX-World 1.0 is a general-purpose interactive text/image-to-video world model for controllable long-horizon generation. It supports camera navigation, revisits to previously observed regions, and promptable events across photorealistic, game-style, and stylized domains. Our data engine combines camera-accurate Unre…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.3
2026-06-15 · Buqiang Xu, Zirui Xue, Dianmou Chen, Chenyang Fu, Chiyu Wu, Caiying Huang, Chen Jiang, Jizhan Fang, Xinle Deng, Yijun Chen, Yunzhi Yao, Xuehai Wang, Jin Shang, Gong Yu, Ningyu Zhang
General AI
As LLM agents are deployed in long-horizon sessions, context accumulation drives up inference costs. Existing approaches utilize text pruning or dynamic memory eviction to minimize token footprints; however, their unconstrained sequence mutations alter layouts, introducing prefix mismatches and cache invalidation. This…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.0
2026-06-12 · Mary Isabelle Wisell, Nicholas Jacobs, Aayush Manandhar, Salimeh Yasaei Sekeh
Research Track A · General AI
Multi-source transfer learning faces a fundamental scalability bottleneck: existing approaches require either loading all K source models into memory simultaneously during parameter fusion, requiring O(K) memory, or deploying all models at inference time, making production deployment infeasible. We propose GRASP (Gradi…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.3
2026-06-15 · Mariam Elbakry, Aliaa Sayed Sheha, Salma Hassan Tantawy, Aya Yassin, Concetto Spampinato, Karim Lekadir, Xiaomeng Li, Marawan Elbatel
General AI
Multiphasic contrast-enhanced CT (CECT) is widely used for abdominal lesion characterization, yet it carries inherent risks of contrast-induced nephropathy, escalates acquisition burden, and heavily contributes to radiologist workload. To address these challenges, we introduce a novel multi-center benchmark for multi-o…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.3
2026-06-15 · Mehmet Iscan
General AI
Frozen small code models (<=1.5B parameters, run locally without fine-tuning) suit offline and privacy-constrained use, but often emit plausible-but-wrong programs. A natural remedy is a post-hoc operator that selects, verifies, repairs, or re-processes the model's samples without retraining; in principled form it is P…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 11.2
2026-04-08 · Jiwan Chung, JiHyuk Byun, Vibhav Vineet, Seon Joo Kim
Research Track B · General AI
Web agents act through long interaction sequences, yet existing benchmarks evaluate only terminal success, discarding all process information and offering little guidance on improvement. In this work, we conduct a process-level analysis of web agents. We introduce WebStep, a benchmark of 1,800 task instances with contr…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 10.5
2026-06-11 · Hexuan Yu, Chaoyu Zhang, Heng Jin, Shanghao Shi, Ning Zhang, Y. Thomas Hou, Wenjing Lou
Research Track B · General AI
Modern LLM-powered autonomous agents increasingly rely on rich user interface (UI) state observations to achieve reliable action grounding in complex digital environments. However, many deployments transmit the full UI state to remote inference servers even when most elements are irrelevant to the current task, which c…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 10.5
2026-06-12 · Chenxin Li, Zhengyao Fang, Zhengyang Tang, Pengyuan Lyu, Xingran Zhou, Xin Lai, Fei Tang, Liang Wu, Yiduo Guo, Weinong Wang, Junyi Li, Yi Zhang, Yang Ding, Huawen Shen, Sunqi Fan, Shangpin Peng, Zheng Ruan, Anran Zhang, Benyou Wang, Chengquan Zhang, Han Hu
General AI
Phone agents are increasingly expected to complete real mobile workflows rather than merely predict the next screen action. However, much of the current mobile-agent literature still evaluates agents primarily as GUI controllers that observe a screen, emit taps and swipes, and are scored by target app state. Real phone…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 10.5
2026-06-14 · Nafiseh Nikeghbal, Amir Hossein Kargaran, Shaghayegh Kolli, Jana Diesner
General AI
Standard accuracy benchmarks are designed to test how closely large language models (LLMs) approach correct answers, but are not suitable for testing whether LLMs stick with a correct answer when that answer is challenged by a plausible counter-argument. We introduce a controlled protocol for evaluating answer stabilit…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 10.3
2026-06-14 · Claudio Fantinuoli
General AI
Machine interpreting (MI), the live, real-time branch of speech translation, has achieved remarkable progress on standard benchmarks, with some systems approaching human parity on textual fidelity. Yet the user experience remains far inferior to interpreter-mediated communication, revealing what we term the accur…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 10.3
2026-06-15 · Hamidah Oderinwale
General AI
Benchmark scores tell you what an agent got right; they do not tell you how it got there. In this work, we introduce methods for comparing agents procedurally in different contexts, where the model, tasks, and approaches vary. We compare ten agents and find that they are identifiable by their behavioral habits, which w…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 10.3
2026-06-15 · Yanan Long
General AI
Public AI evaluations are often read as terminal leaderboards, yet the underlying evidence is a selective time series shaped by reporting rules, benchmark revisions, and missingness. Repeated public archives for LiveBench and Open LLM Leaderboard v2 serve as the primary longitudinal record; LMArena provides a preferenc…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 10.3
2026-06-15 · Jisang Han, Seonghu Jeon, Jaewoo Jung, René Zurbrügg, Honggyu An, Tifanny Portela, Marco Hutter, Marc Pollefeys, Seungryong Kim, Sunghwan Hong
General AI
Generalist robot policies must follow user instructions while reasoning about how objects, cameras, and robot actions interact in the 3D physical world. Recent vision-language-action models (VLAs) and video world-action models (WAMs) inherit strong semantic or temporal priors from large-scale foundation models, but the…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 9.5
2026-06-15 · Yinhan He, Liam Collins, Bhuvesh Kumar, Jundong Li, Neil Shah, Donald Loveland
General AI
Large Language Models (LLMs) are increasingly adopted as backbones for Generative Recommendation (GR), promising access to pretrained world knowledge. Yet reliably invoking this knowledge for GR remains poorly understood. A key obstacle is that LLM-based GR typically represents items with Semantic IDs (SIDs), disruptin…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 9.5
2026-06-15 · Shuai Yang, Bingjie Gao, Ziwei Liu, Jiaqi Wang, Dahua Lin, Tong Wu
General AI
Consistent video generation under editing operations requires persistence: when edits modify scene appearance or layout, subsequent generations should remain coherent across time and viewpoints. However, existing memory designs struggle to maintain long-term consistency after such modifications, as stored contexts may …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 9.3
2026-06-13 · Yuhong Jiang, Zhishu Shen, Tong Yin, Qiushi Zheng, Yichao Jin, Fidan Mehmeti, Jiong Jin
General AI
The rapid growth of remote sensing data in Low Earth Orbit (LEO) satellite networks is increasingly constrained by limited downlink capacity to terrestrial networks. Satellite edge computing alleviates this pressure by enabling in-orbit data processing. However, it introduces a new challenge of spatio-temporal resource…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 9.3
2026-06-15 · Xiuwei Xu, Haowen Sun, Angyuan Ma, Yiwei Zhang, Zhenyu Wu, Xiaofeng Wang, Bingyao Yu, Zheng Zhu, Jie Zhou, Jiwen Lu
General AI
Spatial generalization is critical for imitation-learned manipulation policies, but achieving it typically requires scaling demonstrations across diverse object poses, robot configurations, and camera viewpoints. Data augmentation from a few source demonstrations offers a practical alternative to costly real-world coll…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 9.3
2026-06-15 · Naiyu Yin, Dennis Wei, Tian Gao, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, Yue Yu
General AI
A prominent research direction in mechanistic interpretability is learning sparse circuits over LLM components to reveal how they jointly produce model behavior. However, raw neurons are polysemantic, making learned circuits hard to interpret. Sparse autoencoder (SAE) features alleviate this, but their high dimensional…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 9.0
2026-06-13 · Ida Momennejad, Roberta Raileanu
Research Track A
Open-ended intelligence is the capacity to adapt to novel problems and environments that are substantially different from those in training. A mathematics of open-ended intelligence requires two pillars: first, a minimal set of representational primitives (e.g., states, actions) and algorithmic primitives (e.g., neares…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 9.0
2026-06-14 · Fendi Tsim, Alina Gutoreva
Research Track A
We introduce SCAN, a human-centric decision-making framework that helps learners allocate tasks effectively with Generative Artificial Intelligence (GenAI), based on Vygotsky's Zone of Proximal Development and metacognition. In SCAN, we systematize and formalize AI-human interaction by introducing a task-identif…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 8.5
2026-06-10 · Dingyu Yao, Junhao Zhou, Chenxu Yang, Chuanyu Qin, Haowen Hou, Zheming Liang, Congcong Wang, Yuhang Cao, Shenglong Ye, Shuai Xie, Shuhuan Gu, Haoyang Huang, Qingyi Si, Nan Duan, Jiaqi Wang
General AI
Many moments in the real world do not wait for a user to ask. A fire starts on a security monitor, an expression flickers across a video call, or a product a viewer wants flashes by in a livestream. Yet today's large models remain mostly turn-based by design: they answer only when addressed, and even video-call apps th…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.3
2026-06-12 · Huong Binh Vu
General AI
Rapid post-event landslide mapping is essential for disaster response but remains difficult to automate due to extreme class imbalance. This study evaluates whether Clay v1.5, a Geospatial Foundation Model (GFM), can improve pixel-level landslide segmentation on the Landslide4Sense (L4S) benchmark, which contains 3,799…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.3
2026-06-13 · Daksh Mittal, Tommaso Castellani, Thomson Yen, Naimeng Ye, Fangyu Wu, Minghui Chen, Tiffany Cai, Emmanouil Koukoumidis, William Zeng, Hongseok Namkoong
General AI
We envision continually learning agentic systems that become more useful over time: as they encounter sequences of related tasks, they should infer the hidden structure shared across those tasks and use it to improve future decisions. This cross-task experiential learning capability is pivotal in domains such as person…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.3
2026-06-14 · Xiongjun Guan, Jianjiang Feng, Jie Zhou
Research Track A · General AI
Small-area fingerprint sensing on mobile devices creates a fundamental mismatch between acquisition and recognition: each touch captures only a tiny, pose-varying local patch, while reliable biometric matching ultimately requires a stable and sufficiently complete fingerprint representation. Existing pipelines largely …
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.3
2026-06-15 · Violet Xiang, Amrith Setlur, Chase Blagden, Nick Haber, Aviral Kumar
General AI
Sparse reward reinforcement learning (RL) has become a standard tool for improving LLM reasoning, but its success depends critically on the coverage present in the base model. In practice, models are often primed for RL through mid-training on curated reasoning traces that teach useful primitive skills such as d…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 8.0
2026-06-15 · Hyungmin Kim, Minsoo Kim, Hongseok Kim, Jungwook Choi
Research Track A · General AI
Multi-turn LLM serving accumulates dialogue history whose Key-Value (KV) cache grows with every turn and every user, quickly exceeding the model weights themselves and making memory, not compute, the binding constraint on throughput. Non-uniform KV compression, which allocates heterogeneous budgets across attention…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 7.5
2026-04-14 · Sha Sajadieh, Loredana Fattorini, Raymond Perrault, Yolanda Gil, Vanessa Parli, Lapo Santarlasci, Juan Pava, Nestor Maslej, Russ Altman, Erik Brynjolfsson, Carla Brodley, Jack Clark, Virginia Dignum, Vipin Kumar, James Landay, Terah Lyons, James Manyika, Juan Carlos Niebles, Yoav Shoham, Elham Tabassi, Russell Wald, Toby Walsh, Dan Weld
General AI
Welcome to the ninth edition of the AI Index report. As AI continues to advance rapidly, the question becomes whether the systems built around it can keep up. Governance frameworks, evaluation methods, education systems, and the data infrastructure needed to track AI's impact are struggling to match the pace of the tec…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 7.5
2026-06-02 · Sanket Badhe, Deep Shah
General AI
Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. The standard alternative, fine-tuning smaller models, often sacrifices interpretability while introducing significant resource and operational overhead. To addres…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.5
2026-06-14 · Qing Su, Kaiyang Li, Yuan Zhuang, Fei Miao, Shihao Ji
Research Track A · General AI
While video segmentation has advanced rapidly on short clips and closed-set benchmarks, open-world video segmentation remains largely unexplored. The challenge is twofold: (1) existing methods are not designed to support object discovery and identity maintenance in long videos of dynamic ego-motion, and (2) existing ev…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.3
2026-06-11 · Elijah Cadenhead, Cristian McGee, Xin Li, El Houcine Bergou, Aritra Dutta
General AI
Low-rank adaptation (LoRA) and its variants provide a memory- and compute-efficient alternative to full fine-tuning of pre-trained models. However, questions remain about the comparative generalizability of these approaches and how the structural restrictions on low-rank updates preserve effective adaptation performanc…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.3
2026-06-15 · Amdjed Belaref, Samir Sadok, Zineb Noumir, Renaud Seguier
General AI
Affective computing increasingly relies on deep learning to represent emotions, yet latent spaces often remain opaque, high-dimensional black boxes. This paper investigates whether Transformers' embeddings recover the geometric regularities of Russell's circumplex model. We unify two complementary experiments testing t…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.3
2026-06-15 · Abbas Mammadov, Ozgur Kara, Kaan Oktay, Iskander Azangulov, Adil Kaan Akan, Hyungjin Chung, James Matthew Rehg, Yee Whye Teh
General AI
Diffusion and flow-based models learn powerful data priors by training a denoiser to reverse Gaussian corruption. To use this prior to solve a linear inverse problem, one needs to sample from the posterior, but the score that the prior provides is the unconditional score, not the posterior score. Existing methods eithe…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.3
2026-06-15 · Xueluan Gong, Chen Chen, Jinxin Liu, Qian Wang, Kwok-Yan Lam
General AI
Foundation models are reshaping robotics by enabling robots to interpret open-ended instructions, reason over multimodal contexts, and operate in complex, open-world environments. However, their integration also introduces security and privacy (S&P) risks that extend beyond the FMs themselves to embodied execution pipe…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 6.5
2026-06-15 · Sean Man, Ron Raphaeli, Matan Kleiner, Or Ronai
General AI
In this paper, we introduce SP^3, a novel Plug-and-Play algorithm that accelerates maximum a posteriori image restoration by replacing denoisers with Spherical Encoders (SE) as generative priors. SP^3 approximates the intractable proximal prior step by using the SE's tightly structured latent space as a robust projec…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 6.3
2026-06-15 · Kevin Yuanbo Wu, Tianxing Zhou, Isaac Tu, Billy Yan, Irmak Guzey, David Fouhey, Dandan Shan, Lerrel Pinto
General AI
Humans can grasp objects effortlessly, whereas multi-fingered robots are far from this level of generality. We argue that the most natural source of robot grasping data is humans, who pick up thousands of objects every day. We present HUG, a flow-matching model that generates diverse human grasps for any user-spec…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 6.3
2026-06-15 · Kareem Amin, Rudrajit Das, Alessandro Epasto, Adel Javanmard, Dennis Kraft, Mónica Ribero, Sergei Vassilvitskii
General AI
The rapid adoption of generative AI and Large Language Models (LLMs) has spurred interest in synthetic data as a privacy-preserving alternative to sensitive real-world datasets. However, generating high-utility synthetic data often carries the risk of memorizing and regurgitating private information from the training c…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 5.5
2026-06-12 · Igor Itkin
General AI
A content-moderation system can score well on every standard accuracy metric and still cause real harm, if its mistakes fall on the few users who connect otherwise separate communities. We show this in an agent-based model where N=240 learning agents on a community-structured network each post harmless, productive, or …
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 5.3
2026-06-15 · Junghun Oh, Sungyong Baik, Kyoung Mu Lee
General AI
Low-Rank Adaptation (LoRA) enables efficient adaptation of large pre-trained models to downstream tasks by parameterizing weight updates with low-rank matrices. In this paper, we investigate the limitations of the LoRA parameterization from a geometric perspective. Specifically, we show that when a full fine-tuning gra…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 5.3
2026-06-15 · Mingyang Li, Yurou Liu, Jieping Ye, Bing Su, Ji-Rong Wen, Zheng Wang
General AI
In this report, we present LOGOS (Language Of Generative Objects in Science), a scientific generative language model that unifies heterogeneous tasks across the natural sciences within a single autoregressive framework based on a shared scientific grammar. It encodes diverse scientific objects and their spatial interac…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 4.5
2026-06-12 · Xuan Wei, Longbin Ji, Guan Wang, Xiangrui Liu, Zhenyu Zhang, Shuohuan Wang, Yu Sun, Qingqi Hong
General AI
Long-form video generation requires recurring subjects to remain consistent across various shots, viewpoints, motions, and scene transitions. Existing temporal decomposition methods improve scalability by generating videos shot by shot. However, they mainly focus on optimizing plausible next-shot continuations without …
- Review: pending
- Role: unreviewed
- Read: later
huggingface
Score 4.5
2026-06-15 · Haotian Liu, Yihao Liu, Jingwei Ni, Siyuan Huang, Xinpeng Liu, Pengyu Cheng, Jiajun Song, Ruijin Ding, Junfeng Li, Zhechao Yu, Mengyu Zhou, Hongteng Xu, Xiaoxi Jiang, Guanjun Jiang
General AI
As LLMs advance, post-training reinforcement learning (RL) increasingly relies on multi-dimensional rewards to cultivate comprehensive capabilities. This shift demands new algorithms capable of optimizing diverse and potentially competing objectives simultaneously. To address this, existing methods such as Group reward…
- Review: pending
- Role: unreviewed
- Read: later
huggingface
Score 4.5
2026-06-15 · Yagmur Akarken, Orest Kupyn, Christian Rupprecht
General AI
Diffusion transformers have demonstrated remarkable generative capabilities, yet the rich perceptual representations computed across their denoising trajectory are discarded once the content is rendered. We present MMDiff, a framework that transforms a frozen diffusion transformer into a multi-modal generative system t…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 4.3
2026-06-11 · Xiaomeng Yang, Yanyu Li, Gordon Guocheng Qian, Ivan Skorokhodov, Viacheslav Ivanov, Avalon Vinella, Xuan Zhang, Yanzhi Wang, Sergey Tulyakov, Anil Kag
General AI
Personalizing Image-to-Video (I2V) diffusion models with specific visual effects is increasingly demanded for high-end video generation. Current practice requires training a separate Low-Rank Adaptation (LoRA) module for each effect, incurring substantial data curation and iterative optimization costs that hinder inter…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 4.3
2026-06-13 · Felix Stillger, Ben Hamscher, Lukas Hahn, Annika Mütze, Tobias Meisen, Kira Maag
General AI
Semantic segmentation is a fundamental component of visual perception in modern automotive systems, enabling pixel-level scene understanding. Near-Infrared imaging (NIR) offers stable detection under difficult illumination conditions, but the development of domain-specific semantic segmentation models remains challengi…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 4.3
2026-06-15 · Philippe Page, Robert Mitwicki, Michal Pietrus
General AI
The 2023 paper "Distributed Governance: a Principal-Agent Approach to Data Governance" (arXiv:2308.07280) introduced the autonomous principal as the locus of transactional sovereignty in digital ecosystems. This follow-up, Part 2, advances a structural argument for why that model is not a normative preference but a …
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 4.3
2026-06-15 · Nick Jiang, Isaac Kauvar, Jack Lindsey
General AI
We investigate whether language models internally track the value of their current trajectory, defined as the likelihood that their ongoing strategy will achieve their goals. Using synthetic, in-context reinforcement learning data, we construct a "value" axis for Qwen3-8B. We find that activations along this axis disti…
- Review: pending
- Role: unreviewed
- Read: later