arxiv
Score 21.2
2026-05-01 · Derong Xu, Shuochen Liu, Pengfei Luo, Pengyue Jia, Yingyi Zhang, Yi Wen, Yimin Deng, Wenlin Zhang, Enhong Chen, Xiangyu Zhao, Tong Xu
General AI
Large language model (LLM) agents require long-term user memory for consistent personalization, but limited context windows hinder tracking evolving preferences over long interactions. Existing memory systems mainly rely on static, hand-crafted update rules; although reinforcement learning (RL)-based agents learn memor…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 19.2
2026-05-01 · Siyuan Huang, Xiaoye Qu, Yafu Li, Tong Zhu, Zefeng He, Muxin Fu, Daizong Liu, Wei-Long Zheng, Yu Cheng
General AI
While autoregressive Large Vision-Language Models (LVLMs) demonstrate remarkable proficiency in multimodal tasks, they face a "Visual Signal Dilution" phenomenon, where the accumulation of textual history expands the attention partition function, causing visual attention to decay inversely with generated sequence lengt…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 17.2
2026-05-01 · Yawen Qin, Ke Qiu, Qin Zhang
General AI
Dance serves as both a cultural cornerstone and a medium for personal expression, yet the rapid growth of online dance content has made personalized discovery increasingly difficult. Text-based dance retrieval offers a natural interface for users to search with choreographic intent, but it remains underexplored because…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 17.2
2026-05-01 · Arunabh Srivastava, Mohammad A. Khojastepour, Srimat Chakradhar, Sennur Ulukus
General AI
Humans solve problems by executing targeted plans, yet large language models (LLMs) remain unreliable for structured workflow execution. We propose RunAgent, a multi-agent plan execution platform that interprets natural-language plans while enforcing stepwise execution through constraints and rubrics. RunAgent bridges …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 16.2
2026-05-01 · Ziyang Huang, Yi Cao, Ali K. Shargh, Jing Luo, Ruidong Mei, Mohd Zaki, Zhan Liu, Wyatt Bunstine, William Jurayj, Somdatta Goswami, Tyrel McQueen, Michael Shields, Jaafar El-Awady, Paulette Clancy, Benjamin Van Durme, Nicholas Andrews, William Walden, Daniel Khashabi
General AI
Large language models are increasingly deployed as autonomous coding agents and have achieved remarkably strong performance on software engineering benchmarks. However, it is unclear whether such success transfers to computational scientific workflows, where tasks require not only strong coding ability, but also the ab…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 16.2
2026-05-01 · Xihao Chen, Yangyang Guo, Roger Zimmermann
General AI
Key-Value (KV) cache has become a de facto component of modern Large Vision-Language Models (LVLMs) for inference. While it enhances decoding efficiency in Large Language Models (LLMs), its direct adoption in LVLMs introduces substantial GPU memory overhead due to the large number of vision tokens processed during the …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.2
2026-05-01 · Saeid Jamshidi, Foutse Khomh, Carol Fung, Kawser Wazed Nafi
General AI
The adoption of Internet of Things (IoT) systems at the network edge of smart architectures is increasing rapidly, intensifying the need for security mechanisms that are both adaptive and resource-efficient. In such environments, runtime defence mechanisms are no longer limited to detection alone but become a resource-…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 15.0
2026-04-27 · Qiliang Liang, Hansi Wang, Zhong Liang, Yang Liu
General AI
LLM agents increasingly rely on reusable skills, capability packages that combine instructions, control flow, constraints, and tool calls. In most current agent systems, however, skills are still represented by text-heavy artifacts, including SKILL.md-style documents and structured records whose machine-usable evidence…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 13.4
2026-05-01 · Zi-Bo Qin, Feng-Feng Wei, Tai-You Chen, Wei-Neng Chen
General AI
Distributed blackbox consensus optimization is a fundamental problem in multi-agent systems, where agents must improve a global objective using only local objective queries and limited neighbor communication. Existing methods largely rely on handcrafted update rules and static cooperation patterns, which often struggle…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 13.2
2026-05-01 · Yan Fang, Mengcheng Lan, Zilong Huang, Weixian Lei, Yunqing Zhao, Yujie Zhong, Yingchen Yu, Qi She, Yao Zhao, Yunchao Wei
General AI
In this paper, we present Generative Language-Image Pre-training (GenLIP), a minimalist generative pretraining framework for Vision Transformers (ViTs) designed for multimodal large language models (MLLMs). To better align vision encoders with the autoregressive nature of LLMs, GenLI…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 13.2
2026-05-01 · Yuan Li, Jun Hu, Jiaxin Jiang, Bryan Hooi, Bingsheng He
General AI
Multimodal data plays a critical role in web-based recommendation systems, where information from diverse modalities such as vision and text enhances representation learning. However, real-world multimodal datasets often suffer from modality incompleteness due to sensor failures, annotation scarcity, or privacy constra…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 13.2
2026-05-01 · Sailesh Panda, Pritam Kadasi, Abhishek Upperwal, Mayank Singh
General AI
Large language models (LLMs) often achieve strong performance on reasoning benchmarks, but final-answer accuracy alone does not show whether they faithfully execute the procedure specified in a prompt. We study this question through a controlled diagnostic benchmark for procedural execution, where models are given a st…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.9
2026-05-01 · Dongxin Guo, Jikun Wu, Siu Ming Yiu
Research Track B · General AI
AI agents execute tens to hundreds of chained LLM calls per task, yet GPU schedulers treat each call as independent, discarding gigabytes of intermediate state between steps and inflating end-to-end latency by 3-8x. We argue that this request-level abstraction is fundamentally mismatched to compound AI workloads, and p…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.2
2026-05-01 · Pavlin G. Poličar, Andraž Pevcin, Blaž Zupan
General AI
Generating diverse, readable statistical charts from tabular data remains challenging for LLMs, as many failures become apparent after rendering and are not detectable from data or code alone. Existing chart datasets also rarely provide fully aligned artifacts, such as executable code, dataset context, and question-ans…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 11.4
2026-05-01 · Indraneil Paul, Goran Glavaš, Iryna Gurevych
General AI
Reward models (RMs) have become an indispensable fixture of the language model (LM) post-training playbook, enabling policy alignment and test-time scaling. Research on the application of RMs in code generation, however, has been comparatively sparse, with existing work largely focusing on execution feedback. This choi…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.2
2026-05-01 · Yutao Hou, Yihan Jiang, Yuhan Xie, Jian Yang, Liwen Zhang, Hailiang Huang, Guanhua Chen, Yun Chen
General AI
Large language models (LLMs) are increasingly applied in financial scenarios. However, they may produce harmful outputs, including facilitating illegal activities or unethical behavior, posing serious compliance risks. To systematically evaluate LLM safety in finance, we propose FinSafetyBench, a bilingual (English-Chi…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.2
2026-05-01 · Theodore Papamarkou, Pierre Alquier, Matthias Bauer, Wray Buntine, Andrew Davison, Gintare Karolina Dziugaite, Maurizio Filippone, Andrew Y. K. Foong, Vincent Fortuin, Dimitris Fouskakis, Jes Frellsen, Eyke Hüllermeier, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Nikita Kotelevskii, Salem Lahlou, Yingzhen Li, Fang Liu, Clare Lyle, Thomas Möllenhoff, Konstantina Palla, Maxim Panov, Yusuf Sale, Kajetan Schweighofer, Artem Shelmanov, Siddharth Swaroop, Martin Trapp, Willem Waegeman, Andrew Gordon Wilson, Alexey Zaytsev
General AI
LLMs excel at predictive tasks and complex reasoning tasks, but many high-value deployments rely on decisions under uncertainty, for example, which tool to call, which expert to consult, or how many resources to invest. While the usefulness and feasibility of Bayesian approaches remain unclear for LLM inference, this p…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 11.0
2026-04-25 · Yihan Wang, Lei Li, Yao Lai, Jing Wang, Yan Lu
General AI
Analog circuit design relies heavily on reusing existing intellectual property (IP), yet searching across heterogeneous representations such as SPICE netlists, schematics, and functional descriptions remains challenging. Existing methods are largely limited to exact matching within a single modality, failing to capture…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 10.2
2026-05-01 · Jinpai Zhao, Nishant Panda, Yen Ting Lin, Eirik Valseth, Diane Oyen, Clint Dawson
General AI
We introduce HyCOP, a modular framework that learns parametric PDE solution operators by composing simple modules (advection, diffusion, learned closures, boundary handling) in a query-conditioned way. Rather than learning a monolithic map, HyCOP learns a policy over short programs - which module to apply and for how l…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 9.4
2026-05-01 · Minghui Chen, Chenxu Yang, Hengjie Zhu, Dayan Wu, Zheng Lin, Qingyi Si
General AI
Large Vision-Language Models (LVLMs) often suffer from hallucinations, generating descriptions that include visual details absent from the input image. Recent preference alignment methods typically rely on supervision distilled from stronger models such as GPT. However, this offline paradigm introduces a Supervision-Pe…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 9.4
2026-05-01 · Houyuan Chen, Hong Li, Xianghao Kong, Tianrui Zhu, Shaocong Xu, Weiqing Xiao, Yuwei Guo, Chongjie Ye, Lvmin Zhang, Hao Zhao, Anyi Rao
General AI
Recent progress has shown that video diffusion models (VDMs) can be repurposed for diverse multimodal graphics tasks. However, existing methods often train separate models for each problem setting, which fixes the input-output mapping and limits the modeling of correlations across modalities. We present UniVidX, a unif…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.2
2026-05-01 · Lin Che, Xi Wang, Marc Pollefeys, Konrad Schindler, Martin Raubal, Peter Kiefer
General AI
Urban perception describes how people subjectively evaluate urban environments, shaping how cities are experienced and understood. Existing computational approaches primarily model urban perception directly from street view images, but largely ignore the human perceptual process through which such judgments are formed.…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.2
2026-05-01 · Alfredo Madrid-García, Miguel Rujas
General AI
Background: Patient-facing medical chatbots based on retrieval-augmented generation (RAG) are increasingly promoted to deliver accessible, grounded health information. AI-assisted development lowers the barrier to building them, but they still demand rigorous security, privacy, and governance controls. Objective: To re…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 8.0
2026-04-26 · Zhen Ye, Xu Tan, Aoxiong Yin, Hongzhan Lin, Guangyan Zhang, Peiwen Sun, Yiming Li, Chi-Min Chan, Wei Ye, Shikun Zhang, Wei Xue
General AI
Joint audio-video generation models have shown that unified generation yields stronger cross-modal coherence than cascaded approaches. However, existing models couple modalities throughout denoising via pervasive attention, treating high-level semantics and low-level details in a fully entangled manner. This is subopti…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.2
2026-05-01 · Zihao Ding, Beining Wu, Jun Huang
General AI
Federated Multimodal Learning (FML) trains multimodal models across decentralized clients while keeping their image-text pairs private. However, joint embedding training entangles forgotten knowledge across both modalities and client gradient subspaces, hindering federated unlearning. Previous federated unlearning appr…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.2
2026-05-01 · Xinyuan Zhao, Yihang Wu, Ahmad Chaddad, Sarah A. Alkhodair, Reem Kateb
General AI
Gaze estimation methods commonly use facial appearances to predict the direction of a person's gaze. However, previous studies show three major challenges with convolutional neural network (CNN)-based, transformer-based, and contrastive language-image pre-training (CLIP)-based methods, including late fusion of image feat…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.2
2026-05-01 · Shradha Sharma, Swapnil Dhamal, Shweta Jain
General AI
We propose a new framework for meritocratic fairness in budgeted combinatorial multi-armed bandits with full-bandit feedback (BCMAB-FBF). Unlike semi-bandit feedback, the contribution of individual arms is not received in full-bandit feedback, making the setting significantly more challenging. To compute arm contributi…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 6.2
2026-05-01 · Bruce Rushing, Angela Danquah, Alireza Namazi, Arjun Dirghangi, Heman Shakeri
General AI
Effectively stratifying patient risk in chronic diseases like glaucoma is a major clinical challenge. Clinicians need tools to identify patients at high risk of progression from sparse and irregularly-sampled electronic health records (EHRs). We propose a novel deep kernel learning (DKL) architecture that leverages a G…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 6.2
2026-05-01 · Sizhe Tang, Zuyuan Zhang, Mahdi Imani, Tian Lan
General AI
Monte Carlo Tree Search (MCTS) scales poorly in cooperative multi-agent domains because expansion must consider an exponentially large set of joint actions, severely limiting exploration under realistic search budgets. We propose NonZero, which keeps multi-agent MCTS tractable by running surrogate-guided selection over…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 6.2
2026-05-01 · Guandong Li, Mengxia Ye
General AI
Image editing instructions are heterogeneous: a color swap, an object insertion, and a physical-action edit all demand different spatial coverage and different reasoning depth, yet existing reasoning-based editors apply a single fixed inference recipe to every instruction. We argue that adaptivity along both the spatia…
- Review: pending
- Role: unreviewed
- Read: later
huggingface
Score 5.4
2026-05-01 · Yi Wang, Xinchen Li, Pengwei Xie, Pu Yang, Buqing Nie, Yunuo Cai, Qinglin Zhang, Chendi Qu, Jeffrey Wu, Jianheng Song, Xinlin Ren, Jingshun Huang, Mingjie Pan, Siyuan Feng, Zhi Chen, Jianlan Luo
General AI
Generalist robot policies increasingly benefit from large-scale pretraining, but offline data alone is insufficient for robust real-world deployment. Deployed robots encounter distribution shifts, long-tail failures, task variations, and human correction opportunities that fixed demonstration datasets cannot fully capt…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 5.2
2026-05-01 · Zizhong Yan, Jingrong Li, Yi Zhang
General AI
Estimating network formation models with degree heterogeneity raises two problems in empirical networks. First, agents that send no links, receive no links, or link to all remaining agents can make the fixed-effects MLE fail to exist. Trimming these agents changes the estimation sample and induces selection bias. Secon…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 5.2
2026-05-01 · George Stoica, Sayak Paul, Matthew Wallingford, Vivek Ramanujan, Abhay Nori, Winson Han, Ali Farhadi, Ranjay Krishna, Judy Hoffman
General AI
Flow matching (FM) trains a time-dependent vector field that transports samples from a simple prior to a complex data distribution. However, for high-dimensional images, each training sample supervises only a single trajectory and intermediate point, yielding an extremely sparse and high-variance training signal. This …
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 5.2
2026-05-01 · Laurent Hébert-Dufresne, Antoine Allard, Jean-Gabriel Young, William H. W. Thompson, Guillaume St-Onge
General AI
Complex contagions describe systems where the probability or rate of contagious transmission is a nonlinear function of the exposure to contagious agents. These models were first studied theoretically but have since been used to capture effects such as nonconformism, social reinforcement or peer pressure in empirical d…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 5.2
2026-05-01 · Jingxi Pu, Tonghua Liu, Zhilin Guan, Siqiao Li, Yang Ming, Zheng Cong, Wei Zhang, Fangwei Li
General AI
With the development of deep learning, medical image processing has been widely used to assist clinical research. This paper focuses on the denoising problem of low-dose computed tomography using deep learning. Although low-dose computed tomography reduces radiation exposure to patients, it also introduces more noise, …
- Review: pending
- Role: unreviewed
- Read: later