arxiv
Score 24.2
2026-04-30 · Jing Zhang, Wentao Jiang, Tao Huang, Zhiwei Wang, Jianxin Liu, Jian Chen, Ping Ye, Gang Wang, Zengmao Wang, Bo Du, Dacheng Tao
General AI
Ultrasound interpretation requires both precise lesion localization and holistic clinical reasoning, yet existing methods typically excel at only one of these capabilities: specialized detectors offer strong localization but limited reasoning, whereas multimodal large language models (MLLMs) provide flexible reasoning …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 21.9
2026-04-29 · Fazle Elahi Faisal, Qianhui Wu, Baolin Peng, Jianfeng Gao
Research Track B · General AI
Recent advances in multimodal large language models (LLMs) have revolutionized web agents that can automate complex tasks on websites. However, their accuracy remains limited by the scarcity of high-quality web trajectory training data. Existing automatic trajectory generation methods suffer from incomplete website cov…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 20.2
2026-04-30 · Eyon Jang, Damon Falck, Joschka Braun, Nathalie Kirch, Achu Menon, Perusha Moodley, Scott Emmons, Roland S. Zimmermann, David Lindner
General AI
Reinforcement learning (RL) has become essential to the post-training of large language models (LLMs) for reasoning, agentic capabilities and alignment. Successful RL relies on sufficient exploration of diverse actions by the model during training, which creates a potential failure mode: a model could strategically alt…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 19.4
2026-04-29 · Qisheng Hu, Quanyu Long, Wenya Wang
Research Track A · General AI
Memory-augmented LLM agents offer an appealing shortcut to continual learning: rather than updating model parameters, they accumulate experience in external memory, seemingly sidestepping the stability-plasticity dilemma of parametric learning. We show that this challenge does not disappear but resurfaces at the memory…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 19.4
2026-04-30 · Zihao Li, Jiaru Zou, Feihao Fang, Xuying Ning, Mengting Ai, Tianxin Wei, Sirui Chen, Xiyuan Yang, Jingrui He
General AI
Agentic large language model systems have demonstrated strong capabilities. However, their reliance on language as the universal interface fundamentally limits their applicability to many real-world problems, especially in scientific domains where domain-specific foundation models have been developed to address special…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 19.2
2026-04-30 · Bo Zhang, Tzu-Yen Ma, Zichen Tang, Junpeng Ding, Zirui Wang, Yizhuo Zhao, Peilin Gao, Zijie Xi, Zixin Ding, Haiyang Sun, Haocheng Gao, Yuan Liu, Liangjia Wang, Yiling Huang, Yujie Wang, Yuyue Zhang, Ronghui Xi, Yuanze Li, Jiacheng Liu, Zhongjun Yang, Haihong E
General AI
We introduce AEGIS, A holistic benchmark for Evaluating forensic analysis of AI-Generated academic ImageS. Compared to existing benchmarks, AEGIS features three key advances: (1) Domain-Specific Complexity: covering seven academic categories with 39 fine-grained subtypes, exposing intrinsic forensic difficulty, where e…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 19.2
2026-04-30 · Jialu Shen, Han Lyu, Suyang Zhong, Hanzheng Li, Haoyi Tao, Nan Wang, Changhong Chen, Xi Fang
General AI
Spectra are a prevalent yet highly information-dense form of scientific imagery, presenting substantial challenges to multimodal large language models (MLLMs) due to their unstructured and domain-specific characteristics. Here we introduce SpecVQA, a professional scientific-image benchmark for evaluating multimodal mod…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 18.2
2026-04-30 · Binyan Xu, Xilin Dai, Kehuan Zhang
Research Track A · General AI
Current agentic memory systems (vector stores, retrieval-augmented generation, scratchpads, and context-window management) do not implement memory: they implement lookup. We argue that treating lookup as memory is a category error with provable consequences for agent capability, long-term learning, and security. Retrie…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 18.2
2026-04-30 · Sudong Wang, Weiquan Huang, Xiaomin Yu, Zuhao Yang, Hehai Lin, Keming Wu, Chaojun Xiao, Chen Chen, Wenxuan Wang, Beier Zhu, Yunjian Zhang, Chengwei Qin
General AI
The standard post-training recipe for large multimodal models (LMMs) applies supervised fine-tuning (SFT) on curated demonstrations followed by reinforcement learning with verifiable rewards (RLVR). However, SFT introduces distributional drift that neither preserves the model's original capabilities nor faithfully matc…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 17.4
2026-04-30 · Qiyao Wang, Haoran Hu, Longze Chen, Hongbo Wang, Hamid Alinejad-Rokny, Yuan Lin, Min Yang
General AI
With the advancement of multimodal large language models (MLLMs) and coding agents, website development has shifted from manual programming to agent-based project-level code synthesis. Existing benchmarks rely on idealized assumptions, especially for well-structured, information-rich inputs and static execution set…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 17.2
2026-04-30 · Yanting Wang, Chenlong Yin, Ying Chen, Jinyuan Jia
General AI
Long-context large language models (LLMs), for example Gemini-3.1-Pro and Qwen-3.5, are widely used to empower many real-world applications, such as retrieval-augmented generation, autonomous agents, and AI assistants. However, security remains a major concern for their widespread deployment, with threats such as prompt…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 16.2
2026-04-30 · Neemias B da Silva, Rodrigo Minetto, Daniel Silver, Thiago H Silva
General AI
Large Language Models (LLMs) are increasingly used as proxies for human perception in urban analysis, yet it remains unclear whether persona prompting produces meaningful and reproducible behavioral diversity. We investigate whether distinct personas influence urban sentiment judgments generated by multimodal LLMs. Usi…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.9
2026-04-29 · Aditya A. Ramesh, Alex Lewandowski, Jürgen Schmidhuber
Research Track A · General AI
Continual learning agents with finite capacity must balance acquiring new knowledge with retaining the old. This requires controlled forgetting of knowledge that is no longer needed, freeing up capacity to learn. Weight decay, viewed as a mechanism for forgetting, can serve this role by gradually discarding information…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.2
2026-04-30 · Xin Zhou, Dingkang Liang, Xiwu Chen, Feiyang Tan, Dingyuan Zhang, Hengshuang Zhao, Xiang Bai
General AI
Driving world models serve as a pivotal technology for autonomous driving by simulating environmental dynamics. However, existing approaches predominantly focus on future scene generation, often overlooking comprehensive 3D scene understanding. Conversely, while Large Language Models (LLMs) demonstrate impressive reaso…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.2
2026-04-30 · Keming Wu, Zuhao Yang, Kaichen Zhang, Shizun Wang, Haowei Zhu, Sicong Leng, Zhongyu Yang, Qijie Wang, Sudong Wang, Ziting Wang, Zili Wang, Hui Zhang, Haonan Wang, Hang Zhou, Yifan Pu, Xingxuan Li, Fangneng Zhan, Bo Li, Lidong Bing, Yuxin Song, Ziwei Liu, Wenhu Chen, Jingdong Wang, Xinchao Wang, Xiaojuan Qi, Shijian Lu, Bin Wang
General AI
Recent visual generation models have made major progress in photorealism, typography, instruction following, and interactive editing, yet they still struggle with spatial reasoning, persistent state, long-horizon consistency, and causal understanding. We argue that the field should move beyond appearance synthesis towa…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 15.2
2026-04-30 · Ivan Bercovich
General AI
Terminal-agent benchmarks have become a primary signal for measuring the coding and system-administration capabilities of large language models. As the market for evaluation environments grows, so does the pressure to ship tasks quickly, often without thorough adversarial review of the verification logic. This paper is…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 13.2
2026-04-30 · Hao Chen, Jiaming Liu, Zhonghao Yan, Nuowei Han, Renrui Zhang, Chenyang Gu, Jialin Gao, Ziyu Guo, Siyuan Qian, Yinxi Wang, Peng Jia, Chi-Wing Fu, Shanghang Zhang, Pheng-Ann Heng
General AI
Vision-Language-Action (VLA) models have increasingly incorporated reasoning mechanisms for complex robotic manipulation. However, existing approaches share a critical limitation: whether employing explicit linguistic reasoning that suffers from latency and discretization, or utilizing more expressive continuous latent…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 13.2
2026-04-30 · Tao Ge, Baolin Peng, Hao Cheng, Jianfeng Gao
General AI
Realistic long-horizon productivity work is strongly conditioned on user-specific computer environments, where much of the work context is stored and organized through directory structures and content-rich artifacts. To scale synthetic data creation for such productivity scenarios, we introduce Synthetic Computers at S…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 12.9
2026-04-29 · Mingze Li, Yu Rong, Songyou Li, Lihong Wang, Jiacheng Cen, Liming Wu, Anyi Li, Zongzhao Li, Qiuliang Liu, Rui Jiao, Tian Bian, Pengju Wang, Hao Sun, Jianfeng Zhang, Ji-Rong Wen, Deli Zhao, Shifeng Jin, Tingyang Xu, Wenbing Huang
General AI
The discovery of novel materials is critical for global energy and quantum technology transitions. While deep learning has fundamentally reshaped this landscape, existing predictive or generative models typically operate in isolation, lacking the autonomous orchestration required to execute the full discovery process. …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.4
2026-04-29 · Karthik Charan Raghunathan, Christian Metzner, Laura Kriener, Melika Payvand
Research Track A · General AI
In a continual learning setting, we require a model to be plastic enough to learn a new task and stable enough to not disturb previously learned capabilities. We argue that this dilemma has an architectural root. A finite network has limited representational and plastic resources, yet the required capacity depends on p…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.4
2026-04-30 · Kathrin Korte, Joachim Winter Pedersen, Eleni Nisioti, Sebastian Risi
Research Track A
To preserve previously learned representations, continual learning systems must strike a balance between plasticity, the ability to acquire new knowledge, and stability. This stability-plasticity dilemma affects how representations can be reused across tasks: shared structure enables transfer when tasks are similar but…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.2
2026-04-30 · Gyoung S. Na, Chanyoung Park
Research Track A · General AI
Deriving governing equations from empirical observations is a longstanding challenge in science. Although artificial intelligence (AI) has demonstrated substantial capabilities in function approximation, the discovery of explainable and extrapolatable equations remains a fundamental limitation of modern AI, posing a ce…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 12.2
2026-04-30 · Sriram Narayanan, Ziyu Jiang, Srinivasa Narasimhan, Manmohan Chandraker
General AI
Modern video diffusion models excel at appearance synthesis but still struggle with physical consistency: objects drift, collisions lack realistic rebound, and material responses seldom match their underlying properties. We present PhyCo, a framework that introduces continuous, interpretable, and physically grounded co…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 11.4
2026-04-29 · Yibin Luo, Shiwei Gao, Huichuan Zheng, Youyou Lu, Jiwu Shu
General AI
Fine-tuning Large Language Models (LLMs) on consumer-grade GPUs is highly cost-effective, yet constrained by limited GPU memory and slow PCIe interconnects. Pipeline parallelism combined with CPU offloading mitigates these hardware bottlenecks by reducing communication overhead. However, existing PP schedules suffer fr…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 11.4
2026-04-30 · Hanzhong Guo, Jie Wu, Jie Liu, Yu Gao, Zilyu Ye, Linxiao Yuan, Xionghui Wang, Yizhou Yu, Weilin Huang
General AI
While Reinforcement Learning from Human Feedback (RLHF) has become a pivotal paradigm for text-to-image generation, its application to image editing remains largely unexplored. A key bottleneck is the lack of a robust general reward model for all editing tasks. Existing edit reward models usually give overall scores wi…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.2
2026-04-30 · Chenxin Li, Zhengyang Tang, Huangxin Lin, Yunlong Lin, Shijue Huang, Shengyuan Liu, Bowen Ye, Rang Li, Lei Li, Benyou Wang, Yixuan Yuan
General AI
LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces. Yet many agent benchmarks freeze a curated task set at release time and grade mainly the final response, making it difficult to evaluate agents against evolving workflow demand or verify whether …
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.2
2026-04-30 · Han Liu, Shanghao Shi, Yevgeniy Vorobeychik, Chongjie Zhang, Ning Zhang
General AI
Low-Rank Adaptation (LoRA), which leverages the insight that model updates typically reside in a low-dimensional space, has significantly improved the training efficiency of Large Language Models (LLMs) by updating neural network layers using low-rank matrices. Since the generation of adversarial examples is an optimiz…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.2
2026-04-30 · Jun Yeon Won, Xin Jin, Shiqing Ma, Zhiqiang Lin
General AI
Large Language Models (LLMs) have achieved remarkable progress in recent years, driving their adoption across a wide range of domains, including computer security. In reverse engineering, LLMs are increasingly applied to critical tasks such as function and variable name recovery and type inference. However, despite the…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 11.2
2026-04-30 · Zainab Rehan, Christian Medeiros Adriano, Sona Ghahremani, Holger Giese
General AI
Rule-based systems remain central in safety-critical domains but often struggle with scalability, brittleness, and goal misspecification. These limitations can lead to reward hacking and failures in formal verification, as AI systems tend to optimize for narrow objectives. In previous research, we developed a neuro-sym…
- Review: pending
- Role: unreviewed
- Read: now
arxiv
Score 10.2
2026-04-30 · Lincan Li, Zheng Chen, Yushun Dong
General AI
Electroencephalogram (EEG) signals are vital for automated seizure detection, but their inherent noise makes robust representation learning challenging. Existing graph construction methods, whether correlation-based or learning-based, often generate redundant or irrelevant edges due to the noisy nature of EEG data. Thi…
- Review: pending
- Role: unreviewed
- Read: now
huggingface
Score 9.4
2026-04-29 · Zhen Zhang, Changyi Yang, Zijie Xia, Zhen Yang, Chengzhi Liu, Zhaotiao Weng, Yepeng Liu, Haobo Chen, Jin Pan, Chenyang Zhao, Yuheng Bu, Alkesh Patel, Zhe Gan, Xin Eric Wang
General AI
Tokens serve as the fundamental unit of computation in modern autoregressive models, and generation length directly influences both inference cost and reasoning performance. Despite its importance, existing approaches lack fine-grained length modeling, operating primarily at the coarse-grained sequence level. We introd…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.2
2026-04-30 · Anietta Weckauff, Yuchen Zhang, Maksym Andriushchenko
General AI
Fine-tuning large language models (LLMs) on narrowly misaligned data generalizes to broadly misaligned behavior, a phenomenon termed emergent misalignment (EM). While prior work has found a correlation between harmful behavior and self-assessment in emergently misaligned models, it remains unclear how consistent this c…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.2
2026-04-30 · Yujun Wu, Dongxu Zhang, Xinchen Li, Jinhang Xu, Yiling Duan, Yumou Liu, Jiabao Pan, Xuanhe Zhou, Jingxuan Wei, Siyuan Li, Jintao Chen, Conghui He, Cheng Tan
General AI
Existing research infrastructure is fundamentally document-centric, providing citation links between papers but lacking explicit representations of methodological evolution. In particular, it does not capture the structured relationships that explain how and why research methods emerge, adapt, and build upon one anothe…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.2
2026-04-30 · Jeanne Monnier, Thomas George, Frédéric Guyard, Christèle Tarnec, Marios Kountouris
General AI
Fairness in machine learning remains challenging due to its ethical complexity, the absence of a universal definition, and the need for context-specific bias metrics. Existing methods still struggle with intersectionality, multiclass settings, and limited flexibility and generality. To address these gaps, we introduce …
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.2
2026-04-30 · Zeynep Okray, Nils Otto, Anna A. Cook, Clifford Talbot, Ashwin Miriyala, Martín Klappenbach, Ciara Stern, Kieran Desmond, Paola Vargas-Gutierrez, Scott Waddell
General AI
Associating multiple sensory cues with a single experience or object is a fundamental process that improves object recognition and memory performance. However, neural mechanisms that bind sensory features during learning and augment memory expression are unknown. Here we demonstrate multisensory appetitive and aversive…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 8.2
2026-04-30 · Samuel Kiegeland, Vésteinn Snæbjarnarson, Tim Vieira, Ryan Cotterell
General AI
Surprisal theory links human processing effort to the predictability of an upcoming linguistic unit, but empirical work often leaves the notion of a unit underspecified. In practice, experimental stimuli are segmented into linguistically motivated units (e.g., words), while pretrained language models assign probability…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.2
2026-04-30 · Himanshu Pandey, Ratikanta Behera
General AI
In recent years, physics-informed neural networks (PINNs) have gained significant attention for solving differential equations, although they suffer from two fundamental limitations, namely, spectral bias inherent in neural networks and loss imbalance arising from multiscale phenomena. This paper proposes an adaptive w…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 7.2
2026-04-30 · Vinayak Gupta, Chih-Hao Lin, Shenlong Wang, Anand Bhattad, Jia-Bin Huang
General AI
Reconstructing 3D scenes from sparse, unposed images remains challenging under real-world conditions with varying illumination and transient occlusions. Existing methods rely on scene-specific optimization using appearance embeddings or dynamic masks, which requires extensive per-scene training and fails under sparse v…
- Review: pending
- Role: unreviewed
- Read: soon
huggingface
Score 6.4
2026-04-29 · Naibin Gu, Chenxu Yang, Qingyi Si, Chuanyu Qin, Dingyu Yao, Peng Fu, Zheng Lin, Weiping Wang, Nan Duan, Jiaqi Wang
General AI
RLVR and OPD have become standard paradigms for post-training. We provide a unified analysis of these two paradigms in consolidating multiple expert capabilities into a single model, identifying capability loss in different ways: mixed RLVR suffers from inter-capability divergence cost, while the pipeline of first trai…
- Review: pending
- Role: unreviewed
- Read: later
huggingface
Score 6.4
2026-04-29 · Jiachen Liu, Jiaxin Pei, Jintao Huang, Chenglei Si, Ao Qu, Xiangru Tang, Runyu Lu, Lichang Chen, Xiaoyan Bai, Haizhong Zheng, Carl Chen, Zhiyang Chen, Haojie Ye, Yujuan Fu, Zexue He, Zijian Jin, Zhenyu Zhang, Shangquan Sun, Maestro Harmon, John Dianzhuo Wang, Jianqiao Zeng, Jiachen Sun, Mingyuan Wu, Baoyu Zhou, Chenyu You, Shijian Lu, Yiming Qiu, Fan Lai, Yuan Yuan, Yao Li, Junyuan Hong, Ruihao Zhu, Beidi Chen, Alex Pentland, Ang Chen, Mosharaf Chowdhury, Zechen Zhang
General AI
Scientific publication compresses a branching, iterative research process into a linear narrative, discarding the majority of what was discovered along the way. This compilation imposes two structural costs: a Storytelling Tax, where failed experiments, rejected hypotheses, and the branching exploration process are dis…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 6.2
2026-04-30 · Maykon Nunes, Emanuel Coutinho, Carla Bezerra, Ivan Machado
General AI
Angular is one of the most widely adopted frameworks for developing large-scale, dynamic web applications. As projects increase in scope and complexity, developers face growing challenges in managing architecture and maintaining clean, modular code. These challenges often lead to design flaws, commonly referred to as c…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 6.2
2026-04-30 · Tianyuan Wu, Chaokun Chang, Lunxi Cao, Wei Gao, Wei Wang
General AI
Autonomous agents act through sandboxed containers and microVMs whose state spans filesystems, processes, and runtime artifacts. Checkpoint and restore (C/R) of this state is needed for fault tolerance, spot execution, RL rollout branching, and safe rollback, yet existing approaches fall into two extremes: application-l…
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 6.2
2026-04-30 · Kehong Gong, Zhengyu Wen, Dao Thien Phong, Mingxi Xu, Weixia He, Qi Wang, Ning Zhang, Zhengyu Li, Guanli Hou, Dongze Lian, Xiaoyu He, Mingyuan Zhang, Hanwang Zhang
General AI
Recent methods for arbitrary-skeleton motion capture from monocular video follow a factorized pipeline, where a Video-to-Pose network predicts joint positions and an analytical inverse-kinematics (IK) stage recovers joint rotations. While effective, this design is inherently limited, since joint positions do not fully …
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 6.2
2026-04-30 · Junyoung Lee, Sookwan Han, Jeonghwan Kim, Inhee Lee, Mingi Choi, Jisoo Kim, Wonjung Woo, Hanbyul Joo
General AI
Human-robot collaboration has been studied primarily in dyadic or sequential settings. However, real homes require multiadic collaboration, where multiple humans and robots share a workspace, acting concurrently on interleaved subtasks with tight spatial and temporal coupling. This regime remains underexplored because …
- Review: pending
- Role: unreviewed
- Read: soon
arxiv
Score 5.2
2026-04-30 · Lautaro Giordano, Sebastian Gonçalves, José Roberto Iglesias, María Fabiana Laguna
General AI
We present a minimal agent-based model of interacting agents characterized by their wealth to study taxation and inequality in a non-conservative economy. Wealth evolves through an extremal stochastic replacement process in which the poorest agent has its wealth replaced by a new random value, financed through a collec…
- Review: pending
- Role: unreviewed
- Read: later
arxiv
Score 5.2
2026-04-30 · Vishnuprasadh Kumaravelu, Sunil Gupta, P. K. Srijith
General AI
Exponential growth in the scale of modern foundation models has led to the widespread adoption of Low-Rank Adaptation (LoRA) as a parameter-efficient fine-tuning technique. However, standard LoRA implementations disregard the varying intrinsic dimensionality of model layers and enforce a uniform rank, leading to parame…
- Review: pending
- Role: unreviewed
- Read: later