Hi there! I am an undergraduate student at Shanghai Jiao Tong University, majoring in Artificial Intelligence. My research interests include, but are not limited to, multi-modal learning.

🔥 News

2026.01: 🎉🎉 One paper was accepted to ICLR 2026!
2025.01: 🎉🎉 One paper was accepted to ICLR 2025!

📝 Publications

ICLR 2026

Beyond Static Vision: Scene Dynamic Field Unlocks Intuitive Physics Understanding in Multi-modal Large Language Models

Nanxi Li, Xiang Wang, Yuanjie Chen, Haode Zhang, Hong Li, Yong-Lu Li

[Project website] [Paper]

While Multimodal Large Language Models (MLLMs) excel at general understanding, they struggle with high-level physics reasoning, particularly intuitive physics and continuum dynamics. To address this, we introduce two benchmark tasks: Next Frame Selection (NFS) and Temporal Coherence Verification (TCV). Experiments show that state-of-the-art models perform poorly on both tasks. We propose Scene Dynamic Field (SDF), a multi-task fine-tuning framework that integrates physics simulators. SDF significantly boosts performance, with major gains on fluid tasks, and generalizes well to unseen domains.

ICLR 2025

The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs

Hong Li, Nanxi Li, Yuanjie Chen, Jianbin Zhu, Qinlu Guo, Cewu Lu, Yong-Lu Li

[Project website] [Paper]

In this paper, we first devise a standard association benchmark based on adjective and verb semantic concepts. Instead of relying on costly data annotation and organization, we propose a convenient annotation-free reconstruction method that transforms general datasets for our association tasks. Furthermore, we comprehensively investigate the association ability of MLLMs and their potential to improve it.

📖 Education

2022.09 - present, Shanghai Jiao Tong University