Zhang Zhongwang publishes a paper in TPAMI: uncovering how complexity control enhances transformer generalization

2026-03-18

Zhang Zhongwang, a Zhiyuan graduate of the Mathematics division (class of 2021), has published a research paper as co-first author in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), a leading journal in artificial intelligence.

The paper, "Complexity Control Facilitates Reasoning-Based Compositional Generalization in Transformers," was completed by the Deep Learning Theory Team at SJTU's Institute of Natural Sciences and School of Mathematical Sciences. The study reveals key mechanisms behind how Transformer models achieve genuine compositional generalization, that is, the ability to combine learned concepts in novel, complex scenarios.

Through carefully designed experiments, the team found that initialization scale determines whether a model learns to generalize or simply memorizes. Small initialization promotes low-complexity "generalization solutions" in which the network learns the underlying functional structure, while conventional initialization leads to high-complexity memorization. The mechanism lies in the "condensation phenomenon": under small initialization, the effective number of neurons becomes far smaller than the actual number, which naturally enforces simplicity while maintaining performance. The findings extend to common regularization methods, establishing a unified theoretical framework for complexity control that is validated across multiple tasks.
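To make the two ideas in the paragraph above concrete, here is a minimal sketch in Python. It is not the paper's code. The function names, the parameterization of initialization scale as std = fan_in^(-gamma), and the cosine-similarity threshold used to count "effective" neurons are all assumptions made for illustration. Nothing is trained here: the condensed layer is built by hand only to show what "far fewer effective neurons than actual neurons" would look like in a weight matrix.

```python
import numpy as np


def init_weights(fan_in, fan_out, gamma, rng):
    """Gaussian init with std = fan_in ** (-gamma).

    Assumed parameterization (not taken from the article): gamma = 0.5 gives
    the familiar 1/sqrt(fan_in) scale, and a larger gamma gives a smaller init.
    """
    std = float(fan_in) ** (-gamma)
    return rng.normal(0.0, std, size=(fan_out, fan_in))


def effective_neuron_count(W, cos_threshold=0.99, eps=1e-12):
    """Count distinct weight directions among the rows (neurons) of W.

    Hypothetical proxy for condensation: a row is merged into an existing
    representative when the absolute cosine similarity reaches the threshold.
    Rows with (near-)zero norm are ignored.
    """
    norms = np.linalg.norm(W, axis=1)
    reps = []
    for w, n in zip(W, norms):
        if n < eps:
            continue
        u = w / n
        if not any(abs(float(u @ r)) >= cos_threshold for r in reps):
            reps.append(u)
    return len(reps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Conventional versus small initialization for a 64 -> 32 layer.
    W_default = init_weights(64, 32, gamma=0.5, rng=rng)
    W_small = init_weights(64, 32, gamma=1.0, rng=rng)
    print("std (default):", W_default.std())
    print("std (small):  ", W_small.std())

    # Hand-built "condensed" layer: 32 neurons aligned with only 2 directions.
    directions = np.eye(64)[:2]
    scales = rng.uniform(0.5, 1.5, size=32) * rng.choice([-1.0, 1.0], size=32)
    W_condensed = scales[:, None] * directions[np.arange(32) % 2]

    print("effective neurons (random):   ", effective_neuron_count(W_default))
    print("effective neurons (condensed):", effective_neuron_count(W_condensed))
```

In this toy setting the random layer keeps essentially one direction per neuron, while the hand-built layer collapses to two. Whether a trained Transformer shows the same pattern under small initialization is the empirical claim of the paper and is not demonstrated by this sketch.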

The research has deep Zhiyuan roots: both corresponding authors, Professor Xu Zhiqin and Professor Zhang Yaoyun of SJTU's Institute of Natural Sciences, are inaugural graduates of the college (class of 2012). Zhang Zhongwang began his research journey as an undergraduate under Professor Xu's mentorship, and Xu's interdisciplinary background in physics and computational biology shaped Zhang's approach to AI interpretability. Zhang credits Zhiyuan College's integration of cutting-edge research into undergraduate education for cultivating his scientific perspective.

Zhang's research excellence earned him the National Scholarship for Doctoral Students. The study received support from the National Key R&D Program and the National Natural Science Foundation of China, as well as computational resources from SJTU's high-performance computing centers.