Speaker
Ying Wen
Time
2025.03.19 16:00-17:30
Abstract
The enhancement of Large Language Models (LLMs) relies on continuous access to high-quality data and feedback signals. While pre-training has already consumed vast amounts of high-quality data, sustained model improvement depends critically on the ongoing supply of new, high-quality data. Because human data production is costly and struggles to keep pace with demand, methods that let LLMs autonomously generate and curate data through self-iteration have become imperative. This talk examines the data reproduction process in LLMs, which comprises three key stages: generation, evaluation, and training. The core challenges lie in designing efficient algorithms and feedback-utilization mechanisms that enable effective data curation and assessment. By applying feedback signals through reinforcement learning, the approach ensures that only the most valuable data is used for iterative model training, improving performance on complex reasoning and decision-making tasks.
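To make the generation-evaluation-training loop above concrete, here is a minimal, runnable Python sketch of one common style of feedback-driven data curation (rejection-sampling-style filtering, where a scalar reward signal keeps only the highest-value samples for the next round of training). All function names and the generation/scoring/training stubs are hypothetical stand-ins for illustration, not the speaker's actual method or code.

```python
# Illustrative sketch of one generate -> evaluate -> train iteration for
# self-curated data. Every name here (generate, reward_model, fine_tune)
# is a hypothetical placeholder, not an API from the talk.
import random

def generate(model, prompt, n=8):
    """Sample n candidate responses from the current model (stubbed)."""
    return [f"{prompt} -> candidate {i} (model v{model['version']})" for i in range(n)]

def reward_model(response):
    """Score a response; a real system would use a learned reward model
    or verifier, not random noise."""
    return random.random()  # placeholder feedback signal

def fine_tune(model, curated):
    """Update the model on the curated data (stubbed as a version bump)."""
    return {"version": model["version"] + 1, "data_seen": len(curated)}

model = {"version": 0}
prompts = ["Solve: 17 * 24", "Plan a 3-step proof"]

for it in range(3):  # the self-iteration loop
    # Stage 1: generation -- the model produces candidate data.
    pool = [(p, r) for p in prompts for r in generate(model, p)]
    # Stage 2: evaluation -- feedback signals score each candidate.
    scored = sorted(((p, r, reward_model(r)) for p, r in pool),
                    key=lambda x: x[2], reverse=True)
    # Curation: keep only the top-scoring quarter of the samples.
    curated = scored[: len(scored) // 4]
    # Stage 3: training -- iterate the model on the curated data.
    model = fine_tune(model, curated)
    print(f"iteration {it}: kept {len(curated)} of {len(scored)} samples")
```

In a real system the filtering threshold, reward model quality, and the mix of retained data are exactly the "efficient algorithms and feedback-utilization mechanisms" the abstract identifies as the core challenges.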
Bio
Ying Wen is a tenure-track associate professor at the School of Artificial Intelligence, Shanghai Jiao Tong University. His research focuses on reinforcement learning, multi-agent systems, and large models for decision-making. He received his Ph.D. (2020) and Master of Research (2016) degrees from the Department of Computer Science at University College London (UCL). He has been selected for the Shanghai High-Level Overseas Talent Program and serves as principal investigator on projects including a National Key R&D Program project and a Shanghai Young Scientific and Technological Talents Sailing Program grant. He has published over 40 papers at top-tier international conferences such as ICML, NeurIPS, ICLR, IJCAI, and AAMAS, and his honors include the Best System Paper Award at CoRL 2020 and the Best Paper Award in the Blue Sky Ideas Track at AAMAS 2021.