Wei Yao
Hi there, welcome! I am a fourth-year Ph.D. student at the Gaoling School of Artificial Intelligence, Renmin University of China, where I am honored to be advised by Prof. Yong Liu. I am currently a visiting student at the National University of Singapore, collaborating with Prof. Yunbei Xu. I expect to graduate in 2027 and am open to research-oriented opportunities in academia and industry, including postdoctoral and research scientist positions.
From October 2023 to March 2024, I was a research intern at Shanghai AI Laboratory, where I was fortunate to work under the guidance of Prof. Jing Shao. Prior to my Ph.D. studies, I earned my Bachelor of Engineering in Software Engineering from Huazhong University of Science and Technology in June 2022, advised by Prof. Kun He. During my undergraduate studies, I received the National Scholarship (2019), a recognition that motivated me to pursue further research in AI.
Research Interests
I believe that as AI systems continue to become more capable, a central challenge in the coming years will be how to supervise AI without ground truth. Motivated by this perspective, my research focuses on how LLMs can learn under imperfect supervision, including weak supervision, weak-to-strong generalization, and preference-based alignment. I aim to study these questions across different stages of model development, from pre-training to fine-tuning, and to provide insights from both theoretical analysis and empirical validation.
My recent work studies how strong models can learn reliably from weak, noisy, or limited supervision, as demonstrated by our work at ICML 2026 and ACL 2025 and in several recent preprints (such as arXiv:2605.05710). My previous research focused on trustworthy AI, including fairness, robustness, and interpretability, with publications in ICML 2025, TMLR 2024, ACL 2024, and CVPR 2023.
Preprints
(* indicates equal contribution, # indicates corresponding authors)
On the Blessing of Pre-training in Weak-to-Strong Generalization
Wei Yao, Wang Zhaoyang, Gengze Xu, Chen Qian, Dongrui Liu, Ziqiao Wang, Yong Liu#, Yunbei Xu#
arXiv preprint arXiv:2605.05710
On Weak-to-Strong Generalization and f-Divergence
Wei Yao*, Gengze Xu*, Huayi Tang, Wenkai Yang, Donglin Di, Ziqiao Wang, Yong Liu#
arXiv preprint arXiv:2506.03109
Selected Publications
(* indicates equal contribution, # indicates corresponding authors)
On the Emergence of Weak-to-Strong Generalization: A Bias-Variance Perspective
Gengze Xu*, Wei Yao*, Ziqiao Wang#, Yong Liu#
ICML 2026
Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL
Wei Yao*, Wenkai Yang*, Ziqiao Wang, Yankai Lin, Yong Liu#
ACL 2025 (Findings)
Understanding Model Ensemble in Transferable Adversarial Attack
Wei Yao*, Zeliang Zhang*, Huayi Tang, Yong Liu#
ICML 2025
Understanding Fairness Surrogate Functions in Algorithmic Fairness
Wei Yao*, Zhanke Zhou*, Zhicong Li, Bo Han, Yong Liu#
TMLR 2024 (Presented at ICLR 2025, Journal-to-Conference Track)
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models
Chen Qian*, Jie Zhang*, Wei Yao*, Dongrui Liu, Zhenfei Yin, Yu Qiao, Yong Liu#, Jing Shao#
ACL 2024 (Findings)
Fair Scratch Tickets: Finding Fair Sparse Networks without Weight Training
Pengwei Tang*, Wei Yao*, Zhicong Li, Yong Liu#
CVPR 2023
Service
Reviewer: ICML, NeurIPS, ICLR, ACL, CVPR.
