Wei Yao

Hi there, welcome! I am currently a fourth-year Ph.D. student at the Gaoling School of Artificial Intelligence, Renmin University of China. I am truly honored to be advised by Prof. Yong Liu. Starting late this year, I will be a visiting student at the National University of Singapore, collaborating with Prof. Yunbei Xu.

From October 2023 to March 2024, I was a research intern at Shanghai AI Laboratory, where I was fortunate to work under the guidance of Prof. Jing Shao. Prior to my Ph.D. studies, I earned my Bachelor of Engineering in Software Engineering from Huazhong University of Science and Technology in June 2022, where I was fortunate to be advised by Prof. Kun He. During my undergraduate studies, I was honored to receive the National Scholarship (2019), a recognition that motivated me to pursue further research in AI.

Research Interests

My previous research focused on trustworthy AI, including fairness, robustness and interpretability: ICML25, TMLR24, ACL24, CVPR23.

I’m currently focused on superalignment, particularly the area of weak-to-strong generalization, as demonstrated by our work in ACL25 and several upcoming preprints (available on Google Scholar).

Selected Publications

(* indicates equal contribution, # indicates corresponding authors)

Service

Reviewer: NeurIPS, ICLR, AISTATS, TMLR, AAAI, ACL, EMNLP, CVPR

Selected Publications

Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL

Understanding Model Ensemble in Transferable Adversarial Attack

Understanding Fairness Surrogate Functions in Algorithmic Fairness

Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models

Fair Scratch Tickets: Finding Fair Sparse Networks without Weight Training
