Mingze Wang
About me
I am a fourth-year Ph.D. candidate in Computational Mathematics at the School of Mathematical Sciences, Peking University (2021-Present). I am very fortunate to be advised by Prof. Weinan E. Prior to that, I received my B.S. degree in Pure and Applied Mathematics (ranked 1/111 over the first three years of my undergraduate study) from the School of Mathematical Sciences, Zhejiang University, Hangzhou, China, in 2021.
Please feel free to drop me an email if you are interested in collaborating with me.
News
[2024.09] I won the 2024 China National Scholarship (top 2%)!
[2024.09] Three papers accepted to NeurIPS 2024!
[2024.05] One paper accepted to ICML 2024! One paper accepted to ACL 2024!
[2023.11] I won the 2023 BICMR Mathematical Award for Graduate Students (top 1%)!
[2023.09] One paper accepted to NeurIPS 2023 as a Spotlight (top 3.5%)!
[2022.11] I passed the Ph.D. qualifying exam!
[2022.10] I won the 2022 PKU Academic Innovation Award (top 1%)!
[2022.09] Two papers accepted to NeurIPS 2022!
Research Interests
I am broadly interested in the theory, algorithms, and applications of machine learning. I am also interested in non-convex and convex optimization.
Recently, I have been dedicated to using theory to guide elegant algorithm design.
Specifically, my recent research topics are:
Deep Learning Theory: optimization, generalization, implicit bias, and expressivity. [1][2][3][4][5][6][8][9][10][11][12]
Transformers and Large Language Models: theory and algorithms. [8][10][12]
Non-convex and Convex Optimization: theory and algorithms. [2][4][6][10][11][12]
Computer Vision and NLP: algorithms and applications. [7]
Recent Publications
[10] Improving Generalization and Convergence by Enhancing Implicit Regularization
Mingze Wang, Jinbo Wang, Haotian He, Zilin Wang, Guanhua Huang, Feiyu Xiong, Zhiyu Li, Weinan E, Lei Wu
2024 Conference on Neural Information Processing Systems (NeurIPS 2024), 1-35.
[9] Loss Symmetry and Noise Equilibrium of Stochastic Gradient Descent
Liu Ziyin, Mingze Wang, Hongchao Li, Lei Wu
2024 Conference on Neural Information Processing Systems (NeurIPS 2024), 1-26.
[8] Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
Mingze Wang, Weinan E
2024 Conference on Neural Information Processing Systems (NeurIPS 2024), 1-70.
[7] Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
Guanhua Huang, Yuchen Zhang, Zhe Li, Yongjian You, Mingze Wang, Zhouwang Yang
2024 Annual Meeting of the Association for Computational Linguistics (ACL 2024), 1-20.
[6] Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang, Zeping Min, Lei Wu
2024 International Conference on Machine Learning (ICML 2024), 1-38.
[5] A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang, Lei Wu
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (NeurIPS 2023 - M3L), 1-30.
[4] Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks
Mingze Wang, Chao Ma
2023 Conference on Neural Information Processing Systems (NeurIPS 2023) (Spotlight, top 3.5%), 1-94.
[3] The alignment property of SGD noise and how it helps select flat minima: A stability analysis
Lei Wu, Mingze Wang, Weijie J. Su
2022 Conference on Neural Information Processing Systems (NeurIPS 2022), 1-25.
[2] Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
Mingze Wang, Chao Ma
2022 Conference on Neural Information Processing Systems (NeurIPS 2022), 1-73.
Recent Preprints
* indicates equal contribution.
[12] How Transformers Implement Induction Heads: Approximation and Optimization Analysis
Mingze Wang*, Ruoxi Yu*, Weinan E, Lei Wu
arXiv preprint, 1-39, October 2024.
[11] Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Zhanpeng Zhou*, Mingze Wang*, Yuchen Mao, Bingrui Li, Junchi Yan
arXiv preprint, 1-24, October 2024.
[1] Generalization Error Bounds for Deep Neural Networks Trained by SGD
Mingze Wang, Chao Ma
arXiv preprint, 1-32, June 2022.
Selected Awards and Honours