Mingze Wang
About me
I am a third-year Ph.D. student in Computational Mathematics at the School of Mathematical Sciences, Peking University (2021-Present). I am very fortunate to be advised by Prof. Weinan E. Prior to that, I received my B.S. degree in Pure and Applied Mathematics (ranked 1/111 over the first three years of my undergraduate study) from the School of Mathematical Sciences, Zhejiang University, Hangzhou, China, in 2021.
Please feel free to drop me an email if you are interested in collaborating with me.
I am now an Algorithm Intern at a start-up AGI company in Hangzhou (2023.12-now).
News
[2024.05] One paper accepted to ICML 2024!
[2023.12] I am now an Algorithm Intern at a start-up AGI company in Hangzhou, China.
[2023.11] I won the 2023 BICMR Mathematical Award for Graduate Students (Top 1%, 110,000 RMB)!
[2023.09] One paper accepted to NeurIPS 2023 as a Spotlight (Top 3.5%)!
[2022.11] I have passed the Ph.D. qualifying exam!
[2022.10] I won the 2022 PKU Academic Innovation Award (Top 1%)!
[2022.09] Two papers accepted to NeurIPS 2022!
Research Interests
I am broadly interested in the theory, algorithms, and applications of machine learning. I am also interested in non-convex and convex optimization.
Recently, I have been dedicated to using theory to design algorithms elegantly.
Specifically, my recent research topics are:
Deep Learning Theory: optimization, generalization, implicit bias, and approximation. [1][2][3][4][5][6][7][8]
Transformers and Large Language Models: theory and algorithms. [7] (in preparation)
Non-convex and Convex Optimization: theory and algorithms. [1][4][6]
CV and NLP: algorithms and applications. (in preparation)
Recent Publications
[6] Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang, Zeping Min, Lei Wu
2024 International Conference on Machine Learning (ICML 2024), 1-38.
[4] Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks
Mingze Wang, Chao Ma
2023 Conference on Neural Information Processing Systems (NeurIPS 2023, Spotlight, Top 3.5%), 1-94.
[5] A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang, Lei Wu
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (NeurIPS 2023 - M3L), 1-30.
[1] Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
Mingze Wang, Chao Ma
2022 Conference on Neural Information Processing Systems (NeurIPS 2022), 1-73.
[2] The alignment property of SGD noise and how it helps select flat minima: A stability analysis
Lei Wu, Mingze Wang, Weijie J. Su
2022 Conference on Neural Information Processing Systems (NeurIPS 2022), 1-25.
Recent Preprints
[7] Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
Mingze Wang, Weinan E
arXiv preprint, 1-65, Feb 2024.
[8] The Implicit Bias of Gradient Noise: a Symmetry Perspective
Liu Ziyin, Mingze Wang, Lei Wu
arXiv preprint, 1-17, Feb 2024.
[3] Generalization Error Bounds for Deep Neural Networks Trained by SGD
Mingze Wang, Chao Ma
arXiv preprint, 1-32, June 2022.