Research
Research Interests
I am broadly interested in the theory, algorithms, and applications of machine learning, as well as in non-convex and convex optimization. Recently, I have been dedicated to using theory to design algorithms elegantly. Specifically, my recent research topics are:
Deep Learning Theory: optimization, generalization, implicit bias, and approximation.
Optimization: When training neural networks, why can optimization algorithms converge to global minima? [1][4]
Implicit Bias: When training neural networks, why do optimization algorithms converge to global minima with favorable generalization ability, even without any explicit regularization? Examples include flat-minima bias [2][5] and max-margin bias [4][6].
Algorithm Design: For machine learning problems, designing new optimization algorithms that converge to global minima with better generalization ability. [6]
Approximation: Exploring the expressive power of deep neural networks through the lens of approximation theory. [7]
Generalization: How can we measure the generalization ability of neural networks? [3]
Transformers and Large Language Models: theory and algorithms.
Non-convex and Convex Optimization: theory and algorithms.
CV and NLP: algorithms and applications. [9]
Recent Publications
[6] Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang, Zeping Min, Lei Wu
International Conference on Machine Learning (ICML 2024), 1-38.
[9] Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
Guanhua Huang, Yuchen Zhang, Zhe Li, Yongjian You, Mingze Wang, Zhouwang Yang
Annual Meeting of the Association for Computational Linguistics (ACL 2024).
[4] Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks
Mingze Wang, Chao Ma
Conference on Neural Information Processing Systems (NeurIPS 2023), Spotlight (top 3.5%), 1-94.
[5] A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang, Lei Wu
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (M3L), 1-30.
[1] Early Stage Convergence and Global Convergence of Training Mildly Parameterized Neural Networks
Mingze Wang, Chao Ma
Conference on Neural Information Processing Systems (NeurIPS 2022), 1-73.
[2] The alignment property of SGD noise and how it helps select flat minima: A stability analysis
Lei Wu, Mingze Wang, Weijie J. Su
Conference on Neural Information Processing Systems (NeurIPS 2022), 1-25.
Recent Preprints
[7] Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
Mingze Wang, Weinan E
arXiv preprint, 1-65, Feb 2024.
[8] The Implicit Bias of Gradient Noise: a Symmetry Perspective
Liu Ziyin, Mingze Wang, Lei Wu
arXiv preprint, 1-17, Feb 2024.
[3] Generalization Error Bounds for Deep Neural Networks Trained by SGD
Mingze Wang, Chao Ma
arXiv preprint, 1-32, June 2022.
Co-authors