Our Colloquium series offers a platform for research scholars, faculty, students, and industry experts to exchange ideas, fostering discussion and networking across mathematics, statistics, and data science.
Professor Yuesheng Xu from Old Dominion University will speak at this week's colloquium on "Multi-Grade Deep Learning."
Abstract: The remarkable success of deep learning is widely acknowledged, yet its training process remains largely a black box. Standard deep learning follows a single-grade approach, training a deep neural network (DNN) end-to-end by solving one large, nonconvex optimization problem. As network depth increases, this becomes computationally demanding, because all weight matrices and biases must be learned simultaneously.
Inspired by the human education system, we propose multi-grade deep learning (MGDL), which structures training into successive grades. Rather than tackling a single large-scale problem, MGDL decomposes learning into a sequence of smaller problems, each associated with a grade. At each grade, a shallow neural network learns to approximate the residual left by the previous grades, and its output serves as the input to the next grade. This hierarchical strategy mitigates nonconvexity, making training more efficient and stable.
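To make the per-grade step concrete, here is a minimal PyTorch sketch of fitting one grade's shallow network to the current residual in a simple regression setting. The architecture, optimizer, hyperparameters, and the names `shallow_net` and `train_grade` are illustrative assumptions, not the speaker's implementation.

```python
import torch
import torch.nn as nn

def shallow_net(in_dim: int, hidden: int, out_dim: int) -> nn.Module:
    """One-hidden-layer network: the small learner used at a single grade."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden),
        nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )

def train_grade(net: nn.Module, x: torch.Tensor, residual: torch.Tensor,
                epochs: int = 2000, lr: float = 1e-3) -> None:
    """Fit one grade's shallow network to the residual left by earlier grades."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(net(x), residual)  # each grade is a small optimization problem
        loss.backward()
        opt.step()
```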
The resulting model has a stair-shaped architecture, formed by superimposing the networks from all grades. MGDL naturally supports adaptive learning, adding new grades whenever the residual error exceeds a set tolerance. We provide theoretical guarantees for function approximation, proving that if the network learned at a given grade is nontrivial, the optimal approximation error strictly decreases from the previous grade. Numerical experiments further show that MGDL substantially outperforms the conventional single-grade approach in both accuracy and stability.
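Building on the per-grade sketch above, the adaptive behavior could look like the following outer loop, which keeps adding grades until the residual is small. The function name `multigrade_fit`, the RMS stopping rule, the tolerance `tol`, and feeding each grade's raw output forward are hypothetical choices for illustration (2-D input and target tensors assumed), not details from the talk.

```python
def multigrade_fit(x: torch.Tensor, y: torch.Tensor, hidden: int = 64,
                   tol: float = 1e-3, max_grades: int = 10) -> list[nn.Module]:
    """Add grades until the RMS residual falls below `tol` (or max_grades is hit)."""
    grades: list[nn.Module] = []
    feat = x                  # grade 1 sees the raw input
    residual = y.clone()
    while len(grades) < max_grades:
        net = shallow_net(feat.shape[1], hidden, y.shape[1])
        train_grade(net, feat, residual)
        with torch.no_grad():
            out = net(feat)
            residual = residual - out   # what the next grade must still explain
            feat = out                  # this grade's output feeds the next grade
        grades.append(net)
        if residual.pow(2).mean().sqrt() < tol:  # adaptive stop: residual small enough
            break
    return grades
```

In this sketch the final predictor is the superposition of all grades evaluated along the chain of outputs, mirroring the stair-shaped architecture described above.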
Speaker Bio: Dr. Yuesheng Xu received his B.S. and M.S. degrees from Sun Yat-sen University, Guangdong, China, in 1982 and 1985, respectively, and his Ph.D. from Old Dominion University, Norfolk, VA, USA, in 1989. He was a Humboldt Research Fellow at RWTH Aachen University, Germany, from 1996 to 1997. He has held several distinguished academic positions, including Eberly Chair Professor of Mathematics at West Virginia University (2001–2003), Professor of Mathematics at Syracuse University (2003–2013), and Guohua Chair Professor of Mathematics at Sun Yat-sen University (2009–2017). He is currently a Professor of Data Science and Mathematics at Old Dominion University. His research, supported by NSF, NASA, DoD, and NIH, spans numerical analysis, applied harmonic analysis, image and signal processing, medical imaging, and machine learning. Notable contributions include developing multiscale methods for solving Fredholm integral equations, establishing the universal kernel theorem, and introducing the concept of reproducing kernel Banach spaces, which provides a mathematical foundation for deep learning. He has served as editor or associate editor for numerous mathematical journals, including as Managing Editor of Advances in Computational Mathematics (Springer) from 1999 to 2012.