Automated machine learning (AutoML), the process of automating the complete pipeline from the raw dataset to a trained machine learning model, has become a growing trend. It not only relieves data scientists of routine work but also allows non-experts to complete modeling tasks without solid knowledge of statistical inference and machine learning.
One limitation of AutoML frameworks is that data quality can differ significantly from batch to batch. Consequently, the quality of the fitted model can be very poor for some batches because of distribution shift in some numerical predictors. In this dissertation, we develop an intelligent binning method to resolve this problem. In addition, various regularized regression classifiers (RRCs), including Ridge, Lasso, and Elastic Net regression, are tested to further enhance model performance after binning.
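The abstract does not describe the intelligent binning algorithm itself, so the following is only a minimal sketch of the general idea, using plain quantile binning as a stand-in. The function names `fit_bin_edges` and `assign_bins` and the toy batches are hypothetical. Cut points are learned from a reference batch, and values from a later batch that fall far outside the reference range are absorbed into the end bins instead of reaching the model as extreme numbers.

```python
from bisect import bisect_right
from statistics import quantiles


def fit_bin_edges(values, n_bins=4):
    """Learn the interior quantile cut points from a reference batch."""
    # statistics.quantiles returns n_bins - 1 cut points.
    return quantiles(values, n=n_bins)


def assign_bins(values, edges):
    """Map each value to a bin index in 0..len(edges)."""
    # Values below the first edge go to bin 0; values above the last
    # edge go to the top bin, however large they are.
    return [bisect_right(edges, v) for v in values]


# Hypothetical reference batch and a later batch containing an outlier.
reference_batch = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
later_batch = [0.5, 3.0, 1000.0]

edges = fit_bin_edges(reference_batch)
reference_bins = assign_bins(reference_batch, edges)
later_bins = assign_bins(later_batch, edges)
```

After binning, a downstream classifier sees only bin indices (or indicator columns derived from them), so a shifted or badly scaled batch can change which bins are populated but cannot produce inputs outside the range seen during training.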
We focus on the binary classification problem and have developed an AutoML framework in Python that handles the entire data preparation process, including data partitioning and intelligent binning. The system has been tested extensively, and the results show that (1) all models perform better with intelligent binning on both balanced and imbalanced binary classification problems; (2) regression-based methods are more sensitive to intelligent binning than tree-based methods, and with intelligent binning RRCs can outperform the tree-based methods; (3) weighted RRC obtains the best results compared with the other methods; and (4) our framework is an effective and reliable tool for conducting AutoML.
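As a rough illustration of binning followed by regularized regression classifiers, the sketch below chains scikit-learn's `KBinsDiscretizer` with `LogisticRegression` under L2 (Ridge), L1 (Lasso), and Elastic Net penalties. It is not the dissertation's framework. The synthetic imbalanced dataset, the choice of 10 quantile bins, the hyperparameters, and the use of `class_weight="balanced"` as one possible reading of "weighted RRC" are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

# Synthetic imbalanced binary problem (assumption: about 10% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0,
)

# Penalty settings for the three regularized classifiers.
penalties = {
    "ridge": dict(penalty="l2", solver="saga"),
    "lasso": dict(penalty="l1", solver="saga"),
    "elastic_net": dict(penalty="elasticnet", solver="saga", l1_ratio=0.5),
}

auc = {}
for name, kwargs in penalties.items():
    model = make_pipeline(
        # Quantile binning stands in for the intelligent binning step.
        KBinsDiscretizer(n_bins=10, encode="onehot", strategy="quantile"),
        # class_weight="balanced" reweights classes by inverse frequency.
        LogisticRegression(class_weight="balanced", C=1.0,
                           max_iter=5000, **kwargs),
    )
    model.fit(X_train, y_train)
    auc[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
```

The `auc` dictionary then holds one test-set AUC per penalty, which is enough to compare the three classifiers on the same binned features. Because the data here is synthetic, the numbers say nothing about the results reported in the dissertation.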
Key Words: AutoML, Statistical Learning, Machine Learning, Binning, Regularized Regression Classifiers
Outline of Studies: Ph.D. in Big Data Analytics
Educational Career:
Ph.D. in Engineering Mechanics, 2011, University of Nebraska-Lincoln
M.S. in Statistical Computing, Data Mining Track, 2013, University of Central Florida
Committee in Charge:
Dr. Chung-Ching Morgan Wang, Chair
Dr. Liqiang Ni
Dr. Rui Xie
Dr. Bruce Cauklins
Approved for distribution by Dr. Chung-Ching Morgan Wang, Committee Chair, on March 21, 2023.