Tan Kian Aun
Aznul Qalid Md Sabri
Fylix
The rapid growth in machine learning applications has created a strong demand
to implement machine learning in every industry. However, handling and maintaining
a variety of models for different purposes takes considerable effort and cost. Therefore, data
scientists and AI experts have been working to automate part or all of the machine learning
process. Industries such as finance are growing fast every day. Machine learning models
must be retrained periodically to stay up to date with the latest data. Hence, this
project proposes using the TPOT library to automate the whole machine learning
pipeline. TPOT's automation covers data imputation, feature selection, feature
preprocessing, feature construction, model selection, and parameter optimization. This
project aims to help the company “Fylix” study and evaluate how well the AutoML
library performs on finance-related datasets. From a review of several benchmarking
papers, I found that TPOT is able to achieve better accuracy in regression and
classification than most of the other automated machine learning libraries and frameworks
available. Public finance-related datasets were used to evaluate the
performance for regression and classification, and the results show good evaluation
scores and predictions. An experiment training TPOT on a dataset containing null values
was carried out to compare its results with those from a dataset without null values. The
performance was good and nearly matched the training accuracy on the dataset without null
values. The finalized model will be deployed in a small demo system for end users.
Keywords: automated machine learning, TPOT, financial, regression, classification