Arnav Goenka
Vellore Institute of Technology, Vellore, Tamil Nadu, India
http://doi.org/10.37648/ijrst.v13i03.014
Data science projects typically involve a machine learning (ML) process in which data, code, and models evolve over time. For instance, as a dataset grows, it may become suitable for ML models that require larger amounts of data. However, the dynamic factors that influence model selection need to be better understood and explicitly represented. This paper presents ongoing work on an adaptive method for ML model selection in big data science projects. The proposed method (i) identifies the factors that influence model selection, based on heuristics from the literature, and (ii) models the variability of these factors using a feature diagram and constraints that trigger adaptive reconfiguration, that is, changes in model selection caused by shifts in these factors. The method's applicability is demonstrated through an illustrative use case. By clarifying which dynamic factors influence model selection, the method shows how these factors can be explicitly represented and automated. This can make the model selection process more explicit, efficient, adaptive, and explainable, and it lays the groundwork for novel dynamic software product lines that support this process.
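The idea of constraints that trigger adaptive reconfiguration can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the factor names (`n_samples`, `labeled`, `needs_explainability`), the candidate models, the 100,000-sample threshold, and the helper functions `select_model` and `reconfigure` are all hypothetical. A real feature diagram would carry richer variability information than this flat rule list.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical factors that may influence model selection."""
    n_samples: int
    labeled: bool
    needs_explainability: bool


@dataclass(frozen=True)
class Rule:
    """A constraint: when `condition` holds, `model` is the selected feature."""
    model: str
    condition: Callable[[Context], bool]


# Illustrative heuristics only; the models and the threshold are invented.
RULES: List[Rule] = [
    Rule("k_means", lambda c: not c.labeled),
    Rule("decision_tree", lambda c: c.labeled and c.needs_explainability),
    Rule("linear_svm",
         lambda c: c.labeled and not c.needs_explainability
         and c.n_samples < 100_000),
    Rule("neural_network",
         lambda c: c.labeled and not c.needs_explainability
         and c.n_samples >= 100_000),
]


def select_model(ctx: Context) -> str:
    """Return the model of the first rule whose constraint is satisfied."""
    for rule in RULES:
        if rule.condition(ctx):
            return rule.model
    raise ValueError("no rule matches the given context")


def reconfigure(current: str, ctx: Context) -> Tuple[str, bool]:
    """Re-evaluate the rules; report the selection and whether it changed."""
    selected = select_model(ctx)
    return selected, selected != current


if __name__ == "__main__":
    small = Context(n_samples=5_000, labeled=True, needs_explainability=False)
    large = Context(n_samples=2_000_000, labeled=True, needs_explainability=False)
    current = select_model(small)
    print(current)
    # The dataset has grown, so the rules are evaluated again.
    print(reconfigure(current, large))
```

In this sketch, a shift in one factor (dataset size) is enough to change which constraint holds, and the second element of the tuple returned by `reconfigure` signals that the selection changed. Keeping the rules as data rather than nested conditionals is a design choice that makes each selection traceable to a single named constraint.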
Keywords: data science; machine learning; big data
Disclaimer: All papers published in IJRST will be indexed on Google Search Engine as per their policy.