Tanvi S Hungund
Senior Manager, Dallas, TX; California State University, Fullerton
DOI: http://doi.org/10.37648/ijrst.v14i01.004
Apache Spark is well suited to processing very large datasets and handles complex processing tasks efficiently, distributing the work across many compute instances either on its own or in combination with other distributed computing tools. As data volumes grow and machine learning models advance, the need for fast, complex feature engineering and model training intensifies. Clusters of multiple compute instances deliver a significant performance gain over a single instance, speeding up data processing. However, such cluster configurations are costly to operate, because they combine multiple compute instances (Worker Nodes) managed by a Controller Node.
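The following is a minimal PySpark sketch, not taken from the paper, illustrating the architecture the abstract describes: a driver (Controller Node) coordinating executors on Worker Nodes to compute a simple distributed feature. The application name, executor count, and toy ratings data are illustrative assumptions; the `local[*]` master is a stand-in for an actual cluster URL.

```python
# Minimal sketch (assumes PySpark is installed; replace "local[*]" with a
# cluster URL such as spark://controller-host:7077 to run on Worker Nodes).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# The driver (Controller Node) coordinates work performed by executors
# running on Worker Nodes.
spark = (
    SparkSession.builder
    .appName("feature-engineering-sketch")          # illustrative name
    .master("local[*]")                             # stand-in for a cluster URL
    .config("spark.executor.instances", "4")        # hypothetical cluster sizing
    .getOrCreate()
)

# Toy ratings data standing in for a large recommender-system input.
ratings = spark.createDataFrame(
    [(1, 101, 4.0), (1, 102, 3.5), (2, 101, 5.0), (2, 103, 2.0)],
    ["user_id", "item_id", "rating"],
)

# A simple distributed feature: each user's mean rating, computed in
# parallel across partitions rather than on a single machine.
user_features = ratings.groupBy("user_id").agg(F.avg("rating").alias("avg_rating"))
user_features.show()

spark.stop()
```

On a real cluster the same code runs unchanged; only the master URL and executor configuration differ, which is where the cost trade-off between single-instance and multi-instance setups arises.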
Keywords: Apache Spark; artificial intelligence; recommender systems