
SMS Spam Detection Using Machine Learning Approach

Anikait Kapoor

Department of Computer Science and Engineering, Apex Institute of Technology, Chandigarh University, Mohali, Punjab, India

Debavushan Saikia

Department of Computer Science and Engineering, Apex Institute of Technology, Chandigarh University, Mohali, Punjab, India

Ishaan Dhawan

Department of Computer Science and Engineering, Apex Institute of Technology, Chandigarh University, Mohali, Punjab, India

Pages: 10-17

Vol: 14, Issue: 1, 2024

Receiving Date: 2023-11-18

Acceptance Date: 2024-01-29

Publication Date: 2024-02-07


http://doi.org/10.37648/ijrst.v14i01.002

Abstract

With the growing popularity of mobile phones in recent years, the short message service (SMS) has become an industry generating billions of dollars in revenue. This growth has been accompanied by an increase in unsolicited commercial advertising (spam) sent to mobile phones; in parts of Asia, up to 30% of text messages were spam in 2012. SMS spam filtering is challenging because it requires a comprehensive database of real messages, and because SMS messages offer limited features and use informal language. In this project, a real SMS spam database from the UCI Machine Learning Repository was used, and different machine learning methods were applied after preprocessing and feature extraction. The results were compared, and the best-performing spam filtering algorithms for text messages were identified. In the final simulation using 10-fold cross-validation, the best classifier reduced the overall error rate by more than half compared with the best result reported in an earlier paper.
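The workflow outlined in the abstract (preprocessing, feature extraction, classification, and 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the classifiers that were compared, so a multinomial Naive Bayes over bag-of-words counts is used here purely as a stand-in, and the twenty messages are invented toy examples rather than records from the UCI SMS Spam Collection. The function names (`tokenize`, `train_nb`, `predict_nb`, `cross_validate`) are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Preprocessing: lowercase the message and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def train_nb(messages, labels):
    """Feature extraction plus training: per-class bag-of-words counts."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def predict_nb(model, text):
    """Pick the class with the highest log-posterior (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

def cross_validate(messages, labels, k=10):
    """Overall error rate over k folds; item i is held out in fold i % k."""
    errors = 0
    for fold in range(k):
        train_x = [m for i, m in enumerate(messages) if i % k != fold]
        train_y = [y for i, y in enumerate(labels) if i % k != fold]
        model = train_nb(train_x, train_y)
        for i, (m, y) in enumerate(zip(messages, labels)):
            if i % k == fold and predict_nb(model, m) != y:
                errors += 1
    return errors / len(messages)

# Invented toy messages (assumption): NOT taken from the UCI dataset.
SPAM = [
    "WIN a free prize now, call 0800 to claim",
    "Free entry: text WIN to 80082 for cash prize",
    "URGENT! You have won a free holiday, call now",
    "Claim your free cash prize today, text now",
    "Congratulations, you won free tickets, call to claim",
    "Free ringtone! Text TONE to 80082 now",
    "You have won cash, claim your prize now",
    "Call now to claim your free mobile prize",
    "Urgent: free cash waiting, text CLAIM now",
    "Win free cash now, text to claim prize",
]
HAM = [
    "Are we still meeting for lunch today?",
    "I will be home late, see you tonight",
    "Can you pick up milk on the way home?",
    "Thanks for the help, see you tomorrow",
    "Running late, will be there in ten minutes",
    "Did you finish the homework for class?",
    "Let me know when you get home",
    "Happy birthday! See you at dinner tonight",
    "Meeting moved to tomorrow, is that ok for you?",
    "I am at the station, where are you?",
]
# Interleave the classes so each round-robin fold holds out one class at a time
# while the training split still contains both classes.
MESSAGES = [m for pair in zip(SPAM, HAM) for m in pair]
LABELS = ["spam", "ham"] * len(SPAM)

error_rate = cross_validate(MESSAGES, LABELS, k=10)
print(f"10-fold overall error rate on the toy data: {error_rate:.2f}")
```

The folds here are assigned round-robin rather than shuffled or stratified, which keeps the sketch deterministic. On the real dataset, the same loop would be run once per candidate classifier and the resulting overall error rates compared.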

Keywords: supervised learning; classification algorithms; feature engineering; natural language processing (NLP); text classification

References

  1. Press Release, Growth Accelerates in the Worldwide Mobile Phone and Smartphone Markets in the Second Quarter, According to IDC, http://www.idc.com/getdoc.jsp?containerId=prUS24239313
  2. Tiago A. Almeida, José María G. Hidalgo, and Akebo Yamakami. 2011. Contributions to the study of SMS spam filtering: new collection and results. In Proceedings of the 11th ACM Symposium on Document Engineering (DocEng '11). ACM, New York, NY, USA, 259-262. DOI: 10.1145/2034691.2034742, http://doi.acm.org/10.1145/2034691.2034742
  3. Short Message Service, http://en.wikipedia.org/wiki/Short_Message_Service
  4. Mobile phone spam, http://en.wikipedia.org/wiki/Mobile_phone_spam
  5. AdaBoost, http://en.wikipedia.org/wiki/AdaBoost
  6. SMS Spam Collection Data Set from UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection
  7. Scikit-learn Ensemble Documentation, http://scikit-learn.org/stable/modules/ensemble.html
  8. T. G. Dietterich. Ensemble methods in machine learning. In J. Kittler and F. Roli, editors, Multiple Classifier Systems, pages 1-15. LNCS Vol. 1857, Springer, 2001.
  9. SMS Spam Collection v.1, http://www.dt.fee.unicamp.br/~tiago/smsspamcollection

Disclaimer: All papers published in IJRST will be indexed on Google Search Engine as per their policy.
