A Comparative Analysis of Machine Learning Models in News Categorization

Mohammad Hossein Zolfagharnasab
Siavash Damari

Abstract

The constant stream of news highlights the need for careful assessment so that information reaches its intended audience accurately and with minimal delay. Despite the flexibility and efficiency of Deep Learning (DL) models, their intricate training and substantial resource demands pose significant challenges for deployment in real-time applications. This study therefore evaluates the performance of four resource-efficient Machine Learning (ML) techniques, namely Multinomial Naive Bayes (MNB), Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR), in categorizing news. All of the evaluated models attain high accuracy in news categorization. Notably, the SVM performs best, achieving an accuracy of 98% and a mean squared error of 0.28. These results demonstrate the effectiveness of classical ML models in news categorization, particularly when supported by a suitably tailored preprocessing pipeline.
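
To make the kind of classifier named in the abstract more concrete, the following is a minimal sketch of Multinomial Naive Bayes over bag-of-words counts with Laplace smoothing, written in plain Python. It is not the authors' implementation: the function names (`tokenize`, `train_mnb`, `predict_mnb`), the whitespace tokenizer, the smoothing value, and the toy headlines are all assumptions made for illustration, and the study's actual preprocessing pipeline and dataset are not reproduced here.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    return text.lower().split()

def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    vocab = {w for counts in word_counts.values() for w in counts}
    model = {}
    for label, n_docs in class_counts.items():
        # Smoothed denominator: tokens seen in this class plus alpha per vocabulary word.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(labels))
        log_lik = {w: math.log((word_counts[label][w] + alpha) / total) for w in vocab}
        model[label] = (log_prior, log_lik)
    return model

def predict_mnb(model, doc):
    """Return the class with the highest log-posterior; unseen words are ignored."""
    def score(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(log_lik[w] for w in tokenize(doc) if w in log_lik)
    return max(model, key=score)

# Toy, made-up headlines (hypothetical; not the dataset used in the study).
train_docs = [
    "team wins championship final match",
    "striker scores late goal in derby",
    "central bank raises interest rates",
    "stock markets fall as inflation rises",
]
train_labels = ["sport", "sport", "business", "business"]

model = train_mnb(train_docs, train_labels)
test_docs = ["goal decides the final match", "inflation pushes interest rates higher"]
test_labels = ["sport", "business"]
predictions = [predict_mnb(model, d) for d in test_docs]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the smoothing term keeps a word that is absent from one class from driving that class's probability to zero.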

Article Details

Author Biographies

Mohammad Hossein Zolfagharnasab, University of Porto, Faculty of Engineering.

PhD student, Department of Electrical and Computer Engineering, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal

Siavash Damari, University of Allameh Tabataba'i, Department of Statistics, Mathematics, and Computer Science.

Master's student, Department of Statistics, Mathematics, and Computer Science, University of Allameh Tabataba'i, Western Azadi Stadium Blvd, Tehran, Iran