DIGITAL FOOTPRINTS: CLUSTERING BROWSER HISTORY FOR USER PROFILING USING MACHINE LEARNING

Authors

Marzhan Idrissova, Sabina Kim, Beibut Amirgaliyev, Didar Yedilkhan, Leila Rzayeva

DOI:

https://doi.org/10.26577/jpcsit20253202

Keywords:

digital footprints, user profiling, clustering, anomaly detection, browsing behavior, machine learning

Abstract

This study explores the use of unsupervised machine learning techniques to analyze historical web activity, segment users, and detect anomalies for user profiling. By applying hierarchical clustering and Gaussian Mixture Models, we identified distinct browsing behaviors, categorizing users into four to five groups, including general browsing, social media engagement, high-bandwidth consumption, and automated system processes. For anomaly detection, One-Class SVM and Isolation Forest were employed to flag deviations from expected behavior. The results indicate that approximately 5% of sessions were classified as anomalous by One-Class SVM, while Isolation Forest highlighted outliers associated with extended session durations and potentially high-risk application usage. These findings underscore the effectiveness of machine learning in distinguishing user behavior through digital footprints while identifying potential security threats or atypical usage patterns. The study demonstrates that unsupervised learning can serve as a valuable tool for user profiling and behavioral analysis, with implications for cybersecurity, network monitoring, and online behavior modeling. Integrating clustering with anomaly detection provides a scalable approach for uncovering usage trends and deviations in web traffic. Future research should expand dataset coverage and incorporate adaptive models to enhance classification accuracy and responsiveness to evolving web behaviors.
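The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scikit-learn estimators, the synthetic session data, the three feature columns (duration, traffic volume, request count), the choice of four clusters, and the 5% anomaly settings are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import IsolationForest
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic sessions (hypothetical features, not the study's dataset).
# Columns: [duration_min, megabytes, requests]
centers = np.array([
    [10.0, 5.0, 40.0],     # stand-in for general browsing
    [30.0, 20.0, 200.0],   # stand-in for social media engagement
    [60.0, 800.0, 80.0],   # stand-in for high-bandwidth consumption
    [2.0, 1.0, 500.0],     # stand-in for automated system processes
])
sessions = np.vstack([
    rng.normal(loc=c, scale=0.1 * c, size=(100, 3)) for c in centers
])

# Put all features on a comparable scale before distance-based modelling.
X = StandardScaler().fit_transform(sessions)

# Segmentation: hierarchical (Ward) clustering and a Gaussian Mixture Model.
hier_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X)
gmm_labels = GaussianMixture(n_components=4, random_state=0).fit(X).predict(X)

# Anomaly detection: both estimators return -1 for outliers, +1 for inliers.
# nu and contamination are assumed values chosen to target roughly 5%.
svm_flags = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit_predict(X)
iso_flags = IsolationForest(contamination=0.05, random_state=0).fit_predict(X)

svm_rate = float(np.mean(svm_flags == -1))
iso_rate = float(np.mean(iso_flags == -1))

print("hierarchical cluster sizes:", np.bincount(hier_labels))
print("GMM cluster sizes:", np.bincount(gmm_labels, minlength=4))
print(f"One-Class SVM anomaly rate: {svm_rate:.3f}")
print(f"Isolation Forest anomaly rate: {iso_rate:.3f}")
```

In a sketch like this, the share of sessions flagged by each detector is driven largely by the `nu` and `contamination` parameters, so a reported anomaly rate should be read together with the settings that produced it.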

Author Biographies

Marzhan Idrissova, Astana IT University, Astana, Kazakhstan

Marzhan Idrissova is a master's student of the Computer Engineering Department at Astana IT University (Astana, Kazakhstan, marzhanidrisova@gmail.com). Her research interests include machine learning applications for user profiling, anomaly detection in digital footprints and network traffic analysis. ORCID ID: 0009-0008-0629-7948.

Sabina Kim, Astana IT University, Astana, Kazakhstan

Sabina Kim is a bachelor's student of the Department of Computational and Data Science at Astana IT University (Astana, Kazakhstan, kimsabina206@gmail.com). Her research interests include the development of smart city solutions leveraging data science, as well as the integration of medicine with data science and machine learning to advance healthcare innovation. ORCID ID: 0009-0008-3198-0474.

Beibut Amirgaliyev, Astana IT University, Astana, Kazakhstan

Beibut Amirgaliyev is a distinguished researcher at Astana IT University (Astana, Kazakhstan, beibut.amirgaliyev@astanait.edu.kz), recognized for his contributions to both academia and industry. He holds a PhD in Computer Science and serves as a Professor, focusing on research areas such as machine learning and computer vision. Dr. Amirgaliyev has published numerous papers on topics including automatic number plate recognition and solar collector systems, with his work being cited by over 200 researchers. ORCID ID: 0000-0003-0355-5856.

Didar Yedilkhan, Astana IT University, Astana, Kazakhstan

Didar Yedilkhan is a distinguished researcher at Astana IT University (Astana, Kazakhstan, d.yedilkhan@astanait.edu.kz), recognized for his extensive experience in industry, research, and higher education. He serves as the Director of the Smart City Research Center and is a Senior Researcher at Astana IT University, focusing on data science, machine learning, and deep learning. Dr. Yedilkhan has a robust academic background, holding degrees from institutions such as the Kazakh National University named after al-Farabi and University College London. As a Lead Researcher and Project Manager at Astana IT University, he leads projects on intelligent IT systems for urban infrastructure that aim to enhance city safety and convenience through smart technologies. ORCID ID: 0000-0002-6343-5277.

Leila Rzayeva, Astana IT University, Astana, Kazakhstan

Leila Rzayeva is a prominent academic and researcher based at Astana IT University (Astana, Kazakhstan, l.rzayeva@astanait.edu.kz), where she serves as an Associate Professor and Researcher in Intelligent Systems and Cybersecurity. She earned her B.S., M.S., and Ph.D. degrees from L.N. Gumilyov Eurasian National University. Her research interests include control systems, industrial automation, cybersecurity, machine learning, deep learning, and the design of neural networks and artificial intelligence systems. Rzayeva has published over 40 national and international research articles and actively participates in conferences. ORCID ID: 0000-0002-3382-4685.

How to Cite

Idrissova, M., Kim, S., Amirgaliyev, B., Yedilkhan, D., & Rzayeva, L. (2025). DIGITAL FOOTPRINTS: CLUSTERING BROWSER HISTORY FOR USER PROFILING USING MACHINE LEARNING. Journal of Problems in Computer Science and Information Technologies, 3(2), 16–28. https://doi.org/10.26577/jpcsit20253202