DIGITAL FOOTPRINTS: CLUSTERING BROWSER HISTORY FOR USER PROFILING USING MACHINE LEARNING
DOI:
https://doi.org/10.26577/jpcsit20253202
Keywords:
digital footprints, user profiling, clustering, anomaly detection, browsing behavior, machine learning
Abstract
This study explores the use of unsupervised machine learning techniques to analyze historical web activity, segment users, and detect anomalies for user profiling. Applying hierarchical clustering and Gaussian Mixture Models, we identified distinct browsing behaviors and grouped users into four to five categories, including general browsing, social media engagement, high-bandwidth consumption, and automated system processes. For anomaly detection, One-Class SVM and Isolation Forest were employed to flag deviations from expected behavior. Approximately 5% of sessions were classified as anomalous by the One-Class SVM, while Isolation Forest highlighted outliers associated with extended session durations and potentially high-risk application usage. These findings indicate that machine learning can distinguish user behavior through digital footprints while identifying potential security threats or atypical usage patterns. The study demonstrates that unsupervised learning can serve as a valuable tool for user profiling and behavioral analysis, with implications for cybersecurity, network monitoring, and online behavior modeling. Integrating clustering with anomaly detection provides a scalable approach for uncovering usage trends and deviations in web traffic. Future research should expand dataset coverage and incorporate adaptive models to improve classification accuracy and responsiveness to evolving web behaviors.
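The combination of clustering and anomaly detection summarized in the abstract could be sketched roughly as follows in Python with scikit-learn. This is a minimal illustration under stated assumptions, not the study's implementation: the synthetic data, the three feature columns, the choice of four clusters, and the 5% values for `nu` and `contamination` are all hypothetical stand-ins chosen to mirror the figures mentioned above.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import IsolationForest
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for per-session features (hypothetical columns):
# [session duration in minutes, megabytes transferred, requests per minute]
centers = np.array([
    [10.0, 5.0, 2.0],    # general browsing
    [30.0, 20.0, 6.0],   # social media engagement
    [45.0, 500.0, 3.0],  # high-bandwidth consumption
    [2.0, 1.0, 40.0],    # automated system processes
])
spread = np.array([2.0, 2.0, 1.0])
X = np.vstack([c + spread * rng.standard_normal((100, 3)) for c in centers])

# Put all features on a comparable scale before distance-based methods.
X_scaled = StandardScaler().fit_transform(X)

# Segment sessions with hierarchical (Ward) clustering and a Gaussian Mixture Model.
# The cluster count of 4 is an assumption taken from the "four to five groups" range.
hier_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X_scaled)
gmm = GaussianMixture(n_components=4, random_state=0).fit(X_scaled)
gmm_labels = gmm.predict(X_scaled)

# Flag anomalous sessions; both detectors return -1 for outliers.
# nu and contamination of 0.05 are assumed values, not the paper's settings.
ocsvm_flags = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit_predict(X_scaled) == -1
iso_flags = IsolationForest(contamination=0.05, random_state=0).fit_predict(X_scaled) == -1

print("Hierarchical cluster sizes:", np.bincount(hier_labels))
print("GMM cluster sizes:", np.bincount(gmm_labels, minlength=4))
print("One-Class SVM anomaly rate:", ocsvm_flags.mean())
print("Isolation Forest anomaly rate:", iso_flags.mean())
```

On real browsing logs, the number of clusters would normally be selected with a criterion such as BIC for the mixture model or a dendrogram cut for the hierarchical method, rather than fixed in advance as it is here.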