Machine Learning-Based Defences Against Advanced 'Session-Replay' Web Bots

Date

2024-03-16

Authors

Sadeghpour, Shadi

Abstract

The widespread adoption of the Internet has brought significant benefits to modern society, but it has also led to an increase in malicious activity, particularly through the use of web bots. While some bots serve useful purposes, the proliferation of malicious web bots poses a significant threat to Internet security, affecting individuals, businesses, governments, and society as a whole. The emergence of AI-powered web bots capable of mimicking human behavior and evading detection has further exacerbated this problem. This dissertation aims to deepen our understanding of advanced web bots and of the web bot attacks that often signal fraudulent online activity. In particular, we focus on session-replay web bots, the latest and most advanced type of web bot, which present an especially difficult challenge in online domains where multiple genuine human users frequently exhibit similar behavioral patterns, such as news, banking, or gaming sites. To achieve our research objectives, we have curated an extensive dataset encompassing both human-generated and bot-generated data. Additionally, we have developed our own prototype of an advanced session-replay bot (the so-called ReBot), which has enabled us to accurately simulate the attacks conducted by this category of web bots. Moreover, by infusing randomness into the design of ReBot, we have been able to achieve varying degrees of bot and attack evasiveness. From the defender's perspective, and by leveraging state-of-the-art deep learning algorithms, we have proposed several effective strategies for the detection of advanced session-replay bot attacks. One of our proposed techniques deploys the concept of moving-target defence in the form of webpage randomization, which is particularly challenging for the attacker to overcome. This thesis also explores the use of generative machine learning models for generating synthetic bot sessions. The ability to synthesize advanced session-replay bots, as opposed to looking for real-world instances of these bots or evidence of their activity in real-world logs, is of critical importance if we are to make timely and effective advances in the field of web bot detection and defence.

Keywords

Computer science
