Richard Yew of Edgio on the Evolution and Future of Advanced Bot Detection Techniques

Jun 23, 2024

Updated: Jun 24, 2024

We sat down with Richard Yew, VP of Product Management - Security at Edgio, to explore how bot detection methods have evolved in response to increasingly complex bot attacks and which advancements are driving the field. Richard also explains how request fingerprinting, client fingerprinting, and behavioral fingerprinting work together to enhance bot detection and combat credential-stuffing attacks.

How have bot detection methods evolved in recent years to address the increasing complexity of bot attacks, and what are the main advancements?

Bot detection methods have changed significantly over the last 10 years because it's a constant arms race between attacker and defender. That pace will only accelerate, as AI (e.g., LLMs) has lowered the barrier to entry for attackers to create and evolve automation scripts. Here are some notable milestones that illustrate the evolution of and advancements in detection techniques:

1. The original bot detection methods were based mainly on signatures from the request (e.g., user-agent and IP address matching). However, attackers can easily spoof those signatures, so any mitigation built on them can be bypassed with very little effort.

2. New techniques added a layer of sophistication: client-side JavaScript fingerprints the device, and client fingerprinting is combined with request/signature fingerprinting into composite logic for detecting a bot. This approach caught attackers trying to spoof a real device (e.g., a browser) and worked for a while, but once tools like headless browsers and browser automation arrived, client fingerprinting was no longer as effective.

3. Bot defenders are now incorporating advanced behavioral fingerprinting techniques, gathering metadata from both the client and server side to enhance detection. This requires various advanced data science and machine learning techniques to tell the difference between a real human and sporadic automated behavior.

4. Finally, there have also been advancements toward implementing more behavioral fingerprinting on the server side, as server-side detection is harder to reverse engineer than similar techniques done on the client side via JS or software development kits (SDKs).
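To make the idea of a server-side behavioral feature concrete, here is a minimal sketch of one such signal: how regular the gaps between a client's requests are. This is an illustrative heuristic, not the method any particular vendor uses. The function names and the `cv_threshold` value are assumptions chosen for the example, and a real system would combine many features like this in a trained model rather than rely on one threshold.

```python
from statistics import mean, pstdev
from typing import List, Optional


def timing_regularity(timestamps: List[float]) -> Optional[float]:
    """Coefficient of variation of the gaps between consecutive requests.

    Returns None when there are fewer than three timestamps, because two
    gaps are the minimum needed to say anything about regularity.
    """
    if len(timestamps) < 3:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        # All requests arrived at the same instant: perfectly regular.
        return 0.0
    return pstdev(gaps) / average_gap


def looks_scripted(timestamps: List[float], cv_threshold: float = 0.1) -> bool:
    """Flag clients whose request gaps are almost perfectly even.

    cv_threshold is a hypothetical value chosen for illustration only.
    """
    cv = timing_regularity(timestamps)
    return cv is not None and cv < cv_threshold


# A fixed one-second loop versus bursty, human-like browsing.
print(looks_scripted([0.0, 1.0, 2.0, 3.0, 4.0]))
print(looks_scripted([0.0, 0.8, 5.1, 5.9, 14.0]))
```

Because this is computed from server-side logs, the client never sees the logic, which is the reverse-engineering advantage described above. A bot that randomizes its delays would evade this single feature, which is why it would only ever be one input among many.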

Can you explain how the three forms of analysis — Request Fingerprinting, Client Fingerprinting, and Behavioral Fingerprinting — work together to enhance bot detection and defeat sophisticated credential-stuffing attacks?

The three techniques are not mutually exclusive; each has its own advantages and disadvantages. A good bot solution needs to incorporate all of them to deal successfully with advanced persistent bot attacks, which, as I alluded to, are always evolving. While request fingerprinting is rudimentary signature detection, a robust bot/client/IP signature database can provide instant protection from a "first hit" attack, where purely behavioral/ML-based fingerprinting may need more time and a larger sample size before it starts detecting and mitigating. On the other hand, behavioral fingerprinting, especially when powered by AI with robust platform traffic samples, can constantly adapt to changes in bot behavior. Client fingerprinting is important instrumentation that gathers additional device/client telemetry to enrich the signature and behavioral detection models. Combining all three techniques gives you the best of each: instant first-hit defense against many well-known bot attackers, plus the ability to keep up with rapidly evolving unknown automated threats.
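The way the three signals complement each other can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of Edgio's implementation: the `Signals` fields, the signature set, the weights, and the thresholds are all hypothetical. The point it shows is that a signature match decides immediately, while the client and behavioral signals contribute only once they are available.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical signature database of user-agent strings known to be automated.
KNOWN_BAD_SIGNATURES = {"python-requests/2.31", "curl/8.0"}


@dataclass
class Signals:
    user_agent: str
    # False when client telemetry contradicts the claimed browser;
    # None when no client-side telemetry has been collected.
    client_fp_consistent: Optional[bool]
    # 0.0 (human-like) to 1.0 (bot-like); None until enough samples exist.
    behavior_score: Optional[float]


def classify(signals: Signals) -> str:
    # Request fingerprinting: an instant verdict on the very first request.
    if signals.user_agent in KNOWN_BAD_SIGNATURES:
        return "bot"

    # Client and behavioral fingerprinting: accumulate evidence over time.
    score = 0.0
    if signals.client_fp_consistent is False:
        score += 0.5
    if signals.behavior_score is not None:
        score += 0.5 * signals.behavior_score

    if score >= 0.6:
        return "bot"
    if score >= 0.3:
        return "suspect"
    return "human"


print(classify(Signals("curl/8.0", None, None)))      # first-hit signature match
print(classify(Signals("Mozilla/5.0", False, None)))  # spoofed client, no behavior data yet
print(classify(Signals("Mozilla/5.0", False, 0.9)))   # spoofed client plus bot-like behavior
```

In practice the fixed weights would be replaced by a model trained on platform traffic, but the ordering (cheap signature check first, richer signals as they arrive) is the part that reflects the reasoning above.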

What are some of the most effective management responses to bot threats that organizations can implement to protect themselves from these increasingly sophisticated attacks?

Mitigating sophisticated bot attacks is always a balance between security and user experience. A poorly implemented bot defense may hurt the user experience by introducing unnecessary delays while browsing (e.g., relying completely on CAPTCHA and forcing it on all users), or it may be detectable by attackers, prompting them to quickly change their behavior to bypass the protection. Effective bot management involves:

1. Deploying mitigation natively inline on a high-performance platform so that detecting bots introduces minimal latency.

2. Implementing multiple response actions. In many cases, blocking a bot attack may not be the best practice, because it signals to the bot operator that they have been discovered and that it's time to change their script or attack behavior. It's important to also have other mitigating actions available, e.g., serving a custom response from a honeypot, redirection, tarpitting, delaying or slowing responses, browser challenges, and targeted CAPTCHA, chosen according to the severity of the bot attack. This can delay changes in attack behavior and prevent attackers from learning the defense techniques. Remember, all of these techniques should be applied without introducing additional latency for normal users.

3. Accounting for the "biological" bot. Attackers are increasingly exploring creative ways to bypass detection. In recent years there's been a rise in manual fraud carried out by low-cost workers hired in click farms in low cost of living (LCOL) countries. Sophisticated behavioral fingerprinting should be implemented so that repeated manual fraud can also be detected, and advanced user-interaction challenges can be used to massively reduce the impact of this kind of fraud.

4. Implementing defense-in-depth, by ensuring that your bot management solution is one of several comprehensive layers of web security protection that also include WAF, application DDoS protection, and API defense. No single layer of security should be considered a be-all and end-all solution for a specific problem, and implementing defense-in-depth via strategically layered security solutions minimizes the chance of bot attacks slipping by. Here's a practical example of defense-in-depth in action: in a layered protection, high-volume rudimentary bot attacks that use the same signature can be quickly mitigated by rate limiting, while bot requests to a login/authentication endpoint that do not conform to a valid request JSON schema can be filtered out immediately. This filtering essentially "shrinks the haystack", improving the signal-to-noise ratio, so that the bot management layer can focus its resources on detecting sophisticated bot attacks and thus achieve maximum efficiency and accuracy.
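The "shrink the haystack" example in the last item can be sketched as a small pre-filter that runs before any bot-management logic. This is a minimal sketch under stated assumptions: the class and function names, the two-field login schema (`username`, `password`), and the limits are all hypothetical, and the schema check is hand-written rather than using a JSON Schema library.

```python
import json
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def valid_login_body(raw_body: str) -> bool:
    """Accept only a JSON object with exactly two non-empty string fields."""
    try:
        body = json.loads(raw_body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    if not isinstance(body, dict) or set(body) != {"username", "password"}:
        return False
    return all(isinstance(value, str) and value for value in body.values())


def prefilter(limiter: SlidingWindowLimiter, signature: str,
              raw_body: str, now: float) -> str:
    """Cheap checks first; only surviving requests reach bot management."""
    if not limiter.allow(signature, now):
        return "rate_limited"
    if not valid_login_body(raw_body):
        return "rejected_schema"
    return "forward_to_bot_management"
```

A short usage example: with `limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)`, the third request carrying the same signature within ten seconds returns `"rate_limited"`, and a body such as `'{"user": 1}'` returns `"rejected_schema"`. Only requests that pass both checks consume the more expensive behavioral analysis.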

Could you share some real-world examples or case studies where these advanced bot detection techniques have successfully mitigated bot threats and improved security for organizations?

A world-leading fintech company experienced repeated credential-stuffing attacks by bots that caused account takeovers as well as service availability and performance issues, leaving users unable to access their accounts. The attackers ran low-and-slow attacks from highly distributed IP addresses (tens of thousands) and used randomly generated headers to defeat standard signature-based fingerprinting technologies. The advanced bot detection techniques described above were implemented to detect these attacks and associate them with a particular attacker, and the requests were successfully mitigated via tarpitting. This mitigation improved the availability of the service and prevented further compromise of user accounts, which had resulted in financial data breaches.
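For readers unfamiliar with tarpitting, the idea is to stall a suspected bot's responses rather than block them outright, so the attacker's resources are tied up without an obvious signal that they have been detected. The sketch below is purely illustrative and is not how the mitigation in this case study was built: the function name, the 0.5 score cutoff, and the delay curve are all assumptions.

```python
def tarpit_delay(bot_score: float, base_delay: float = 0.5,
                 max_delay: float = 30.0) -> float:
    """Seconds to hold a response before sending it.

    bot_score is a hypothetical confidence value from 0.0 (human) to
    1.0 (bot). Likely humans get no delay, so normal users see no added
    latency; above the cutoff the delay doubles for every additional
    0.1 of score, up to max_delay.
    """
    if bot_score < 0.5:
        return 0.0
    scaled = base_delay * 2 ** ((bot_score - 0.5) * 10)
    return min(scaled, max_delay)


for score in (0.3, 0.5, 1.0):
    print(score, tarpit_delay(score))
```

In an asynchronous server, the returned value would typically be passed to a non-blocking sleep before the response is written, so that stalled bot connections do not hold up worker threads serving real users.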

Other examples include using the advanced techniques illustrated above to prevent inventory exhaustion attacks on a popular airline in Asia. In those attacks, tickets were bought and then refunded at the last minute, resulting in empty flights that cost the company financially and damaged its brand reputation.