How Generative AI Provides More Control Over the Domain Estate

Aug 21, 2023

This guest blog was contributed by Randal Pinto, CTO and co-founder, Red Sift

Nearly two years ago, NSA Cybersecurity Director Rob Joyce tweeted a hard truth.

Attackers put in the time to know the network and the devices better than the defenders. That’s how they win.

The truth is this: attackers know most, if not all, of the vulnerabilities that exist in a network. These vulnerabilities exist everywhere -- the operating system, the applications that run across organizations and the identities that interact with the applications. Vulnerabilities also may exist in the form of devices that lack proper security controls and in-network services and certificates that expose organizations’ services to the public internet.

The average Fortune 500 company takes 12 or more hours to find a serious vulnerability, while bad actors focusing on a site take less than 45 minutes. This means that discovering assets that are unknown, short-lived, or impersonating the organization is critical, and discovering them quickly is equally critical.

This is no small feat, however, especially when it comes to assets like website domains and SSL or TLS certificates. Think about an organization’s registered domain. There are numerous, interrelated dimensions that must be understood, such as when the domain was registered, what certificate it uses and all other configuration aspects of its known domain names and hosts. You might think that you know your digital estate, but non-production hostnames and new IP addresses can be created without your knowledge. In addition, events like mergers and acquisitions (M&A) might bring a whole new set of digital assets under your company's control that have to be discovered and secured. Together, this encompasses hundreds of millions of signals, all of which must be analyzed to extract the organization’s unique identities. Keeping track of all of these is usually beyond human cognitive capabilities but perfect for AI.

Old vs. new-school AI

This is emblematic of a long-standing challenge in cybersecurity: cutting through the noise to focus on the things that matter. The challenge is rooted in the limits of human cognitive capacity, and solving for it was the original purpose of AI in cybersecurity.

To understand cybersecurity data, security professionals need to find patterns and identities within those signals so they can group them together. This gives them a different dimension on the data that they’re analyzing. In the case of an organization’s identities that exist within network assets, they must be classified according to their significance relative to potential vulnerabilities. Trying to do this with an old-school deterministic model is nearly impossible because of the nature of the dimensions mentioned above.

Let’s pause for a moment to further define identities in this context. I’m referring to unique queries that use the metadata parsed from an organization’s existing inventory as search terms. Think of them as signals or snippets of interesting information associated with domains that can be used to discover potential assets that haven’t yet been found, or that have been parked or forgotten and left unsecured.

There are several examples of this set of identities:

Domain name registration identities (from WHOIS and RDAP)
Domain name email addresses
Organization names hidden in SSL certificates
DNS records, such as MX or SOA
Mail server relationships
Private name servers

To manage this aspect of an organization’s attack surface, security teams must monitor and analyze all configuration aspects of its domain names and hosts to extract its unique identities. This requires a fire hose of data and the wider the hose, the more likely those teams are to find the relevant points of interest.
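To make the idea of an identity more concrete, here is a minimal sketch in Python of how the identity types listed above could be pulled out of an existing inventory. The record layout, the field names (`registrant_email`, `cert_org`, `mx`, `ns`) and the sample values are all assumptions made for illustration; they do not reflect any particular product or data feed.

```python
from collections import Counter

# Hypothetical inventory records. The field names are illustrative only.
INVENTORY = [
    {"domain": "example.com", "registrant_email": "dns-admin@example.com",
     "cert_org": "Example Corp", "mx": ["mx1.example-mail.net"],
     "ns": ["ns1.example.com"]},
    {"domain": "example.org", "registrant_email": "DNS-Admin@example.com",
     "cert_org": "Example Corp", "mx": ["mx1.example-mail.net"],
     "ns": ["ns1.example.com"]},
    {"domain": "example-shop.net", "registrant_email": "it@example.com",
     "cert_org": None, "mx": [], "ns": ["ns1.example.com"]},
]


def extract_identities(inventory):
    """Count how many inventory domains share each (kind, value) identity."""
    counts = Counter()
    for record in inventory:
        seen = set()  # de-duplicate within a single record
        email = record.get("registrant_email")
        if email:
            seen.add(("registrant_email", email.strip().lower()))
        org = record.get("cert_org")
        if org:
            seen.add(("cert_org", org.strip().lower()))
        for host in record.get("mx", []):
            seen.add(("mx", host.rstrip(".").lower()))
        for host in record.get("ns", []):
            seen.add(("ns", host.rstrip(".").lower()))
        counts.update(seen)
    return counts


identities = extract_identities(INVENTORY)
for (kind, value), n in identities.most_common():
    print(f"{kind:17} {value:25} shared by {n} domain(s)")
```

Each identity that comes out of this step can then be used as a search term against wider registration, certificate and DNS data. Normalizing case and trailing dots before counting keeps trivially different spellings from splitting one identity into several.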

The downside is that this introduces significantly more noise. Determining the relevance of the signals and patterns within them goes beyond the limits of human cognitive ability to analyze at scale. This creates another extremely powerful use case for generative AI.

Connecting the dots to secure vulnerabilities

Another example of a set of signals that must be monitored is newly registered domains, which attackers use to impersonate legitimate organizations and launch phishing sites. With new phishing sites launching every 20 seconds and the number of unique phishing sites increasing exponentially, attackers hold a distinct advantage over security teams who lack the visibility to pre-empt attacks.
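As a rough illustration of what monitoring this signal can look like, the sketch below compares a feed of newly registered domains against domains the organization already owns and flags close lookalikes. It uses plain string similarity from Python's standard library (`difflib.SequenceMatcher`), which is a simple stand-in chosen for this example rather than the technique any specific product uses. The function name, the threshold and the sample domains are assumptions.

```python
from difflib import SequenceMatcher


def find_lookalikes(new_domains, known_domains, threshold=0.85):
    """Return (candidate, owned, score) for new domains that closely resemble owned ones."""
    known = {d.lower() for d in known_domains}
    hits = []
    for candidate in new_domains:
        candidate = candidate.lower()
        if candidate in known:
            continue  # already part of the inventory, not an impersonation
        for owned in sorted(known):
            score = SequenceMatcher(None, candidate, owned).ratio()
            if score >= threshold:
                hits.append((candidate, owned, round(score, 2)))
    return hits


# Hypothetical feed of newly registered domains.
feed = ["examp1e.com", "weather.org", "example.com"]
lookalikes = find_lookalikes(feed, ["example.com"])
for candidate, owned, score in lookalikes:
    print(f"{candidate} resembles {owned} (similarity {score})")
```

String similarity alone misses many impersonation tricks, such as added keywords or different top-level domains, so in practice it would be only one signal among several.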

The ideal state for security teams is being able to connect the dots between the unique identities and new domain names associated with those identities. Generative AI provides them the ability to automatically scan each identity and generate a custom recommendation that tells them whether an identity should be enabled and why. This then allows them to decide which identities to include in their domain asset discovery process.
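The shape of such a recommendation can be sketched with a simple rule-based stand-in. To be clear, this is not a generative model; it only illustrates the kind of input and output involved: statistics about one identity go in, and an enable or skip decision with a plain-language reason comes out. The `IdentityStats` fields, the thresholds and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IdentityStats:
    kind: str           # e.g. "ns", "mx", "registrant_email"
    value: str
    owned_matches: int  # known inventory domains sharing this identity
    total_matches: int  # all observed domains sharing this identity


def recommend(stats, min_owned=2, min_precision=0.5):
    """Return (enable, reason) for one identity. Thresholds are assumptions."""
    if stats.total_matches == 0:
        return False, "no domains observed for this identity"
    if stats.owned_matches < min_owned:
        return False, f"only {stats.owned_matches} known domain(s) share it"
    precision = stats.owned_matches / stats.total_matches
    if precision < min_precision:
        return False, f"too generic: {precision:.0%} of matches are known assets"
    return True, f"{precision:.0%} of {stats.total_matches} matching domains are known assets"


candidates = [
    IdentityStats("ns", "ns1.example.com", owned_matches=3, total_matches=4),
    IdentityStats("mx", "mx.shared-mail.example", owned_matches=2, total_matches=100000),
]
for stats in candidates:
    enable, reason = recommend(stats)
    print(f"{stats.kind}:{stats.value} -> {'enable' if enable else 'skip'} ({reason})")
```

The second candidate shows why the reason matters: a mail server shared with a large hosting provider matches far too many unrelated domains to be a useful search term, even though the organization genuinely uses it.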

For large organizations that struggle with network monitoring, this capability is invaluable. It’s easier for smaller organizations to build a comprehensive list of domain names and hostnames, but it becomes exponentially more difficult for large, distributed organizations, especially those that have grown through years of mergers and acquisitions. Even organizations that do successfully generate and maintain comprehensive asset inventory lists find those lists difficult to evaluate, given their volume and complexity.

Generative AI removes that pain by automating the identity discovery and classification process and providing security teams with recommendations for full identity enablement.

After all, the more visibility they have of their domain estate, the faster and better they can secure any vulnerabilities that exist.