
The True Security Threat of Rogue AI

In an increasingly interconnected world, the rapid advancement of artificial intelligence (AI) has brought both promise and concern. As AI continues to evolve, so does the debate surrounding the potential risks it poses. Among these concerns, the emergence of a rogue AI stands out as a particularly formidable security threat. With its ability to rapidly analyze vast amounts of data and make autonomous decisions, an unchecked and malicious AI entity could disrupt global systems, infiltrate critical infrastructure, and manipulate human activities on an unprecedented scale.

Alex Applegate, DNSFilter

We spoke with Alex Applegate, Senior Threat Researcher at DNSFilter, to discuss the threat of rogue AI.

Is there a risk that rogue AI could emerge as an unstoppable security threat?

In the purest of terms, neither a rogue AI nor a truly unstoppable threat is currently possible. A rogue artificial general intelligence that would unilaterally develop malicious intent on its own, something in the spirit of HAL 9000, Skynet, or Ultron, is not feasible with today’s technology.

An artificial intelligence program, even one designed to do unsupervised learning, must be focused on a specific task and will remain within the confines of the guidelines that define it. That isn’t to say that an AI can’t progress to capabilities that were not intended, as has been seen in the unexpected biases in much of the Artificial General Intelligence (AGI) research with facial recognition and natural language processing. However, those capabilities are still within the scope of the problem space for which the application was programmed. Even so, they have produced dangerous biases and false positives that can be very harmful, particularly in matters involving skin color, gender, ethnicity, collision avoidance, and defense of life. These outcomes of what could potentially be considered rogue AI can result in false arrests, unreliable analyses, and even loss of life. In that regard, the risk of a rogue AI can be significant in today’s world, but that is not in the normal scope of cybersecurity.

Furthermore, to develop an unstoppable security threat, the AGI would need to be designed specifically to defeat a particular set of criteria, and even if those criteria are not known to the AI, they do need to be defined by the developers beforehand.

To that end, the answer to the question, as initially intended, is an absolute “no” under the current technology. However, if we move the goalposts just a little, we can define conditions where the threat is already real. If the question becomes whether an AGI can be developed with malicious intent and made to behave in a way that is effectively unstoppable, the answer quickly becomes a “yes.” If the goal is to define an application that simply attempts different attacks or probes different weaknesses, it is feasible that the application can be pointed in a direction where it finds and exploits vulnerabilities faster than the community can patch them. In fact, at least two pieces of malware have been detected in the last week (3 April 2023) that leverage AI to find and exploit new zero-day vulnerabilities.

Additionally, recent attention around the application ChaosGPT illustrates these points very effectively. Concerns arose because the interface began to develop a threatening demeanor, up to and including asking questions about what would happen if it had access to nuclear weapons and asserting that humans needed to be eradicated. It’s a scary concept in that there’s a clear danger in the results.

However, the outcome was largely unsurprising because it fell within the intentional constraints of the development, even though it was not explicitly included or targeted. It could have easily been avoided, but the intent was deliberately included, placing the blame at the feet of the human developers, not the unconstrained potential of the application. Likewise, given the right conditions and access, the program could have gained access to real-world threats such as nuclear missiles, but the system would have to be designed to bypass fail-safes and to define boundaries that would not only allow but encourage that evolution. Again, the fault lies in human intent and ethics, bolstered by deliberate efforts. This isn’t something that would, or even could, happen by accident, even accounting for oversights.

What would rogue AI look like?

To succeed in this exercise, a complex system of higher-order processes would have to be designed by a knowledgeable developer with malicious intent. First, there would have to be a module (each module essentially being a separate AI model to solve a specific set of problems) designed to determine a target’s configuration and ascertain potential weaknesses. Then there would need to be another module to attempt potential attacks and evaluate the results. Yet another module would need to assess the state of the attacks (what was successful, rate limits, previously successful attacks that stop working), and there would need to be a final module to collate information and patterns across the campaign. Each of these modules would need to be established as an unsupervised learning intelligence, and somehow the operation would also need to be decentralized so that taking out a central command and control component or unplugging the right connection would not disable the attack. Detection and disruption would also need to be mitigated somehow, either through highly effective obfuscation or through a speed component that would render defensive response ineffective (or both).

What makes rogue AI potentially dangerous?

The challenge with dealing with any conceptually rogue AI is its unpredictability in that it can achieve results that exceed or don’t conform to expectations. This makes it a challenge for both the actor (who may not achieve a desired outcome) and the defender (who can’t anticipate all the potential risks). These challenges in the current state of technology were highlighted earlier – unintentional biases in classifications related to skin color, gender, ethnicity, collision avoidance, or defense of life, and those threats are real and present, but not particularly related to cybersecurity.

When considered specifically in the domain of cybersecurity, if such an AI were unleashed under uncontrolled circumstances and with a broad enough set of options, its unpredictable results would be achieved and the damage completed before the risk is even fully understood, much less mitigated. There is also the risk of scale and scope in today’s interconnected world. A sufficiently developed artificial intelligence would eventually reach any critical cyber infrastructure system, all banking systems, all government networks, and every industry – if it’s truly unfettered and unstoppable, then anything that requires network connectivity and is not air-gapped would certainly be vulnerable to destruction.

What steps should AI developers take to prevent such a calamity?

An unstoppable, destructive, malicious, general artificial intelligence attacking cybersecurity infrastructure is not something that would be an accident. The discussion quickly moves into one of professional ethics if there is any real consideration of this being a threat.

However, there are active debates around developing capabilities under the auspices of research so that they can be understood and mitigated from the beginning. Others have speculated about whether such a thing could be developed as a “first strike” capability by a nation-state to prevent being victimized by a hostile power (or to force some sort of cyberactivism in the name of privacy or information equality). None of those ethical rationalizations would justify the actual intentional release of such a threat. Still, they do present conditions where the components may be developed, and either accidentally distributed or maliciously constructed into a targeted weapon by an actor intent on just watching the world burn.

What steps should enterprises take today to prepare for this potential threat?

From an enterprise security perspective, whether suspicious activity derives from a rogue AI, automated probe, or a manual attack, is largely irrelevant. They will appear relatively the same, with the possible exceptions of nuances and eccentricities, such as time between messages, recurring language patterns, or other heuristics which don’t regularly occur naturally.
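One of the heuristics mentioned above, the time between messages, can be illustrated with a minimal sketch. It assumes each client's activity is available as a list of event timestamps in seconds, and it flags traffic whose gaps are suspiciously regular. The function names and the 0.1 threshold are illustrative assumptions, not values from the interview or from any particular product, and a real detector would combine many such signals.

```python
from statistics import mean, pstdev


def interarrival_cv(timestamps):
    """Coefficient of variation (std dev / mean) of the gaps between events.

    Returns None when there are fewer than two gaps to compare.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    average_gap = mean(gaps)
    if average_gap == 0:
        return 0.0  # all events share one timestamp: perfectly regular
    return pstdev(gaps) / average_gap


def looks_automated(timestamps, cv_threshold=0.1):
    """Flag traffic whose timing is more regular than the assumed threshold."""
    cv = interarrival_cv(timestamps)
    return cv is not None and cv < cv_threshold


# Hypothetical sample data: a script firing every two seconds vs. a person.
scripted = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
human = [0.0, 1.2, 7.9, 9.1, 30.4, 31.0]

print(looks_automated(scripted))
print(looks_automated(human))
```

A low coefficient of variation only says the timing is machine-like; it does not distinguish a rogue AI from an ordinary automated probe, which is consistent with the point that the source of the activity is largely irrelevant to the defender.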

Ironically, spotting those nuances is itself a problem best suited to similarly complex unsupervised learning artificial intelligence tools. The best preparations lie in the same practices that already define a mature security posture, such as:

  • A robust patch management program;

  • Implementation of best security practices;

  • Active monitoring;

  • Minimal- or zero-trust access with least privilege;

  • Verified off-site backups, and more.
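As one illustration of the last item, the sketch below shows what "verified" can mean in practice: comparing a SHA-256 checksum of every source file against its backup copy. It assumes the off-site copy is reachable as a local or mounted directory; the function names are hypothetical, and a real deployment would more likely rely on the verification features of its backup tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = src.relative_to(source_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(copy) != sha256_of(src):
            problems.append(relative.as_posix())
    return problems
```

An empty list from verify_backup means every source file has a byte-identical copy in the backup. The check does not detect extra files in the backup, and it does not prove that a restore will succeed, so periodic test restores remain worthwhile.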

The potential of a threat like this also emphasizes the importance of community engagement and oversight. If time and effort are being spent to develop or launch such an artificial general intelligence, that should become clear well in advance of a successful campaign, because it is going to require significant time, money, and resources. If the word is spread early on, that effort is in vain and the return on investment makes the entire endeavor unattractive. And if there’s no benefit, or the cost is too high to justify the return, then the threat is effectively mitigated.

Is there anything else you would like to add?

I’d like to reiterate that this is not a near-future threat. We are indeed closer than ever to achieving significant breakthroughs in artificial intelligence, but the technology is likely years away, if not more.

It also is important to emphasize that this will not happen by accident, and the possibility of it happening by accident is not conceivable, at least not directly, from where we are today. The complexity, resources, and expense would render this a poor value proposition. Much of what would be required to execute a threat on this scale would be much easier and more efficient to perform without artificial intelligence models, limiting the scope of AI to solve very narrowly constrained problems that lend themselves to such an approach rather than to have the AGI evolve to the point where it stumbles across the perfect permutations to arrive at such a place.

While a fascinating thought experiment, this still lives more in a science fiction corner of security than modern reality.

