top of page

Cyborg Unveils Encrypted RAG Blueprint to Secure the AI Knowledge Stack

Cyborg has released a new blueprint designed to solve one of AI’s most urgent paradoxes: how to harness retrieval-augmented generation (RAG) without exposing the very knowledge it centralizes.


The Cyborg Enterprise RAG Blueprint, now available on NVIDIA’s build portal and GitHub, introduces the first open-source framework for fully encrypted RAG pipelines—securing vector embeddings and AI queries even while in use. The launch represents a leap forward in confidential AI architecture, powered by NVIDIA Nemotron open models, NeMo Retriever microservices, and GPU-accelerated computing.


“Today’s organizations want to unlock value from AI by centralizing their knowledge into a single vector database to make models more capable and context-aware,” said Nicolas Dupont, Founder and CEO of Cyborg. “That consolidation is fundamental, but it also creates a smaller attack surface with a much larger potential breach radius. Vector databases can therefore become an organization’s biggest liability or its greatest strength. Encryption-in-use addresses this paradox by enabling enterprises to embrace AI confidently without turning innovation into exposure.”

The Hidden Vulnerability Inside AI Memory


As AI models grow smarter through RAG—pulling from enterprise documents, codebases, and internal chat logs—those vector embeddings quietly become treasure maps to sensitive data. Organizations from healthcare to finance are consolidating massive knowledge stores into vector databases, which can inadvertently expose plaintext during search or query processing.


Security experts, including the OWASP Foundation, have flagged embeddings as an emerging threat surface. Traditional databases secure data “at rest” or “in transit,” but not “in use”—meaning information is decrypted in memory while being queried. Cyborg’s new architecture removes that weak link entirely.


Inside the Cyborg Enterprise RAG Blueprint


At its core, the blueprint fuses CyborgDB’s encryption-in-use engine with NVIDIA’s accelerated AI infrastructure. When data is parsed into embeddings via NeMo Retriever, those vectors are immediately cryptographically indexed into encrypted tokens. The tokens remain encrypted throughout their lifecycle—during storage, retrieval, and ranking—ensuring no plaintext ever exists in memory, caches, or logs.


When users issue queries, CyborgDB performs encrypted retrieval while forward-secure indexing protects against reconstruction attacks. Enterprises retain complete control of their encryption keys, which are generated and managed locally.


The result: sub-10 millisecond encrypted queries without compromising speed or accuracy—a milestone for confidential AI computing.


Built for Enterprise-Grade Performance


The blueprint’s deployment guide offers both Docker and Kubernetes configurations. A typical production setup requires either dual NVIDIA H100 GPUs or equivalent clusters of A100s, while a lighter configuration can run via NVIDIA’s NGC-hosted NIM service.


It ships with:


  • NVIDIA AI software stack (NeMo Retriever, Llama Nemotron 3.3)


  • CyborgDB with GPU acceleration via NVIDIA cuVS


  • Multimodal PDF parsing, table and chart extraction


  • NeMo Guardrails for contextual safety


  • OpenAI-compatible APIs and sample UI


In other words: a turnkey system for building a fully confidential AI knowledge base—one that scales across documents, multimodal data, and hybrid search, while remaining opaque to anyone but its rightful owners.


A Blueprint for the Future of Confidential AI


With this release, Cyborg positions itself as the architect of “confidential cognition”—AI that can think deeply without leaking secrets. By turning encryption-in-use into a production-ready reality, the company is closing one of enterprise AI’s most dangerous gaps.


The Cyborg Enterprise RAG Blueprint is available now on build.nvidia.com and GitHub.

bottom of page