External RAG: the grounding pipeline you own

You go off-platform for grounding when the corpus isn't where Salesforce can reach it. The knowledge lives in document stores, wikis, product manuals, or systems no platform retriever indexes; or the pipeline needs a shape you control end to end; or the data spans sources beyond a single Data 360 profile. That is when an external RAG pipeline is the right instrument (principle 7) — not because it is more advanced, but because it fits a corpus the platform path doesn't reach. This is the "beyond Salesforce" side of grounding, and it earns its place exactly when the data has left the platform.

The one sentence that defines the external path: you own the pipeline. Agentforce retrievers hand you the search index, the retriever, hybrid search, and grounding inside the Salesforce security model for free; you point them at a Data 360 profile, not at a pipeline you assemble. Off-platform, you build every stage yourself. That is the cost and the freedom in the same breath, and this page is mostly about being honest about both. The pipeline itself, from chunk to grounded answer, is covered in What is grounding; this page does not re-teach it. It is about who builds it.

The stack: four instruments, composed

An external RAG pipeline is usually four pieces working together, and Cleon composes them rather than treating any one as the whole answer:

LangChain: orchestration. It wires the pipeline together: loaders that pull source documents in, splitters that chunk them, the retriever that searches at query time, and the chain that hands the retrieved context to the model. The pipeline mechanics (chunk, embed, store, retrieve, augment, generate) are covered in What is grounding and in Chunking and embeddings. What matters here is who assembles the chain: off-platform, you do. There is no managed retriever; the chain you build is the pipeline.
A vector store you run: the index. Off-platform the embeddings live somewhere you operate: a dedicated vector database, or a vector extension on a database you already run. Either way it is yours to stand up, index, and keep healthy; the platform's managed search index is exactly what you are replacing. The right choice is a category decision, not a single product, and it turns on scale, the filtering you need, and what you already operate.
An embeddings provider, paired with Claude. Recall from Chunking and embeddings that there is no first-party Claude embedding model. You pair Claude, for the reasoning and the generation, with a dedicated embeddings provider for the vectors: two components from two vendors. This is normal; embeddings are their own specialty. The model that writes the answer and the model that embeds the chunks are not the same one.
The Claude API: generation. The retrieved chunks, the question, and the instructions go to the Claude API, which reads the augmented prompt and writes the grounded answer. Direct API access is what gives you control over how you call it and a capability path the platform may not expose. It is the same direct-call relationship the external agent uses for its reasoning core.

None of these competes with the platform path. They are the instruments you reach for when the corpus is off-platform; the skill is composing them, not defending them against Agentforce retrievers.
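
The composition can be pictured with a minimal sketch. This is plain Python rather than LangChain code, so the wiring stays visible: each stage is a callable you inject. The names `Chunk`, `build_prompt`, `answer`, and the `fake_*` stand-ins are hypothetical; in a real pipeline the callables would wrap your embeddings provider, your vector store, and the Claude API.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Chunk:
    source: str  # where the chunk came from, e.g. a manual or a wiki page
    text: str


# Hypothetical stage signatures; each would wrap a component you operate.
Embed = Callable[[str], Sequence[float]]                # embeddings provider
Search = Callable[[Sequence[float], int], List[Chunk]]  # vector store query
Generate = Callable[[str], str]                         # generation call


def build_prompt(question: str, chunks: List[Chunk]) -> str:
    """Augment step: place numbered context and instructions before the question."""
    context = "\n\n".join(
        f"[{i + 1}] ({c.source}) {c.text}" for i, c in enumerate(chunks)
    )
    return (
        "Answer using only the numbered context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer(question: str, embed: Embed, search: Search,
           generate: Generate, k: int = 4) -> str:
    query_vector = embed(question)           # embed the query
    chunks = search(query_vector, k)         # retrieve the top-k chunks
    prompt = build_prompt(question, chunks)  # augment
    return generate(prompt)                  # generate


# Stand-in stages so the sketch runs without any external service.
def fake_embed(text: str) -> Sequence[float]:
    return [float(len(text))]


def fake_search(vector: Sequence[float], k: int) -> List[Chunk]:
    corpus = [Chunk("manual.pdf", "Hold the power button for ten seconds to reset.")]
    return corpus[:k]


def fake_generate(prompt: str) -> str:
    return f"(model output for a prompt of {len(prompt)} characters)"


result = answer("How do I reset the device?", fake_embed, fake_search, fake_generate)
print(result)
```

Injecting the stages keeps each one replaceable: swapping the vector store or the embeddings provider changes one callable, not the chain.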

What you own off-platform

This is the heart of the difference. Inside the Salesforce security model, Agentforce retrievers give you the index, the retriever, hybrid search, and security-model grounding by construction — you supply a clean Data 360 profile and point the retriever at it. Off-platform, every one of those stages is yours to build, tune, and keep working:

Chunking and the embedding-model choice. How you cut the documents and which embedding model you run — the two upstream levers from chunking and embeddings — are decisions you make and live with. Get them wrong and no downstream tuning recovers it.
The vector store and its index. You stand it up, you load it, you keep it indexed and fast as the corpus grows. The managed search index did this for you; here it is operations you own.
Retrieval tuning. Hybrid search, re-ranking, the k you retrieve, metadata filtering — the whole retrieval quality toolkit is a set of levers you pull yourself against your own numbers, not defaults handed to you.
Permission filtering at retrieval. This is the one a demo never shows. The platform filtered by the running user's sharing rules by construction; off-platform you build that filter, on metadata you indexed for exactly it, applied before ranking. Skip it and the index returns chunks the user could never open through the front door — gotcha 8 in grounding gotchas, and it is not optional.
Freshness and re-indexing. The source of truth changes; your index does not change with it unless you make it. The refresh path, the staleness contract, the incremental re-embed — all yours to design before you ship, or the index quietly ages into fiction.
The eval set. Retrieval quality is only trustworthy once it is measured (see Retrieval quality). Off-platform, the eval set that maps each query to the chunk that should come back is yours to build and to run on every change.
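
To make the permission filter concrete, here is a minimal sketch of filtering before ranking. It assumes each chunk was indexed with an `allowed_groups` field; that field, `IndexedChunk`, and `retrieve_for_user` are hypothetical names, not part of any particular vector store. Many stores let you push such a filter into the query itself, which is usually preferable to filtering a candidate list in application code.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class IndexedChunk:
    chunk_id: str
    score: float                    # similarity score from the vector search
    allowed_groups: FrozenSet[str]  # access metadata indexed with the chunk


def retrieve_for_user(candidates: List[IndexedChunk],
                      user_groups: Set[str], k: int) -> List[IndexedChunk]:
    # Filter first: drop every chunk the user could not open directly.
    visible = [c for c in candidates if c.allowed_groups & user_groups]
    # Rank only what survived, so a restricted chunk can never take a top-k slot.
    ranked = sorted(visible, key=lambda c: c.score, reverse=True)
    return ranked[:k]


candidates = [
    IndexedChunk("hr-1", 0.95, frozenset({"hr"})),
    IndexedChunk("kb-1", 0.80, frozenset({"support", "hr"})),
    IndexedChunk("kb-2", 0.60, frozenset({"support"})),
]

top = retrieve_for_user(candidates, user_groups={"support"}, k=2)
print([c.chunk_id for c in top])  # ['kb-1', 'kb-2']
```

The highest-scoring chunk, `hr-1`, never reaches the prompt for a support user. Filtering after taking the top k would instead return fewer results than asked for, or leak the restricted chunk if the filter were forgotten.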

This is principle 4 stated plainly — the model is the easy part; the pipeline is the job. Calling the Claude API over some retrieved text is an afternoon. Building the chunking, the index, the retrieval tuning, the permission filter, the freshness contract, and the eval set around it is the quarter, and off-platform that whole bill is yours.
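
As an illustration of the eval set in that list, a minimal sketch of a hit-rate check: for each query, does the expected chunk appear in the top k results? The queries, chunk ids, and `fake_retrieve` stand-in are hypothetical; in practice `retrieve` would call your own retriever.

```python
from typing import Callable, Dict, List


def hit_rate_at_k(eval_set: Dict[str, str],
                  retrieve: Callable[[str, int], List[str]],
                  k: int) -> float:
    """Fraction of queries whose expected chunk id appears in the top-k results."""
    if not eval_set:
        raise ValueError("eval_set must contain at least one query")
    hits = sum(
        1 for query, expected in eval_set.items() if expected in retrieve(query, k)
    )
    return hits / len(eval_set)


# Hypothetical eval set: query -> id of the chunk that should come back.
eval_set = {
    "how do I reset the device": "manual-12",
    "what is the warranty period": "policy-3",
    "which ports does the hub have": "manual-40",
    "how do I file a claim": "policy-9",
}

# Canned results standing in for a real retriever.
canned = {
    "how do I reset the device": ["manual-12", "manual-13"],
    "what is the warranty period": ["policy-1", "policy-3"],
    "which ports does the hub have": ["manual-40"],
    "how do I file a claim": ["policy-2", "faq-7"],
}


def fake_retrieve(query: str, k: int) -> List[str]:
    return canned.get(query, [])[:k]


score = hit_rate_at_k(eval_set, fake_retrieve, k=2)
print(score)  # 0.75
```

Running the same set after every change to chunking, the embedding model, or k turns "retrieval feels better" into a number you can compare.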

When external is the right call

The external path is the right instrument, not the advanced one, in a specific set of cases:

The corpus lives outside Salesforce: the documents, the knowledge base, the manuals genuinely sit elsewhere, and there is no reason to pull them into a Data 360 profile just to retrieve over them.
A pipeline or vector store the platform doesn't offer: you need a specific chunking strategy, retrieval technique, or vector store that Salesforce doesn't provide, and direct control of the pipeline is what gets it.
Multi-source beyond Data 360: grounding has to span several sources at once, more than a single platform profile is suited to, so you assemble the corpus yourself.
Portability across hosts: you want the pipeline and its index to move between environments rather than be welded to one platform.

If none of those is true, the data is already in Data 360, and the work lives in the Salesforce security model, the platform path is very likely the better instrument. External is not the upgrade; it is the right answer to a different question. If the corpus is governed customer data sitting in a clean Data 360 profile, building a separate pipeline to retrieve over a copy of it is usually a step backward, not forward.

The cost of leaving the platform

Here is the part a demo never shows. Everything Agentforce retrievers gave you for free is now yours to build and keep working:

The search index. No managed index over a Data 360 profile — you stand up the vector store, load it, and keep it indexed as the corpus grows. The operations the platform absorbed are now yours.
Hybrid search and re-ranking. No retriever that already merges semantic and keyword search for you. The retrieval quality levers are yours to build, wire, and tune against your own eval set.
Security-model grounding. No sharing rules, field-level security, or "runs as this user" baked into retrieval. Whatever access control the corpus needs, you implement and enforce as a permission filter at query time — and a pipeline that skips it is gotcha 8 waiting to happen.
The freshness contract. No platform keeping the index current with the record. You design how an edit reaches the index, how long the lag is, and which sources change often enough to need more than a nightly rebuild.
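
One common way to keep a re-index incremental is to store a content fingerprint per document and re-embed only what changed. The sketch below assumes you keep such fingerprints alongside the index; `fingerprint` and `plan_reindex` are hypothetical helper names, and the document ids are made up.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(text: str) -> str:
    """Stable content hash, stored next to the document's vectors at index time."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(source: Dict[str, str],
                 indexed: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Compare the source of truth against the fingerprints stored in the index.

    source:  doc_id -> current text
    indexed: doc_id -> fingerprint recorded at the last index run
    Returns (ids to embed or re-embed, ids to delete from the index).
    """
    to_embed = sorted(
        doc_id for doc_id, text in source.items()
        if indexed.get(doc_id) != fingerprint(text)
    )
    to_delete = sorted(doc_id for doc_id in indexed if doc_id not in source)
    return to_embed, to_delete


source = {"a": "unchanged text", "b": "edited text", "c": "brand new page"}
indexed = {
    "a": fingerprint("unchanged text"),
    "b": fingerprint("original text"),
    "d": fingerprint("page that was removed"),
}

to_embed, to_delete = plan_reindex(source, indexed)
print(to_embed, to_delete)  # ['b', 'c'] ['d']
```

Deletions matter as much as edits: a page removed at the source but still in the index is exactly how an index ages into fiction.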

This is principle 4 again, from the grounding side. The model is an afternoon; the pipeline is the quarter. Off-platform the whole pipeline is yours to build and operate — and principle 6 holds the entire time, because the embedding bill and the re-index latency are line items you design for, not surprises you discover when the corpus hits a million documents. Budget for it before you leave the platform, not after the first incident asks why retrieval surfaced a chunk a user was never allowed to read.
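
A back-of-envelope calculation shows why the embedding bill and re-index time are worth estimating up front. Every number below is a placeholder assumption, not a vendor price or a measured throughput; substitute your own.

```python
from typing import Dict


def embedding_budget(documents: int, chunks_per_doc: int, tokens_per_chunk: int,
                     price_per_million_tokens: float,
                     chunks_per_second: float) -> Dict[str, float]:
    """Rough cost and wall-clock time for embedding a whole corpus once."""
    total_chunks = documents * chunks_per_doc
    total_tokens = total_chunks * tokens_per_chunk
    return {
        "total_chunks": total_chunks,
        "total_tokens": total_tokens,
        "embedding_cost": total_tokens / 1_000_000 * price_per_million_tokens,
        "full_reindex_hours": total_chunks / chunks_per_second / 3600,
    }


# Placeholder inputs: a million documents, 20 chunks each, 500 tokens per chunk,
# a made-up price of 0.25 per million tokens, and 500 chunks embedded per second.
budget = embedding_budget(
    documents=1_000_000,
    chunks_per_doc=20,
    tokens_per_chunk=500,
    price_per_million_tokens=0.25,
    chunks_per_second=500,
)
print(budget)
```

With these placeholder inputs a full rebuild is 20 million chunks and roughly 11 hours of embedding time, which is the argument for an incremental refresh path rather than a nightly rebuild at that scale.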

Composition, not competition

The honest picture is not external or platform — it is often both. A real system grounds Agentforce over a Data 360 profile for the in-platform, governed data where the work lives in the security model, and runs an external RAG pipeline for a corpus that lives elsewhere — a product manual, a document store, a knowledge base no platform retriever indexes — composed into one grounded experience. Agentforce retrievers own the part that needs the platform's security model; the external pipeline owns the part whose corpus is off-platform. Neither is winning; each is grounding the data it is the right instrument for.
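
The composition described above can be sketched as gathering context from both paths and labelling each passage with its origin. This is an illustration only: the two retrievers are stand-in callables, and how you actually invoke an Agentforce retriever is not shown or assumed here. `gather_context` and the sample passages are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A retriever takes a question and a result count and returns passages.
Retriever = Callable[[str, int], List[str]]


def gather_context(question: str, retrievers: Dict[str, Retriever],
                   k_per_source: int = 3) -> List[Tuple[str, str]]:
    """Query every retriever and keep track of which path each passage came from."""
    labeled: List[Tuple[str, str]] = []
    for name, retrieve in retrievers.items():
        for passage in retrieve(question, k_per_source):
            labeled.append((name, passage))
    return labeled


# Stand-ins: one for the governed in-platform data, one for the external corpus.
def platform_retriever(question: str, k: int) -> List[str]:
    return ["Account tier: Premium, renewal due in March."][:k]


def external_retriever(question: str, k: int) -> List[str]:
    return ["Manual section 4.2: Premium units support remote reset.",
            "Manual section 7.1: Firmware updates require a wired connection."][:k]


context = gather_context(
    "Can this customer reset the unit remotely?",
    {"platform": platform_retriever, "external": external_retriever},
    k_per_source=1,
)
for origin, passage in context:
    print(f"[{origin}] {passage}")
```

Keeping the origin label on each passage lets the answer cite which path grounded which claim, and keeps each path's access rules enforced by the retriever that owns that data.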

That is the whole stance of this subcategory restated from the off-platform side. The external path is the control and reach you get when the corpus has left the platform — and a pipeline you signed up to build and operate the moment you did. Compose it with the platform path (principle 7), respect what you now have to build yourself (principle 4), and the off-platform pipeline becomes an instrument you control rather than one that quietly misinforms.

Related

Agentforce retrievers: the complementary platform path, where the index, the retriever, and security-model grounding come baked in
What is grounding: the chunk → embed → store → retrieve → augment → generate pipeline this page builds without re-teaching
Chunking and embeddings: the upstream levers you own off-platform, and why you pair Claude with an embeddings provider
Retrieval quality: the hybrid search, re-ranking, and k tuning you wire yourself off-platform
Grounding gotchas: the permission-leakage, stale-index, and empty-retrieval failures you own when you leave the platform
External agents: the off-platform agent counterpart, where you own the loop just as you own the pipeline here
Grounding Style Guide: the bar an external pipeline clears before it ships
AI Engineering principles: compose the toolkit (7), the system is the job (4), cost and latency are features (6)
