Data 360 Query & Insights: Style Guide

This is the page where Cleon stops describing what Data 360 (formerly Data Cloud) query is and starts saying what we do with it. Salesforce defines the surfaces. The reference pages in this subcategory document each one — the SQL dialect, the Query API, Calculated Insights, who consumes a result — and the gotchas document where the SQL instinct misleads. This Style Guide is the discipline that keeps a number trustworthy once a segment, an activation, and an agent are all standing on it.

Use it as a checklist before any new query or Calculated Insight ships. The rules are short on purpose — when a rule needs an explanation, the explanation is in the page it links to. One decision sits above all the others, so it goes first.

The central decision: Calculated Insight vs. live Query

Almost every query choice in Data 360 reduces to one fork: do you compute this metric once as a Calculated Insight (CI) and let everyone retrieve it, or do you run it live through the Query API each time it's asked? Get this right and the rest of the subcategory is detail. Get it wrong and you either pay to recompute the same number for every consumer, or you freeze a one-off pull into infrastructure nobody maintains.

Three questions decide it. Answer them in order.

1. Is this metric retrieved by more than one consumer, or by one consumer repeatedly?

A number that a segment, an activation, and an agent all read is a number you compute once and serve — a Calculated Insight. So is a number one consumer reads on every refresh. The moment "who needs this" has more than one answer, or "how often" is "again and again," you are past the threshold for a CI. A pull that happens once, for one caller, with no plan to repeat, is a Query API call — live, unpersisted, gone when it returns.

2. Does it need history or a profile join, or is it a windowed, incoming-events metric?

Lifetime value, all-time order count, an engagement score over months — these need the full history and a join back to the resolved profile. That is a batch Calculated Insight, and only batch can express it. "Clicks in the last 30 minutes," a spike off the incoming event stream with no profile attribute — that is the narrow window a streaming CI covers, and reaching for it without knowing its limits is the classic mistake (see Calculated Insights on batch vs. streaming). If the metric is neither — a genuinely ad-hoc shape you don't know in advance — it's a live Query API call, not a CI at all.

3. How fresh must it be?

Freshness is not "as fresh as possible." It is as fresh as the freshest decision the number feeds. A metric a human reviews once a day does not need an hourly recompute; a real-time trigger cannot wait on a daily batch. The freshness requirement sets the CI's cadence — or, if the answer is "exactly now, every time," it pushes you to a live query. Decide it on purpose and write it down.

This is principle 7 of the Data 360 principles from production, stated as a decision you make at a keyboard: compute once, retrieve many. The Query API is for questions asked once, by code; a Calculated Insight is for an answer retrieved many times, by everyone.
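The three questions can be read as a small decision procedure. The sketch below is one way to encode them, answered in the order given above; the `MetricProfile` fields and the `Surface` names are illustrative assumptions, not part of any Salesforce API.

```python
from dataclasses import dataclass
from enum import Enum


class Surface(Enum):
    BATCH_CI = "batch Calculated Insight"
    STREAMING_CI = "streaming Calculated Insight"
    LIVE_QUERY = "live Query API call"


@dataclass(frozen=True)
class MetricProfile:
    """Hypothetical description of a metric; field names are illustrative only."""
    consumers: int                  # how many distinct consumers read the number
    read_repeatedly: bool           # does any consumer read it again and again?
    needs_history_or_profile: bool  # lifetime horizon or a join to the resolved profile
    windowed_event_metric: bool     # short window over incoming events, no profile attribute
    must_be_exact_now: bool         # "exactly now, every time"


def choose_surface(m: MetricProfile) -> Surface:
    # Question 1: a pull that happens once, for one caller, is a live query.
    if m.consumers <= 1 and not m.read_repeatedly:
        return Surface.LIVE_QUERY

    # Question 2: history or a profile join means batch; a windowed
    # event metric means streaming; anything else is ad hoc, so live.
    if m.needs_history_or_profile:
        candidate = Surface.BATCH_CI
    elif m.windowed_event_metric:
        candidate = Surface.STREAMING_CI
    else:
        return Surface.LIVE_QUERY

    # Question 3: "exactly now, every time" pushes the metric to a live query.
    return Surface.LIVE_QUERY if m.must_be_exact_now else candidate
```

For example, a lifetime-value metric read by a segment, an activation, and an agent (`consumers=3`, `needs_history_or_profile=True`) comes out as a batch CI, while a single unrepeated pull comes out as a live query regardless of its shape.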

SQL conventions

The conventions that keep a Data 360 query readable and correct a year after it's written. Each is one rule; the dialect page carries the detail.

Query DMOs, not DLOs

Build every query on the harmonized DMO, not the raw DLO. A query against a DLO inherits the source system's naming and mess, and breaks silently the day that source renames a column (see gotchas — gotcha 1). The lake is a landing zone, not a query surface.

Use `ssot__`-namespaced identifiers, unquoted — quote only when you must

Data 360 identifiers carry the ssot__ namespace and __dlm / __dll / __cio suffixes, and they are case-mixed. Write them unquoted for readability; reach for double quotes only when a name would otherwise fold its case or collide with a reserved word. Quote the exception, not the rule.

Prefer single-object aggregations

A Calculated Insight, and most queries, read cleanest as an aggregation over one object at the right grain. Multi-object CIs are real and necessary, but every join is a dependency on a modeled relationship — keep the shape as flat as the metric allows, and add objects only when the number genuinely requires them.

Traverse source → unified through `IndividualIdentityLink__dlm` — never a direct join

A source-aligned object does not join straight to UnifiedIndividual__dlm. The path runs through the IndividualIdentityLink__dlm bridge that identity resolution maintains (the exact link and source field names follow your org's model). A direct join is not a path the engine offers — and a traversal nobody modeled is a question the query simply can't ask (see gotchas — gotcha 8).
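As a rough illustration of the two-hop shape, the sketch below assembles a query that goes source → link → unified. Every object and field name other than `IndividualIdentityLink__dlm` and `UnifiedIndividual__dlm` is an assumption (the defaults, including `SourceRecordId__c` and `UnifiedRecordId__c`, are placeholders); substitute whatever your org's model actually exposes.

```python
def unified_traversal_sql(
    source_object: str = "ssot__Individual__dlm",   # assumed source-aligned DMO
    source_key: str = "ssot__Id__c",                # assumed key on the source DMO
    link_source_field: str = "SourceRecordId__c",   # assumed link -> source field
    link_unified_field: str = "UnifiedRecordId__c", # assumed link -> unified field
    unified_key: str = "ssot__Id__c",               # assumed key on the unified DMO
) -> str:
    """Build a source -> link -> unified query; never a direct source -> unified join.

    The names are interpolated as-is, so pass only identifiers you control.
    """
    return "\n".join([
        f"SELECT u.{unified_key} AS unified_id, COUNT(s.{source_key}) AS source_records",
        f"FROM {source_object} s",
        # Hop 1: source record to the identity-resolution bridge.
        f"JOIN IndividualIdentityLink__dlm l ON l.{link_source_field} = s.{source_key}",
        # Hop 2: bridge to the unified profile.
        f"JOIN UnifiedIndividual__dlm u ON u.{unified_key} = l.{link_unified_field}",
        f"GROUP BY u.{unified_key}",
    ])


print(unified_traversal_sql())
```

The point of the shape is that the source object and the unified object never appear in the same join condition; the bridge sits between them on both hops.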

Remember CI authoring uses a different dialect than reading the CIO

Creating a Calculated Insight uses a different SQL dialect than the one you read it back with. Salesforce documents this plainly. Authoring a CI has its own rules and, for streaming, the WINDOW syntax; reading the resulting Calculated Insight Object (CIO) is ordinary Data Cloud SQL. A function that works in one may not exist in the other. Don't conflate them.

Cost discipline

Process less — cost scales with what you process, not what you store

Data 360 bills the work: the rows a query scans, the data a CI recomputes — far more than the data at rest (principle 11). A CI that re-aggregates the whole profile every hour, or a segment that scans everything on each refresh, is a cost decision wearing a logic decision's clothes. Design for how much you process, and revisit the expensive ones. The cheapest query is the one you didn't run — and the second cheapest is the Calculated Insight you computed once instead of re-deriving on every consumer.
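A back-of-envelope comparison makes the difference concrete. The numbers below are invented for illustration, and the sketch is not a billing model: it counts only rows processed by the aggregation itself and ignores the cost of reading the precomputed result.

```python
def rows_processed_per_day(rows_per_run: int, runs_per_day: int) -> int:
    """Rows the engine works through in a day for one metric."""
    return rows_per_run * runs_per_day


# Illustrative assumptions, not measurements.
ROWS_PER_RUN = 5_000_000          # rows one full aggregation scans
CONSUMERS = 3                     # e.g. a segment, an activation, an agent
READS_PER_CONSUMER_PER_DAY = 24   # each consumer asks hourly

# Live query: the aggregation re-runs on every read by every consumer.
live_total = rows_processed_per_day(ROWS_PER_RUN, CONSUMERS * READS_PER_CONSUMER_PER_DAY)

# Calculated Insight on a daily cadence: the aggregation runs once.
ci_total = rows_processed_per_day(ROWS_PER_RUN, 1)

print(f"live: {live_total:,} rows/day")                 # live: 360,000,000 rows/day
print(f"CI:   {ci_total:,} rows/day")                   # CI:   5,000,000 rows/day
print(f"ratio: {live_total // ci_total}x")              # ratio: 72x
```

The ratio is simply the number of reads the CI absorbs, which is why the saving grows with every consumer added.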

Freshness discipline

Set a CI's cadence to the freshest decision it feeds — and write the cadence down

A Calculated Insight is exactly as fresh as its last run, and the staleness is silent: between runs it serves the last number it computed, with nothing in the consumer to say the world moved on (principle 6). Set the cadence to the freshest decision the CI feeds — not to "as often as possible," which pays for freshness nobody consumes, and not to a daily batch behind a real-time trigger. Then write the cadence down next to the CI, so the next person doesn't assume real-time where there's a 24-hour lag.
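The rule can be checked mechanically once both durations are written down. The sketch below assumes a CI can be up to one full cadence old just before its next run, and uses an arbitrary `slack` factor to flag a cadence that is much faster than anything consuming it; both the function and the threshold are illustrative, not a Data 360 feature.

```python
from datetime import timedelta


def review_cadence(
    cadence: timedelta,
    freshest_decision: timedelta,
    slack: float = 0.5,  # assumed threshold: under half the need counts as over-eager
) -> str:
    """Compare a CI's refresh cadence with the freshest decision it feeds."""
    if cadence > freshest_decision:
        # Worst-case staleness exceeds what the decision tolerates.
        return "too slow"
    if cadence < freshest_decision * slack:
        # Paying for freshness nobody consumes.
        return "over-eager"
    return "fits"


# A daily batch behind a trigger that needs 30-minute freshness.
print(review_cadence(timedelta(hours=24), timedelta(minutes=30)))  # too slow
# An hourly recompute for a number a human reviews once a day.
print(review_cadence(timedelta(hours=1), timedelta(hours=24)))     # over-eager
# A daily batch for a daily review.
print(review_cadence(timedelta(hours=24), timedelta(hours=24)))    # fits
```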

Patterns to prefer

A Calculated Insight for any number read by more than one consumer — compute once, retrieve many.
Batch CIs by default, streaming only when the metric is genuinely a windowed reaction off the event stream.
Single-object aggregations at the grain the metric is actually true at.
Unquoted ssot__ identifiers, quoting only where a name would otherwise fold its case or collide with a reserved word.
The full retrieval loop on every Query API call — page to done, or poll the queryId to the end — written even when today's result fits one page.
The cadence written next to the CI, in the model doc, not in someone's memory.
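For the retrieval-loop pattern, the shape matters more than the transport. The sketch below assumes a hypothetical `fetch_page` callable that returns a dict with `rows`, `done`, and `next` keys; that response shape is an assumption made for illustration, not the Query API's actual contract, so adapt the field names to what your client really returns.

```python
from typing import Callable, Iterator, List, Optional

# Assumed page shape: {"rows": [...], "done": bool, "next": Optional[str]}
Page = dict


def fetch_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every row, following pages until the result reports it is done."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["rows"]
        if page["done"]:
            return
        cursor = page["next"]


def make_fake_fetch(pages: List[List[dict]]) -> Callable[[Optional[str]], Page]:
    """In-memory stand-in for a real client, used only to exercise the loop."""
    def fetch(cursor: Optional[str]) -> Page:
        index = 0 if cursor is None else int(cursor)
        last = index == len(pages) - 1
        return {"rows": pages[index], "done": last, "next": None if last else str(index + 1)}
    return fetch


rows = list(fetch_all(make_fake_fetch([[{"id": 1}, {"id": 2}], [{"id": 3}]])))
print(len(rows))  # 3
```

The same loop handles a one-page result without a special case, which is the reason to write it on day one rather than after the result grows.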

Patterns to refuse

A live query re-run on every consumer for a number that should be a CI.
Streaming reached for because "real-time is better" before checking it can't join the profile or hold a lifetime horizon.
A query or CI against a DLO when a DMO expresses the meaning.
A direct join from a source object to UnifiedIndividual__dlm instead of the IndividualIdentityLink__dlm traversal.
Reading page one of a Query API result and stopping: the worst kind of wrong, because the numbers look plausible.
A CI cadence set to "as fresh as possible" instead of to the decision it feeds — an over-eager recompute is a recurring bill.
"We'll figure out the grain later." The grain is the metric; get it wrong and every consumer inherits the wrong number, silently.

The agent-readiness check

The most modern reader of your query layer is an agent — Agentforce or an external LLM. It does not recompute a metric; it retrieves the Calculated Insight you defined, inheriting its grain, its freshness, and its correctness exactly (principle 10). A wrong CI doesn't make an agent hesitate; it makes the agent confidently wrong, which is worse than an agent that can't answer at all. Before you let an agent ground on a query result, confirm:

The shared metrics the agent will quote exist as Calculated Insights it retrieves — not values it (or a tool) recomputes live.
Each of those CIs is defined at the grain the metric is actually true at.
Each CI's refresh cadence matches the freshness the agent's answer implies.
Every traversal the metric needs runs through a modeled relationship (source → unified via IndividualIdentityLink__dlm).
The number reconciles — you're counting unified individuals or source rows on purpose, not by accident.

The honest test underneath all five: would a human analyst quote this number without a caveat? If not, neither should an agent. "Agent-ready query" is not a feature you switch on — it's the state you're already in when the CIs are correct, fresh, and defined at the right grain.
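The reconciliation item in the list above is the easiest one to get wrong by accident, because both counts are legitimate answers to different questions. A minimal sketch with invented identity-link rows (the IDs are made up for illustration) shows how far apart they can sit:

```python
from typing import List, Tuple

# Hypothetical identity-link rows: (source_record_id, unified_record_id).
link_rows: List[Tuple[str, str]] = [
    ("src-001", "uni-A"),
    ("src-002", "uni-A"),
    ("src-003", "uni-B"),
    ("src-004", "uni-B"),
    ("src-005", "uni-C"),
]


def count_source_records(rows: List[Tuple[str, str]]) -> int:
    """Distinct source records: one per row that arrived from a source system."""
    return len({source_id for source_id, _ in rows})


def count_unified_individuals(rows: List[Tuple[str, str]]) -> int:
    """Distinct unified profiles: one per person after identity resolution."""
    return len({unified_id for _, unified_id in rows})


print(count_source_records(link_rows))       # 5
print(count_unified_individuals(link_rows))  # 3
```

Either number can be the right one; the check is that the metric's definition says which, and the query counts that one.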

The pre-ship checklist

The compute-vs-retrieve call was made on purpose — CI if retrieved-many or historical, Query API if one-off or live.
CI: batch vs. streaming chosen by the metric (history/profile join vs. windowed events), not by the aspiration.
CI: the grain is declared and correct — the dimensions express exactly the grain the metric is true at.
CI: the refresh cadence is set to the freshest decision it feeds, and written down next to it.
The query reads DMOs, not DLOs; identifiers are ssot__-namespaced and unquoted except where they must be quoted.
Every source → unified traversal runs through IndividualIdentityLink__dlm, and every join corresponds to a modeled relationship.
Query API: the caller handles the full retrieval loop — pages to done or polls the queryId to the end — even if the result fits one page today.
The number reconciles against the object you meant to count (unified individuals vs. source rows).
The agent-readiness check above still passes.

When all of them pass, the query or CI is ready to ship.

Query & Insights gotchas — where the SQL instinct misleads, the production version
Data Cloud SQL — the dialect these conventions apply to
Bridging from Marketing Cloud SQL — the crosswalk for the MC SQL veteran
Calculated Insights — the "compute once, retrieve many" side of the central decision
The Query API — the "query live every time" side of the central decision
Consuming query results — who retrieves a CI, and why they all read the same one
Debugging query results — when a number comes back wrong, blank, or short
Data 360 principles from production — the meta-rules above these specifics (6, 7, 10, 11)

If you spot a rule missing — or one of these rules being violated in our public work — write to hello@wearecleon.com. We add it, or we fix it and we say so.