Debugging ingestion: when the data didn't land the way you expected

A Data Stream is feeding wrong data into Data 360 (formerly Data Cloud) and nothing told you. Maybe a count that should have grown is flat. Maybe it doubled overnight. Maybe a customer who unsubscribed last week is still in an activation, or a record you know exists never showed up. The stream looks healthy — the last run is green, the connector is connected, the schedule says daily — and ingestion almost never throws for landing wrong data, only for failing to run at all. So the diagnostic is the same shape every time: confirm the run happened before you ask what it did, then read the DLO's row counts before you blame anything downstream. The bug is almost always in one of five places, and they fail in a fixed order.

The thing to internalize before you start: everything here is read at the DLO (__dll) — the raw landing object the stream produces. This page debugs whether the right rows landed, not whether they were mapped to the right meaning. A value that landed correctly but reads wrong in a segment is a mapping bug one layer down (mapping, debugging mapping failures); a number that's wrong only in a query is a query bug (debugging query results). Ingestion owns one question: did the rows the source holds end up in the DLO, the right number of them, fresh? Confirm that first, because every layer above assumes it.

The steps

[ STEP 1 — Did the refresh even run? (schedule + connector auth) ]
        ↓
[ STEP 2 — Are the row counts what you expected? (full vs upsert) ]
        ↓
[ STEP 3 — Records missing or duplicated? (primary key / dedup on upsert) ]
        ↓
[ STEP 4 — Records that should be gone are still there? (upsert-no-delete) ]
        ↓
[ STEP 5 — Mapping/validation failures at the DLO? ]
        ↓
[ STEP 6 — Verify a refresh actually landed (last-run + counts before/after) ]

Walk down in order. A wrong count at Step 2 is meaningless if the refresh never ran (Step 1), and a "missing record" at Step 3 is a different bug from a "stale record that won't leave" at Step 4 — opposite fixes, so naming which one you have comes before touching the stream. Steps 1 through 5 each own a distinct failure; Step 6 is the verification you run after any fix, because in ingestion a green status is not proof the data is right.

Step 1 — Did the refresh even run?

Before anything about counts, prove the run happened. The most common "ingestion bug" isn't bad data — it's no new data, a stream that silently stopped refreshing while every downstream consumer kept reading the last good load as if it were current. A stale DLO looks exactly like a fresh one; nothing in a segment says "this is three days old."

The check — open the stream and read its last-run status and timestamp. When did it last complete, and did it complete or fail? A run that's failing on a schedule, or a last-run timestamp far older than the cadence promises, is the whole bug — there's no new data because nothing landed. Then check the two things that stop a run: the schedule (is it actually enabled and set to the cadence you think, or paused?) and the connector authentication (has the credential, token, or key expired?). An expired connector credential is the classic silent stop: the stream was working, the token lapsed, runs began failing, and downstream it presents identically to "the source had no new records."
The symptom — a count that's flat when you expected growth, data that's correct but stale (right for a moment in the past), or a last-run timestamp that hasn't advanced. For a file-storage source, "no new data" can also mean the upstream export stopped dropping files into the bucket — the stream ran fine and found nothing, which is not the same failure but presents the same way (see connectors).
The fix — re-authenticate the connector if the credential lapsed, re-enable or correct the schedule if it was paused or set wrong, and for a file source confirm the upstream export is still landing files. Then force a refresh and go to Step 6 to confirm it landed. Don't debug counts until you've confirmed a run actually completed against current source data — a wrong number from a run that never happened is not the bug you think it is.
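
The last-run timestamp can be cross-checked from the data side by asking the DLO for its newest row. This is a sketch under one loud assumption: `YourTimestampField__c` is a hypothetical placeholder for a source-supplied timestamp column your stream lands (a last-modified or event-time field). If the source sends no such column, this check is not available.

```sql
-- Newest row time on the DLO, next to the row count.
-- YourTimestampField__c is a hypothetical placeholder; substitute a timestamp
-- column your source actually sends.
SELECT
  MAX(YourTimestampField__c) AS newest_row_time,
  COUNT(*)                   AS dlo_row_count
FROM YourSource__dll;
```

A `newest_row_time` several cadences old is consistent with a stalled stream, but also with a source that simply had nothing new; the last-run status is what tells those two apart.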

If the run completed against current source data, the data is fresh and the bug is in what it landed. Go on.

Step 2 — Are the row counts what you expected?

The run completed, but the count is wrong — too high, too low, or it moved by an amount that makes no sense. Before chasing individual records, read the DLO's row count and compare it to what the source actually holds, because the most common count bug is a full-refresh-versus-upsert confusion: a mismatch between what you think the run did to the dataset and what the mode actually does. A full refresh replaces the set; an upsert merges into it (refresh modes). Confuse the two and the count surprises you.

The check — first, count what's in the DLO now. Then ask which mode the stream is on and whether the count matches that mode's behavior. A full refresh should leave the DLO holding exactly what the source holds this run — no more, no less. An upsert should leave it holding everything ever sent that wasn't superseded — which grows over time and can be larger than the current source if the source has since shrunk. The simplest count on the landing object:

-- How many rows are in the DLO right now? Run after a refresh and compare to what
-- the source actually holds this run. For a FULL REFRESH these should match the
-- source; for an UPSERT the DLO can exceed the source (it accumulates, see Step 4).
SELECT COUNT(*) AS dlo_row_count
FROM YourSource__dll;

The symptom — two shapes, opposite causes. A count lower than the running total you expected, after what you assumed was an additive load, usually means the stream is on full refresh and replaced the set when you expected it to add to it: the previous rows weren't kept, they were overwritten, and the DLO now holds only what the source delivered this run. A count higher than the current source usually means the stream is on upsert and has been accumulating rows the source no longer holds (Step 4's delete problem), or the same logical record is landing under different keys and duplicating (Step 3). Either way, the count is "wrong" only relative to the mode you assumed; against the mode the stream is actually on, it's behaving exactly as designed.
The fix — reconcile your expectation with the configured mode before you change anything. If you expected incremental accumulation and got replacement, the stream is on full refresh — decide whether that's actually wrong (full refresh is often the safer mode; see Step 4 and refresh modes) or whether you need upsert with a real key. If you expected the DLO to mirror the source and it's larger, the stream is on upsert and either retaining deleted records (Step 4) or duplicating on a bad key (Step 3) — go name which. The count is a signpost to the next step, not the bug itself.
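
Whether a too-high count is repeated key values or whole surplus records can often be read from one query before going record by record. A minimal sketch, reusing the placeholder names from the count query above and assuming the key field is populated on every row (`COUNT(DISTINCT ...)` ignores nulls):

```sql
-- Total rows vs. distinct key values on the DLO.
-- total_rows > distinct_keys: at least one key value has several rows (Step 3).
-- total_rows = distinct_keys, yet more than the source holds: every surplus row
--   has a key value of its own, so it is either a record the source has since
--   deleted (Step 4) or the same logical record landing under a changing key (Step 3).
SELECT
  COUNT(*)                        AS total_rows,
  COUNT(DISTINCT YourKeyField__c) AS distinct_keys
FROM YourSource__dll;
```

Both conditions can hold at once, so treat the result as a pointer to the next step, not a verdict.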

Once the count's direction tells you whether you're looking at a missing-records problem or a too-many-records problem, go to the matching step. Go on.

Step 3 — Records missing or duplicated?

The count pointed at individual records: specific rows that should be in the DLO aren't, or the same logical record appears more than once. On an upsert stream both symptoms trace to the same root — the primary key. The key is what tells an upsert whether an incoming record is the same one it already has (update it) or a new one (insert it). A key that isn't genuinely unique, or isn't genuinely stable, breaks that decision silently.

The check — find out whether the key is doing its job by counting rows per key value on the DLO. A primary key should be unique: exactly one row per key. More than one means the key isn't unique — two real records share a value, or the same record landed twice under a key the stream treated as new each time.

-- Is the upsert key actually unique on the DLO? Group by the key and find any value
-- that appears more than once. Zero rows returned = the key is unique (healthy).
-- Any rows = duplication: a non-unique key, or the same record re-inserted as "new".
SELECT
  YourKeyField__c        AS key_value,
  COUNT(*)               AS rows_for_this_key
FROM YourSource__dll
GROUP BY YourKeyField__c
HAVING COUNT(*) > 1
ORDER BY rows_for_this_key DESC;

The symptom — duplicates: the same person or record appears several times in the DLO, each as a separate row, because a key that isn't stable made each arrival look new (you keyed on a field that changes, so the upsert inserted instead of updating). Or silent loss: a record you know was sent is missing, because a non-unique key let one incoming record overwrite a different real record that happened to share the key value — one upsert quietly replaced the other and no error fired (refresh modes). Duplication inflates the count; the overwrite kind of loss can leave the count looking fine while a specific record is simply gone.
The fix — the durable repair is at the key, not at the rule that reads it. If the key isn't unique, you keyed on the wrong field — find a field (or composite) that is genuinely one-per-record and re-key the stream (relationships & keys). If the key isn't stable — it changes between runs for the same record — the same fix applies: choose a field that's constant for the life of the record. Patching the symptom (deduping downstream, ignoring the extra rows) just moves a known-wrong ingest into every consumer. The key is the contract; if it's broken, fix the contract.
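
Before re-keying, a candidate key can be checked against the rows that have already landed. The sketch below tests a two-field composite; `CandidateFieldA__c` and `CandidateFieldB__c` are hypothetical placeholders for whichever fields you are considering, not fields the platform defines.

```sql
-- Does the candidate composite key identify exactly one landed row?
-- Zero rows returned = no two rows share this pair of values.
-- Any rows returned = rows sharing the pair; check whether they are genuinely
-- different records (candidate is not unique) or existing duplicates of one record.
SELECT
  CandidateFieldA__c,
  CandidateFieldB__c,
  COUNT(*) AS rows_for_this_pair
FROM YourSource__dll
GROUP BY CandidateFieldA__c, CandidateFieldB__c
HAVING COUNT(*) > 1
ORDER BY rows_for_this_pair DESC;
```

Uniqueness on today's rows says nothing about stability, so a candidate that passes still has to be a value the source never changes for the life of a record. And if the DLO already holds duplicates of one record, a good key will collide on exactly those rows, so look at what collides before rejecting the candidate.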

If the keys are unique and stable and records are still wrong, the problem isn't duplication or loss on insert — it's that records which should be gone are lingering. That's a different failure. Go on.

Step 4 — Records that should be gone are still there?

A specific kind of "wrong count, too high" deserves its own step because it's the central correctness trap of ingestion and it confuses everyone the first time: records deleted from the source are still in the DLO, looking exactly as valid as live ones. If the stream is on upsert, this is not a bug in your data — it's the documented behavior, and the fix is a strategy, not a patch.

The check — confirm the symptom is retention of deletes, not duplication. Pick a record you know was removed from the source — an account that was closed, a contact that was deleted — and look for it in the DLO. If it's still there, and the stream is on upsert, you've reproduced the behavior. The general shape: the DLO holds more distinct records than the source currently does, and the surplus is records the source no longer has.

-- Is a known-deleted source record still living in the DLO? Replace the literal with
-- a record you KNOW was removed at the source. On an UPSERT stream it will still be
-- here — upsert never removes a record just because the source stopped sending it.
SELECT *
FROM YourSource__dll
WHERE YourKeyField__c = 'KEY_OF_A_RECORD_DELETED_AT_SOURCE';

The symptom — a customer who unsubscribed or was deleted is still segmented and still gets contacted; "active" counts that never go down even though the source loses records; a DLO that only ever grows. Because everything downstream reads the DLO, the stale record flows into identity resolution, segments, and activations, and someone who should have been removed is reached — with no error anywhere, because the run that should have removed them simply never touched them. An upsert touches a record only when it's in the run, and a deleted record is, by definition, the one case that stops arriving.
The fix — this is the half of the refresh-mode decision that bites latest: upsert does not remove deleted source records unless you explicitly send deletes (refresh modes). If the source can delete records, you need a deliberate delete strategy — a way for the deletion to reach Data 360 as an explicit delete signal — or you switch the stream to full refresh, which captures deletes by absence (a record gone from the source this run is simply not in the new load, so it's gone from the DLO too). Choose: send deletes, or full-refresh. There is no third option where upsert quietly cleans up after itself.
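
If you can land a trustworthy list of the keys the source holds right now, the surplus can be listed rather than spot-checked. This sketch assumes such a list exists as a second DLO, here called `YourSourceSnapshot__dll`. That object is hypothetical (for example, a small full-refresh stream carrying only current keys), and the query also assumes your query tool allows a join between two DLOs.

```sql
-- Keys present in the upsert DLO but absent from the current-source snapshot.
-- YourSourceSnapshot__dll is a hypothetical object, not something the platform provides.
-- Each row returned is a candidate retained delete.
SELECT d.YourKeyField__c AS key_missing_from_source
FROM YourSource__dll AS d
LEFT JOIN YourSourceSnapshot__dll AS s
  ON d.YourKeyField__c = s.YourKeyField__c
WHERE s.YourKeyField__c IS NULL
ORDER BY d.YourKeyField__c;
```

An empty result means every key in the DLO still exists at the source. A non-empty result sizes the delete problem, but it is only as current as the snapshot it is compared against.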

If records land, don't duplicate, and leave when they should, the rows are correct — but a row can land and still be partly rejected if its values don't match the DLO's expected shape. Go on.

Step 5 — Mapping/validation failures at the DLO?

The rows are right in number and identity, but a field on them is empty or wrong for every record, and it was empty on landing, before any modeling. This is the ingestion-side validation failure: the incoming data didn't conform to the DLO's field definitions, so a field was dropped, blanked, or coerced. It is distinct from DLO→DMO mapping, which is the next layer and belongs to Data Architecture. Here the question is narrower: did the value survive the landing into the DLO at all?

The check — read the suspect field straight off the DLO, at rest, before anything downstream touches it. Is it null or wrong for every row, or only some? A field that's blank across the board points at the field never arriving (a name mismatch between the payload and the registered schema) or a type the DLO rejected; blank for some rows points at source data that fails validation only for those records (a malformed date, a number where text was expected). Read it as it landed:

-- Is the field populated on the DLO, at rest? Count how many rows have it null.
-- A high null count for a field the source always sends points at a landing/validation
-- failure (schema/name mismatch or a rejected type), not a downstream mapping bug.
SELECT
  COUNT(*)                                              AS total_rows,
  SUM(CASE WHEN YourField__c IS NULL THEN 1 ELSE 0 END) AS null_rows
FROM YourSource__dll;

The symptom — for a connector source, a field that's consistently empty on the DLO usually means a schema/type mismatch the connector couldn't reconcile; for an Ingestion API source, it usually means the payload didn't match the registered schema — a field named differently than the schema declares, or typed differently than registered, is rejected or silently dropped rather than coerced (the Ingestion API). An Engagement stream missing its event-time field is a specific case worth naming: without the timestamp the records aren't a usable time series, and the failure shows up downstream as time-windowed logic with nothing to stand on (data streams).
The fix — fix it at the boundary the data crosses. For an Ingestion API source, reconcile the payload to the registered schema (or re-register the schema if the source's shape genuinely changed) so field names and types match exactly — the schema is the contract and the data must conform to it, not the other way around. For a connector, correct the field/type mapping the connector offers at ingestion. What you do not do is paper over a landing failure in the DMO mapping downstream — a field that never landed can't be mapped, and reshaping it later just hides where it actually broke.
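
When the null count is partial, the next question is which records failed. A small sketch that lists a sample of them, using the same placeholder names as the null-count query above; the `LIMIT` clause is an assumption about the SQL dialect your query tool accepts.

```sql
-- A sample of keys whose field landed empty. Take these back to the source:
-- if the source holds a value for them, the value was lost at landing.
SELECT YourKeyField__c AS key_with_null_field
FROM YourSource__dll
WHERE YourField__c IS NULL
ORDER BY YourKeyField__c
LIMIT 20;
```

If blanks in your stream land as empty strings rather than nulls, the predicate needs adjusting to match.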

Once the right rows land with their fields intact, the data is correct at the DLO. The last step isn't a new failure — it's how you prove any fix above actually took. Go on.

Step 6 — Verify a refresh actually landed

You changed something — re-authenticated a connector, re-keyed a stream, switched to full refresh, fixed a schema — and forced a run. Now confirm the fix took, the same way you'd confirm any ingestion claim: not by reading the status badge, but by reading the row counts on the DLO before and after, against the run's last-completed timestamp. A green "refresh complete" tells you the engine ran; it says nothing about whether the rows are now right.

The check — capture the DLO count before the run, force the refresh, wait for the run to report complete, then re-count and compare. Predict the direction first: a switch to full refresh that drops stranded deletes should lower the count to match the source; a re-key that stops duplication should lower it toward one row per record; a re-authenticated connector that resumes a stalled stream should raise it to reflect data that wasn't landing. Then read the last-run timestamp to confirm the count you're reading is from the run you just forced, not the stale one.

-- BEFORE and AFTER a fix: the DLO row count for the stream you touched. Capture this
-- before forcing the refresh, then re-run after the run reports complete. The count
-- moving in the direction you predicted is the verification — not the green status.
SELECT COUNT(*) AS dlo_row_count
FROM YourSource__dll;

The symptom that you're reading too early — you re-count and the number hasn't moved, or moved partway. A refresh isn't instantaneous; the DLO you query right after forcing a run may still be landing, so an "unchanged" count can simply be mid-flight. Equally, a status that reads complete while the count didn't move at all means the run didn't do what you changed it to do — the schedule fired but the credential is still bad, or you re-keyed the wrong field — and you return to the step that owns that fix.
The fix — treat the DLO count and the last-run timestamp together as the source of truth, and re-confirm on the specific records from the earlier step, not just the total. Re-run the Step 3 per-key count: are the duplicates gone? Re-run the Step 4 lookup: did the known-deleted record finally leave (or is it gone because full refresh dropped it by absence)? The total moving in the predicted direction is necessary; the specific records being right is sufficient. And remember the count moves every downstream number that reads this DLO — a refresh that finally lands deletes will shift segment sizes and any report keyed on it, so tell whoever reads those numbers before they move under them (principle 11).
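
The before and after reads can be folded into one row, so the total and the record-level checks are captured at the same moment. A sketch only, reusing the placeholder names and the known-deleted literal from the earlier steps; `YourField__c` matters only if the fix was a Step 5 schema repair.

```sql
-- One-row verification snapshot. Run before the forced refresh and again after
-- the run reports complete, then compare the two rows column by column.
SELECT
  COUNT(*)                                              AS total_rows,
  COUNT(DISTINCT YourKeyField__c)                       AS distinct_keys,
  SUM(CASE WHEN YourField__c IS NULL THEN 1 ELSE 0 END) AS null_rows,
  SUM(CASE WHEN YourKeyField__c = 'KEY_OF_A_RECORD_DELETED_AT_SOURCE'
           THEN 1 ELSE 0 END)                           AS known_deleted_rows
FROM YourSource__dll;
```

After a re-key, `total_rows` should converge on `distinct_keys`; after a delete fix, `known_deleted_rows` should read 0; after a schema repair, `null_rows` should fall.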

A diagnostic you can run

When the report is "ingestion looks wrong" and you don't yet know which failure you have, the fastest triage is two reads, by eye — no fix, no guessing at the mode. They answer the two questions that split every ingestion bug into its branch.

When did this stream last successfully run, and against what? Read the last-run status and timestamp on the stream. A failing run, or a timestamp older than the cadence promises, is the bug — it's Step 1, a refresh that stopped (paused schedule, expired connector auth, an upstream export that stopped dropping files). No new data looks exactly like fresh data, so this is always the first read.
Does the DLO row count match the mode the stream is on? Count the DLO and compare to what the source holds. Lower than the total you expected after an assumed-additive load is full-refresh replacement (Step 2); higher than the source is upsert accumulation, either retained deletes (Step 4) or duplication on a bad key (Step 3). The direction of the gap routes you to the exact step.

The direction you find dictates everything after it: a stream that didn't run is Step 1, a count too low is replacement you didn't expect, a count too high is deletes that never left or a key that duplicates. A DLO holding a customer who unsubscribed last week, on a stream you confirm is on upsert, is the delete trap localized — no failed run to hunt, just a delete strategy you don't yet have.

Common symptoms mapped to steps

Symptom	Likely cause	Where to look
Count flat when you expected growth	Refresh stopped — paused schedule or expired auth	Last-run status and timestamp (Step 1)
Data correct but stale (yesterday's)	Stream stopped refreshing; serving the last load	Last-run timestamp vs. cadence (Step 1)
Count lower than you expected after a load	Full refresh replaced the set you expected to grow	Configured mode vs. your expectation (Step 2)
Count higher than the current source	Upsert accumulating — retained deletes or duplicates	Per-key count (Step 3), known-deleted lookup (Step 4)
Same record appears several times	Non-stable key — upsert inserts instead of updating	Rows per key value on the DLO (Step 3)
A record you sent is simply missing	Non-unique key — one upsert overwrote another	Rows per key value on the DLO (Step 3)
Unsubscribed/deleted record still present	Upsert does not delete without an explicit signal	Known-deleted record lookup (Step 4)
One field blank for every row, at landing	Schema/type mismatch or payload off the schema	The field, null-counted on the DLO (Step 5)
Fixed it, but the count didn't move	Reading too early, or the run didn't apply the fix	Re-count vs. last-run timestamp (Step 6)

Data streams — the unit of ingestion every probe on this page reads: source → DLO, the category, and the schedule whose status you check first
Refresh modes — full refresh vs upsert and the deletes behavior behind Steps 2, 3, and 4
Ingestion gotchas — the silent ingestion failures this page diagnoses, in production form
Debugging query results — when the wrong number is a query bug, not an ingestion one (the layer above this)
