Every data platform makes an implicit claim: that the data it ingests is clean enough, fresh enough, and coherent enough to produce useful intelligence. Most platforms make this claim without examining it. They ingest whatever their connectors can reach, batch-process it on a schedule that was set when the system was configured and never revisited, and deliver outputs to analytics and modelling layers that quietly degrade when the input quality changes.
The problem is that raw customer data is none of those things. It is messy, stale, fragmented across sources that were never designed to be unified, and arrives in schemas that differ not just between organisations but between systems within the same organisation. A single customer's behaviour might be partially visible in an Edge SDK feed, a web analytics system, a payments processor, a CRM, and a network event log — each with different identifiers, different event granularities, different latency characteristics, and different privacy requirements. No individual source gives a complete picture. Unifying them naively produces a picture that is wrong in ways that are difficult to detect.
Deep Signal is Intent HQ's response to this problem: a unified signal middleware that treats data quality, freshness, privacy, and coherence as first-class engineering requirements rather than operational aspirations.
The architecture is built around Apache Kafka as the event backbone. Kafka provides a durable, partitioned, replayable commit log that scales horizontally, with exactly-once processing semantics available through its transactional APIs, and that log is the foundation on which everything else rests. Every event, from every source, arrives at the same bus and is processed in the same pipeline. There is no separate fast path for Edge SDK signals and slow path for CRM batches. Everything flows through the same governed infrastructure, which means the latency and quality guarantees apply uniformly.
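The "same shape on the same bus" principle can be sketched as a normalisation step at the producer edge. This is a minimal illustration only; the envelope, its field names, and the example source labels are assumptions for the sketch, not Deep Signal's actual schema.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class Envelope:
    """Common shape every producer writes to the shared event bus
    (illustrative only; not the actual Deep Signal schema)."""
    source: str          # e.g. "edge_sdk", "web", "payments", "crm", "network"
    event_type: str
    event_time_ms: int   # when the event happened at the source
    ingest_time_ms: int  # when it reached the bus
    payload: dict

def normalise(source: str, raw: dict, now_ms: Optional[int] = None) -> Envelope:
    """Wrap a source-specific record so that real-time SDK signals and
    batch CRM rows travel the same pipeline in the same format."""
    now = now_ms if now_ms is not None else int(time.time() * 1000)
    return Envelope(
        source=source,
        event_type=str(raw.get("type", "unknown")),
        event_time_ms=int(raw.get("ts", now)),   # fall back to ingest time
        ingest_time_ms=now,
        payload=raw,
    )

# A real-time SDK signal and a CRM batch row become the same envelope:
sdk_event = normalise("edge_sdk", {"type": "app_open", "ts": 1700000000000})
crm_event = normalise("crm", {"type": "plan_change"})
print(sdk_event.source, crm_event.source)  # → edge_sdk crm
```

Because every consumer downstream sees one schema, quality checks and latency accounting can be written once rather than per source.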
The pipeline from raw event to enriched feature follows a standard Medallion architecture: Bronze (raw, unvalidated), Silver (cleaned, validated, deduplicated), Gold (enriched, modelled, ready for inference). Each transition applies specific quality checks. Schema validation ensures that events conform to expected formats and flags anomalies for investigation rather than silently propagating them. Deduplication removes the duplicate events that are endemic in high-velocity feeds: a single telco network event may be logged by multiple systems, and the Deep Signal pipeline ensures that each real-world event is counted once. Noise injection applies differential privacy protections, so that outputs carry a mathematically bounded privacy loss before they reach modelling or analytics layers.
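The two quality gates can be sketched in a few lines. This is a toy illustration under assumed field names (`event_id`, `user_id`, `event_type`), not the production pipeline: the Bronze-to-Silver step validates and deduplicates, and the Silver-to-Gold step releases an aggregate with Laplace noise, the standard mechanism for an epsilon-differentially-private counting query of sensitivity 1.

```python
import random

REQUIRED_FIELDS = {"event_id", "user_id", "event_type"}  # assumed schema

def to_silver(bronze_events):
    """Bronze -> Silver: drop malformed events, deduplicate on event_id."""
    seen, silver, rejected = set(), [], []
    for ev in bronze_events:
        if not REQUIRED_FIELDS <= ev.keys():
            rejected.append(ev)          # flagged for investigation
            continue
        if ev["event_id"] in seen:       # same real-world event, logged twice
            continue
        seen.add(ev["event_id"])
        silver.append(ev)
    return silver, rejected

def to_gold_count(silver, epsilon=1.0):
    """Silver -> Gold: release a count with Laplace noise of scale 1/epsilon
    (difference of two Exp(epsilon) draws is Laplace(0, 1/epsilon))."""
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return len(silver) + noise

bronze = [
    {"event_id": "a", "user_id": "u1", "event_type": "call"},
    {"event_id": "a", "user_id": "u1", "event_type": "call"},  # duplicate log
    {"event_id": "b", "user_id": "u2"},                        # missing field
    {"event_id": "c", "user_id": "u2", "event_type": "data"},
]
silver, rejected = to_silver(bronze)
print(len(silver), len(rejected))  # → 2 1
```

Real deployments would validate against a registered schema and calibrate noise per query, but the shape of each transition is the same.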
The 120-second freshness guarantee — from event generation to enriched feature availability — is the operational expression of a design choice: that intelligence is only useful if it describes what is happening now, not what happened this morning. Most enterprise data pipelines operate on cycle times of hours to days, calibrated to the assumption that customer behaviour is slow-moving and that overnight batch processing is sufficient. For intent detection, this assumption is wrong. A customer's micro-intention state — the specific configuration of signals that indicates they are ready to act — may persist for hours or may dissipate in minutes. A system that processes signals in four-hour batches will identify that state after it has already passed.
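A freshness guarantee of this kind is enforceable only if it is measured per signal. A minimal sketch, assuming the envelope carries both an event timestamp and a feature-availability timestamp (the names and budget constant here are illustrative):

```python
FRESHNESS_BUDGET_S = 120  # event generation -> enriched feature available

def freshness_lag_s(event_time_ms: int, feature_ready_ms: int) -> float:
    """End-to-end lag in seconds for one signal."""
    return (feature_ready_ms - event_time_ms) / 1000.0

def within_budget(event_time_ms: int, feature_ready_ms: int) -> bool:
    return freshness_lag_s(event_time_ms, feature_ready_ms) <= FRESHNESS_BUDGET_S

# A feature ready 45 s after the event meets the guarantee.
print(within_budget(1_700_000_000_000, 1_700_000_045_000))  # → True
# A 4-hour batch cycle misses it by two orders of magnitude.
print(within_budget(1_700_000_000_000, 1_700_014_400_000))  # → False
```

Tracking this lag as a distribution, rather than a single average, is what turns "fresh enough" from an aspiration into an alertable service-level objective.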
Identity stitching is the hardest problem Deep Signal solves, and the one that receives the least attention in typical discussions of data pipelines. A single customer leaves traces across multiple systems: an Edge SDK ID on their mobile device, a cookie identifier on the web, a customer number in the CRM, an MSISDN in the network log. These are four different identifiers for the same person, and they rarely appear together in any single event. Stitching them into a unified profile requires probabilistic matching: using co-occurrence patterns, temporal correlations, and behavioural similarity to infer that the four identifiers belong to the same individual, without access to a deterministic linking key.
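One common way to turn pairwise match evidence into unified profiles is to treat identifiers as graph nodes and merge the connected components of high-confidence links, typically with a union-find structure. The sketch below assumes the matching models have already produced pairwise scores; the identifiers, scores, and threshold are invented for illustration, and it says nothing about how Deep Signal actually scores pairs.

```python
class UnionFind:
    """Merge identifiers into connected components."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def stitch(match_evidence, threshold=0.8):
    """match_evidence: {(id_a, id_b): score} from co-occurrence and
    temporal-correlation models. Pairs above threshold are merged."""
    uf = UnionFind()
    for (a, b), score in match_evidence.items():
        if score >= threshold:
            uf.union(a, b)
    profiles = {}
    for ident in {i for pair in match_evidence for i in pair}:
        profiles.setdefault(uf.find(ident), set()).add(ident)
    return list(profiles.values())

evidence = {  # illustrative identifiers and scores
    ("sdk:71f", "cookie:9ab"): 0.93,       # overlapping session windows
    ("cookie:9ab", "crm:10042"): 0.88,     # login co-occurrence
    ("crm:10042", "msisdn:447700900123"): 0.97,
    ("sdk:71f", "cookie:zz1"): 0.31,       # weak evidence: not merged
}
profiles = stitch(evidence)
print(len(profiles))  # → 2
```

Note the transitivity: the SDK ID and the MSISDN never co-occur directly, yet they land in one profile because each is strongly linked to an intermediate identifier. That is exactly the cross-system inference the paragraph describes, and also why the threshold matters: one bad merge contaminates an entire profile.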
Intent HQ's identity resolution operates across all source types simultaneously. A customer who switches from their app to their mobile browser and back does not fragment into separate customers in the intelligence system. Their unified profile reflects their full cross-channel behaviour, which is the only basis on which accurate intent detection is possible.
The Unity Catalog layer provides the governance infrastructure that makes all of this auditable and controllable. Every data asset has a lineage record: where it came from, how it was transformed, what quality checks it passed, who has access to it, and when it was last refreshed. Any change to the pipeline — a new source connector, a modified transformation, a change in privacy parameters — is recorded and can be rolled back instantly. In a regulated environment where data provenance is a compliance obligation, this is not a nice-to-have. It is a requirement.
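The record-everything, roll-back-anything pattern can be illustrated with an append-only history per asset, where rollback pins an earlier version rather than deleting anything. This is a toy model of the concept only; it is not Unity Catalog's API, and every field name here is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LineageEntry:
    """One version of how an asset was produced (illustrative fields)."""
    version: int
    source: str
    transformation: str
    checks_passed: List[str]
    privacy_epsilon: float

@dataclass
class AssetLineage:
    """Append-only history; rollback pins a prior version, destroying nothing."""
    asset: str
    history: List[LineageEntry] = field(default_factory=list)
    pinned: int = -1  # -1 means "latest"

    def record(self, **kwargs):
        self.history.append(LineageEntry(version=len(self.history), **kwargs))

    def rollback(self, version: int):
        self.pinned = version  # the newer entry stays in history for audit

    def current(self) -> LineageEntry:
        return self.history[self.pinned if self.pinned >= 0 else -1]

lin = AssetLineage("gold.customer_features")
lin.record(source="silver.events", transformation="enrich_v1",
           checks_passed=["schema", "dedup"], privacy_epsilon=1.0)
lin.record(source="silver.events", transformation="enrich_v2",
           checks_passed=["schema", "dedup", "drift"], privacy_epsilon=0.5)
lin.rollback(0)  # revert a bad parameter change
print(lin.current().transformation)  # → enrich_v1
```

The audit trail survives the rollback: both versions remain queryable, which is what makes provenance demonstrable to a regulator rather than merely asserted.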
The analogy that best captures Deep Signal's role in the architecture is a water treatment plant. Raw data, like raw water, contains value — but also contaminants, variability, and hazards that make it unsuitable for direct consumption. The treatment plant does not create the water. It receives whatever arrives, applies a systematic purification process, and delivers something clean, safe, and consistent to every downstream system that depends on it. The quality of every model trained on Deep Signal features, every inference made by Intent AI, every activation executed by a Marketing Agent, depends on the quality of the signal that Deep Signal delivers. That is why the engineering investment in the pipeline is not support infrastructure. It is the foundation of everything.