Five Decisions That Determine Whether Your Adobe Experience Platform Implementation Scales or Plateaus

By Stephen Tettey | stephentettey.com


There is a version of Adobe Experience Platform that lives in demo environments and sales decks. The data flows cleanly. The identity graph resolves without friction. Audiences qualify in real time. Journeys fire on cue.

Then there is the version that lives in production.

After five years implementing AEP, Real-Time CDP, Journey Optimizer, Adobe Analytics, and Customer Journey Analytics across industries from financial services to travel and hospitality to loyalty programs, I have seen the gap between those two versions up close. The mistakes that create that gap are rarely technical in nature. They are architectural. They are strategic. And almost without exception, they are made before the first dataset is ever ingested.

This post is not a critique of any team or any client. Every one of these situations involves smart people making reasonable decisions with the information they had at the time. What follows is what I now know, and what I wish I had articulated more clearly at the start of each engagement.


Lesson 1: Schema design starts with use cases, not data sources

The instinct on most implementations is to start discovery by asking: what data do we have, and where does it live? It feels productive. It feels grounded. It is the wrong starting point.

The right question is: what do we need to be able to do? What are the use cases this platform needs to support, not just today, but over the next two to three years? Only after you have a clear answer to that question should you turn your attention to what data is needed to support those use cases, which naturally leads to understanding what data exists, what needs to be created, and what ingestion patterns make sense.

This distinction matters because the type of data you need determines the class of schema you should design. Behavioral and event-based data belongs in an ExperienceEvent schema. Profile attributes belong in an Individual Profile schema. Conflating the two, or designing schemas that try to serve both purposes, creates a data model that is brittle under the weight of new requirements.
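One way to make that sequence concrete is to start from the use cases, list the fields each one needs, and only then sort those fields by schema class and flag the ones no source provides yet. The sketch below is a minimal illustration under stated assumptions: the `Field` type, the `plan_schemas` helper, and the example field names are all hypothetical and are not part of any Adobe API.

```python
from dataclasses import dataclass

# Labels for the two schema classes discussed above.
EVENT = "XDM ExperienceEvent"
PROFILE = "XDM Individual Profile"


@dataclass(frozen=True)
class Field:
    """A field required by a use case (hypothetical planning structure)."""
    name: str
    is_time_series: bool   # True if each record describes something that happened at a point in time
    available_today: bool  # False if no current source provides this field


def plan_schemas(use_cases: dict[str, list[Field]]) -> dict[str, dict[str, list[str]]]:
    """Group every field required by any use case under its schema class,
    and list separately the fields that are needed but not yet available."""
    plan = {
        EVENT: {"fields": [], "gaps": []},
        PROFILE: {"fields": [], "gaps": []},
    }
    for fields in use_cases.values():
        for f in fields:
            bucket = plan[EVENT if f.is_time_series else PROFILE]
            if f.name not in bucket["fields"]:
                bucket["fields"].append(f.name)
                if not f.available_today:
                    bucket["gaps"].append(f.name)
    return plan


# Illustrative use cases, including one planned for a later phase.
use_cases = {
    "cart abandonment": [Field("cartAbandoned", True, True), Field("email", False, True)],
    "loyalty tier upsell": [Field("loyaltyTier", False, False), Field("email", False, True)],
}
plan = plan_schemas(use_cases)
```

Here `loyaltyTier` shows up as a gap on the profile side: the schema can reserve a place for it even though no source delivers it yet.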

What can go wrong: On one engagement, the team ran a thorough discovery process and understood the client's use cases well. But when it came to schema design, the focus shifted to MVP scope and available data sources. Schemas were built to solve for the immediate use cases using the data that was readily on hand.

When the time came to scale the architecture for additional use cases, the team discovered that simply adding attributes to existing schemas did not make sense for the new requirements. But creating entirely new schemas introduced redundancies and unforeseen conflicts with existing data flows. The rework was significant and avoidable. A more comprehensive schema design in the beginning, one that accounted for anticipated use cases even if the data to support them was not yet available, would have cost a fraction of the time.

The lesson: design your schema for where you are going, not just where you are.

💡
"Design your schema for where you are going, not just where you are."

Lesson 2: Identity resolution and identity management are not the same thing

These two terms are used interchangeably in most implementation conversations. They should not be. Understanding the difference, and deciding where resolution actually needs to happen, is one of the most consequential architectural decisions you will make.

Identity management is the process of maintaining and governing the identifiers associated with a customer across systems. Identity resolution is the process of determining that two or more identifiers belong to the same person. They are related but distinct, and resolution does not have to happen in the CDP.
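To make the definition of resolution concrete, here is a toy sketch that uses a union-find structure to decide whether two identifiers belong to the same person. It is an illustration of the concept only, not a description of how AEP's Identity Service or any upstream system works internally. The `IdentityGraph` class and the namespace and value pairs in the usage note are assumptions made for the example.

```python
class IdentityGraph:
    """Toy identity resolution: union-find over (namespace, value) pairs."""

    def __init__(self) -> None:
        # Maps each identity to its parent; a root is its own parent.
        self._parent: dict[tuple[str, str], tuple[str, str]] = {}

    def _find(self, identity: tuple[str, str]) -> tuple[str, str]:
        """Return the root identity of the cluster containing `identity`."""
        self._parent.setdefault(identity, identity)
        root = identity
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the walked path at the root.
        while self._parent[identity] != root:
            self._parent[identity], identity = root, self._parent[identity]
        return root

    def link(self, a: tuple[str, str], b: tuple[str, str]) -> None:
        """Record an observation that identities `a` and `b` co-occurred."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a: tuple[str, str], b: tuple[str, str]) -> bool:
        """True if `a` and `b` have been linked, directly or transitively."""
        return self._find(a) == self._find(b)
```

For example, linking `("ECID", "abc")` to `("Email", "a@example.com")`, and then that email to `("CRMID", "C-1")`, makes `same_person(("ECID", "abc"), ("CRMID", "C-1"))` return `True`. Identity management, by contrast, is everything around this structure: which namespaces exist, who may write to them, and how links are governed.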

Resolution can happen in an MDM system, a consent management platform, a CRM, the web data layer, or AEP itself. In many enterprise environments, it happens in a combination of all of the above. The question is not whether AEP can do it. The question is whether AEP is the right place to do it for each data source and use case in scope.

As a general principle: resolve identities in the CDP for data that is unique to the CDP. For everything else, understand where resolution already happens or should happen upstream, and design your identity namespace strategy accordingly.

What can go wrong: One client made a firm architectural decision that all identity resolution would happen outside of the CDP, for every use case and every data source. The intent was to maintain a single source of truth in their existing identity provider. The consequence was not immediately obvious.

Web behavioral data was ingested into AEP, then sent back out to the external identity provider for resolution, and then re-ingested into AEP with the resolved identity attached. The round trip introduced latency that made real-time use cases impossible. What was architected as a real-time personalization platform defaulted entirely to batch processing because the identity resolution loop could not keep pace with the speed the use cases required. The real-time capability they had purchased and built toward was effectively neutralized by a single upstream architectural constraint.

The lesson: where identity resolution happens is not a technical detail. It is a use case constraint. Understand it before you design anything else.

💡
"Where identity resolution happens is not a technical detail. It is a use case constraint."

Lesson 3: Data governance is a foundational pillar. It should not be ignored.

Every implementation team knows governance matters. Very few treat it as a foundational concern from day one. It tends to get scheduled for later, after the data is flowing, after the use cases are running, after there is something to govern. By then, the cost of retrofitting governance is high and the exposure is already real.

Governance in AEP is not a single decision. It is a set of interconnected decisions that affect nearly everything else: how data is labeled under the DULE framework and which policies enforce those labels, how datasets are designed and retained, how TTL is configured for both event and profile data, how dynamic datastreams route data to the right datasets, how PII is handled across destinations, and how compliance with GDPR, CCPA, and industry-specific regulations is enforced programmatically rather than manually.

All of these decisions interact. A TTL decision affects dataset design. A dataset design decision affects ingestion patterns. An ingestion pattern decision affects profile stitching. And governance that is not enforced in the platform is governance in name only.

What can go wrong: One client wanted to retain 13 months of event data in AEP to support specific analytical use cases. That requirement was honored in the implementation. What was not accounted for was the cumulative data volume that 13 months of event retention would generate against their license entitlement. The client hit their total data volume limit well ahead of projections.
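A back-of-the-envelope projection would have caught this before go-live. The sketch below assumes a constant monthly event volume and a fixed profile footprint, both simplifications. The function name and all figures are hypothetical, and how volume is actually measured against an entitlement depends on the contract.

```python
from typing import Optional


def first_month_over_limit(
    monthly_event_gb: float,
    profile_gb: float,
    entitlement_gb: float,
    retention_months: int,
) -> Optional[int]:
    """Return the first month in which stored volume exceeds the entitlement,
    or None if the retention window fills up without exceeding it.

    Stored volume grows by one month of events until the retention window
    is full, after which old events expire as fast as new ones arrive.
    """
    for month in range(1, retention_months + 1):
        stored_gb = profile_gb + monthly_event_gb * month
        if stored_gb > entitlement_gb:
            return month
    return None
```

With illustrative figures of 400 GB of events per month, a 500 GB profile footprint, and a 4,000 GB entitlement, 13 months of retention crosses the limit in month 9, while a 6-month window never does. The point is not the specific numbers but that the retention requirement and the entitlement need to be checked against each other during design.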

The same client was regularly deleting aged profiles from their CRM system as part of normal data hygiene. Those deletions were never communicated to AEP. The profiles continued to exist in the CDP long after they had been removed from the source of record, accumulating against the license and holding PII that should no longer have been retained.
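A periodic reconciliation between the source of record and the CDP is one way to surface this drift. The sketch below assumes both systems can export the set of customer IDs they currently hold. The function name and sample IDs are hypothetical, and the deletion request itself, which would go through whatever deletion workflow the organization has in place, is deliberately left out.

```python
def find_orphaned_profiles(crm_ids: set[str], cdp_ids: set[str]) -> list[str]:
    """Return IDs that still exist in the CDP but no longer exist in the CRM.

    These are candidates for deletion from the CDP, since the source of
    record has already removed them.
    """
    return sorted(cdp_ids - crm_ids)


# Illustrative exports: the CRM has already purged C-3.
crm_export = {"C-1", "C-2"}
cdp_export = {"C-1", "C-2", "C-3"}
orphans = find_orphaned_profiles(crm_export, cdp_export)
```

Running a check like this on a schedule turns a silent accumulation problem into a visible list that someone owns.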

And separately, DULE data usage policies had not been configured to restrict PII from flowing to a specific destination. The data flowed anyway. What makes this particularly worth understanding: even if the policies had existed, they would not have blocked anything on their own. Adobe's documentation confirms that all data usage policies (including core policies provided by Adobe) are disabled by default. Each policy must be manually enabled through the UI or API before it is considered for enforcement. Creating a policy is not the same as activating it. No enforcement mechanism was in place to stop the flow, and a policy that had been configured but never enabled would not have stopped it either.
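The difference between a policy that exists and a policy that is enforced can be shown with a simplified local model. This is not the Policy Service API: the `Policy` type, the `violations` function, the `exportToThirdParty` action name, and the `I1` label are used here purely for illustration, and real policies support richer label expressions than a simple set overlap.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Policy:
    """Simplified stand-in for a data usage policy (hypothetical model)."""
    name: str
    marketing_action: str
    denied_labels: frozenset[str]
    enabled: bool = False  # mirrors the disabled-by-default behavior described above


def violations(policies: list[Policy], marketing_action: str, data_labels: set[str]) -> list[str]:
    """Return the names of enabled policies that the given activation would violate.

    Policies that are not enabled are skipped entirely, no matter how
    well they are defined.
    """
    return [
        p.name
        for p in policies
        if p.enabled
        and p.marketing_action == marketing_action
        and p.denied_labels & set(data_labels)
    ]


# A policy that was created but never enabled.
restrict_pii = Policy("Restrict PII export", "exportToThirdParty", frozenset({"I1"}))

before = violations([restrict_pii], "exportToThirdParty", {"I1"})
after = violations([replace(restrict_pii, enabled=True)], "exportToThirdParty", {"I1"})
```

In this model `before` is empty and `after` contains the policy name. The definition did not change between the two checks; only the enabled flag did.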

Three distinct governance failures, on the same engagement, all of which were preventable if governance had been treated as a design constraint from the start rather than a configuration task to complete later.

The lesson: bring a data steward into the room before the first schema is designed. Governance decisions made late are expensive to make right.


Lesson 4: Sandbox strategy is an architectural decision. Don't treat it as an afterthought.

Adobe provisions five sandboxes out of the box. That number tends to create a false sense of abundance. Teams see five sandboxes and assume the allocation question is trivial. It is not.

One of the most overlooked steps at the start of an engagement is determining how many sandboxes the implementation will actually require beyond the standard five. This analysis needs to happen early, before contracts are signed and before a single sandbox is provisioned. The reason is straightforward: additional sandboxes beyond the default allocation are a licensable item.

If the architectural assessment reveals that a multi-entity deployment requires twelve sandboxes rather than five, that delta needs to be identified, budgeted for, and negotiated into the procurement contract upfront. Discovering the shortfall six months into an implementation, when business units are already operating and data is flowing, puts the team in a difficult position. The cost is no longer just financial. It affects timelines, re-architecture decisions, and sometimes the viability of entire use cases.

Doing this analysis early serves three purposes. It forces the right architectural conversation before anyone has committed to a structure that cannot easily be undone. It gives the business a realistic picture of what the platform will cost to operate at the required scale. And it gives the procurement team what they need to include the right sandbox entitlements in the contract from day one.
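The analysis itself can start as simple arithmetic. The sketch below assumes each entity needs its own sandbox per lifecycle environment, which will not hold for every organization. The function name, the entity names, and the environment list are hypothetical.

```python
DEFAULT_SANDBOXES = 5  # the out-of-the-box allocation cited above


def sandbox_shortfall(
    entities: list[str],
    environments: list[str],
    shared_sandboxes: int = 0,
    included: int = DEFAULT_SANDBOXES,
) -> tuple[int, int]:
    """Return (sandboxes required, sandboxes to license beyond the default).

    Assumes one sandbox per entity per environment, plus any sandboxes
    shared across entities (for example, a common training environment).
    """
    required = len(entities) * len(environments) + shared_sandboxes
    return required, max(0, required - included)


# Illustrative multi-entity deployment: four business units, three environments each.
required, to_license = sandbox_shortfall(
    entities=["retail", "banking", "insurance", "wealth"],
    environments=["dev", "uat", "prod"],
)
```

Four entities across three environments gives twelve sandboxes, seven more than the default five. That delta is the number procurement needs before the contract is signed.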

The first thing to understand about the sandboxes you do have is that not all production sandboxes are equal. There is always one primary production sandbox, and it cannot be deleted. It can be reset, but it cannot be removed from the organization. Additional production sandboxes, by contrast, can be deleted entirely, which changes how you think about their role and lifecycle in your architecture.

This distinction becomes especially significant in organizations that deploy multiple production sandboxes, with one sandbox per business unit or division. In that model, each business unit effectively operates its own isolated production environment within the same AEP organization. The primary sandbox carries permanent status regardless of how the organization evolves, so the decision about which business unit or use case it is assigned to at the outset is not easily reversed. Additional production sandboxes assigned to other divisions can be deleted and reallocated if business needs change, giving those environments a degree of flexibility the primary sandbox does not have. Understanding this asymmetry before you begin assigning sandboxes to business units prevents a situation where the wrong environment ends up with permanent, undeletable status.

The second thing to understand is that sandbox strategy is driven by organizational structure, not just technical requirements. If AEP is being deployed for a single corporate entity, the answer looks different than if it is being deployed for multiple sub-entities within a larger organization. In the latter case, you need to ask whether those entities plan to share data, whether regulations such as HIPAA or financial services rules prohibit such sharing, and how identity graph linking rules should be configured to prevent profile collapse across entities.

Whether the client has purchased the B2C, B2P, or B2B SKU also materially affects sandbox strategy. The data models, identity approaches, and use case patterns differ significantly across those SKUs, and the sandbox architecture should reflect those differences.

What can go wrong: The most common mistake I see at the start of an engagement is provisioning for only a Dev sandbox and a Production sandbox, with no dedicated UAT environment.

When UAT is conducted in the Dev sandbox, development and testing are happening in the same environment simultaneously. A failed test case cannot be reliably retested because changes may have been introduced between the first run and the retest. New development activity creates new variables that contaminate test results. Issues that appear during testing cannot be cleanly isolated from issues caused by in-flight development work.

On more than one engagement, this approach produced a go-live with open questions about whether the failures encountered during testing were product defects or environment contamination. A dedicated UAT sandbox eliminates that ambiguity entirely. It is worth the allocation.

The lesson: plan your sandbox architecture around your organizational structure, your SKU, your regulatory constraints, and your development lifecycle. Default provisioning is a starting point, not a strategy.

💡
"Sandbox strategy is not an IT configuration task. It is an architectural decision that reflects your organizational structure, your regulatory constraints, and your development lifecycle."

Lesson 5: Mocked data in UAT will lie to you

When production-quality data is not readily available during implementation, teams reach for mocked data. It is an understandable response to a real constraint. The business wants to see the platform working. The team needs something to test against. Mocked data fills the gap.

The problem is that mocked data is, by definition, clean. It is structured consistently. The identifiers are well-formed. The values are within expected ranges. The edge cases that real data surfaces (malformed records, unexpected nulls, case sensitivity conflicts, volume-driven latency, identity collisions that only appear when real customer behavior is represented) are absent from a mocked dataset entirely.

The approach that consistently produces better go-live outcomes is to point production data sources at the UAT sandbox for ingestion testing. This is not always straightforward to arrange, particularly in organizations with strict data access controls. But the effort to make it happen is worth it. What you see in UAT with real production data is a genuine preview of what you will encounter in production. Nothing else is.
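Even before a full production feed is wired to UAT, a quick profile of a sample of real source records can reveal how far the data departs from what mocked data assumes. The sketch below counts a few of the edge cases mentioned above for an email identifier. The function name, the `email` key, and the deliberately loose regular expression are assumptions made for illustration, not a validation standard.

```python
import re
from collections import Counter, defaultdict

# Loose shape check: something@something.something, with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile_identifiers(records: list[dict], key: str = "email") -> dict[str, int]:
    """Count common identifier problems in a sample of source records."""
    counts = Counter(missing=0, untrimmed=0, malformed=0, case_conflicts=0)
    variants: dict[str, set[str]] = defaultdict(set)

    for record in records:
        value = record.get(key)
        if value is None or value == "":
            counts["missing"] += 1
            continue
        cleaned = value.strip()
        if cleaned != value:
            counts["untrimmed"] += 1
        if not EMAIL_RE.match(cleaned):
            counts["malformed"] += 1
            continue
        # Track each distinct casing seen for the same lowercased value.
        variants[cleaned.lower()].add(cleaned)

    # A case conflict is one logical identifier arriving in more than one casing.
    counts["case_conflicts"] = sum(1 for seen in variants.values() if len(seen) > 1)
    return dict(counts)
```

A mocked dataset will typically return zeros across the board. A sample of real source data rarely does, and each non-zero count is a question to settle with the data owner before go-live rather than after.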

What can go wrong: On one engagement, UAT was conducted with a well-structured, cleanly defined mocked dataset. Testing passed. Edge cases were accounted for within the scope of what the team had imagined. The implementation went to production.

The production data was raw in ways the mocked data had not anticipated: inconsistencies in identifier formatting, records that violated assumptions baked into the schema design, and data quality issues that rippled into profile stitching and audience qualification. The issues were not insurmountable, but they required significant remediation work after go-live. Production-quality UAT data would have surfaced all of it before deployment.

The lesson: mock data tests your implementation against your assumptions. Production data tests your implementation against reality. Only one of those matters on go-live day.


The through-line

Each of these five lessons looks like a different problem. Schema design, identity strategy, data governance, sandbox architecture, UAT methodology. But they share a single root cause.

Every one of them reflects a decision made too early based on too narrow a frame. The schema designed for today's data rather than tomorrow's use cases. The identity architecture optimized for organizational preference rather than use case requirements. The governance framework deferred until there was something to govern. The sandbox allocation defaulted to what was convenient rather than what the implementation lifecycle actually required. The testing environment built on what was available rather than what was real.

AEP is a powerful platform. But its ceiling in any given implementation is set by decisions made before the first record is ingested. Getting those decisions right, or at least getting them less wrong, is what separates implementations that scale from implementations that plateau.

More posts on each of these topics are coming. If there is a specific area you want me to go deeper on first, subscribe and let me know.


Stephen Tettey is an Adobe Certified Multi-Solutions Architect with five years of experience implementing Adobe Experience Platform, Real-Time CDP, Journey Optimizer, Adobe Analytics, and Customer Journey Analytics across enterprise environments. He writes at stephentettey.com.