
Vikram Chauhan

I build modern intelligence capabilities and write about AI-native operating models, enterprise strategy, leadership, and the future of software and data teams.

So, You Want to Do Personalization

May 16, 2026


Every executive I've worked with has asked for personalization. Almost none of them meant the same thing.

For some it is a first-name token at the top of an email. For others it is a recommendation widget on the website. For most it is something bigger and vaguer: every customer getting a unique experience tailored to who they are, what they care about, and what they have done with us. People often have all three in mind at once and assume they are roughly the same thing. They are not.

Let me describe what real personalization is to me, and then walk through what I believe it actually takes to do it. I will use one example the whole way through.

Imagine a customer named Sarah. She signed up two years ago and bought steadily for about a year. She made a purchase for her son. Then the orders stopped, and we have not heard from her since.

What would it take to send Sarah a message that wins her back, written for her and not for everybody?

It sounds like one task. It is actually six things that all have to be true at the same time. Get any one of them slightly wrong and the message either does not happen, happens to the wrong person, happens at the wrong time, or happens in a way that feels generic anyway.

First, we have to know it is Sarah. That means two things. The first is capture. Her purchase, her email signup, and her store visit each have to be recorded at the moment they happen, and often they are not. Leads from a partner promotion can die in a vendor portal. App events can be tracked for analytics but never make it to the customer record. Forms can ask the wrong questions, or no questions at all. Once a moment passes, it does not come back unless we ask the customer again.

Then we have to stitch what we did capture into one Sarah. Her purchase record, her email, her support history live in different systems and use different keys. Whether she shows up in our data as one person or three depends on whether we have linked those records. This sounds simple. It is not. The same customer often appears in five or six places under slightly different spellings, with different phone numbers, different emails over time, sometimes a maiden name and a married name. Without that linking, we send three messages to the same person on the same day and call it bad luck.

Second, we have to know enough about Sarah. Her name and email are not enough to write something that feels personal. Does she still buy, just on someone else's account or as a gift? Did her son age out of the thing she bought for him? Has she moved? Does she engage with our other product lines, not just the one she started with? The richness of what we know about her is the ceiling on what we can credibly say to her. Across the millions of customers in our database, often only a tiny fraction of the file has enough information attached to support a real personal message. Everyone else gets a name in the greeting and the same content as the next person on the list.

Third, we have to be allowed to talk to her. Sarah may have unsubscribed from marketing emails. She may have opted in to transactional notifications but not promotions. If we send her something she did not agree to, we are out of compliance and she loses trust in us. Today, the systems that hold her permissions may not always agree with each other. She may have unsubscribed in one place while another keeps sending. This is a legal problem and a customer-trust problem at the same time, and it gets worse as the database grows.

Fourth, something has to decide what to say to her. Even with a clean profile, full attributes, and the right permissions, someone or something has to choose Sarah's message. A discount on a bundle? An invitation to a flagship product launch? A family-oriented offer built around the purchase she made for her son? Today, that decision is often a campaign manager picking what to send to a list this week. Sarah is on the list or she is not. The decision was made for the list. It was not made for Sarah.

Fifth, the message has to actually exist. Whatever was decided, the creative has to be ready. The image. The copy. The right legal disclosure for the offer. Most teams produce a handful of versions of each campaign because that is what their workflow supports. Real personalization needs hundreds of versions, ready to be assembled on the fly. If the decision is "Sarah should get a family-oriented offer with a warm, household image" and there is no such creative in the library, the system falls back to whatever generic version exists. That is what almost always happens.

Sixth, we have to send it and watch what she does. Sarah opens, clicks, ignores, or unsubscribes. Whatever she does has to flow back to the part of the system that decided to message her in the first place, so the next decision is better than this one. If nothing flows back, the program never learns. The same things keep happening regardless of what works and what does not.

That is what one personalized message requires. Six things, all true at once, in roughly that order, with no shortcuts. When an organization says it wants personalization, what it is really saying is that it wants all six of those things to work together for every customer, every day, at the pace its marketing runs at. That is a much bigger commitment than most people realize when they ask.

It is the same shape in every business with a recurring customer relationship. A streaming service's Sarah is the subscriber who watched every week for a year and then stopped opening the app. A bank's Sarah is the cardholder who quietly moved her everyday spend to another card. A retailer's Sarah is the loyalty member whose basket shrank to nothing. A health system's Sarah is the patient who stopped booking her annual visit. A B2B software company's Sarah is the power user who churned three seats before anyone noticed. Same six links. Same failure modes. The model does not change. Only the vocabulary does.

This is why the industry sells personalization as a software purchase. Writing a check is easier than doing six hard things. It is also why those purchases so often disappoint. Buying a customer database does not give you any of the six. It gives you a better place to put the data. Everything else is real work that real people have to do.

I am not writing this to discourage anyone. Personalization is worth doing. It changes what we can offer customers, what we can sell partners, and how we keep the people who value what we do in our business for the long run. But it is only worth doing if we are honest about what it takes. Picking the wrong starting point is how organizations spend three years and end up with a fancier first-name token. Picking the right starting point is how they end up sending Sarah the message that brings her back.

If someone asks me where to start, I do not tell them to buy a platform. I tell them to find the weakest of the six and fix it. Then the next weakest. The chain only gets stronger one link at a time. There is no version of this where one purchase makes it all work.

That is the story worth telling honestly. The rest of this is what it takes to live it.

OK, so how do we do this?

None of it is theoretical. The pieces exist. Some you already have. Some you have to build. Some you have to fix. Here is the order I would do it in.

I would start with collection and identity. Every other piece depends on knowing it is Sarah. That means three things in parallel.

First, instrument every transactional touchpoint so identifiers are captured the moment a customer acts. Email at every form. Phone at every point of sale. Account ID at every login. Source attribution on every record. This is the active capture work, the moments where the customer explicitly gives us something.

Second, invest in telemetry. This is the passive behavioral signal we are usually barely collecting, and it is most of what we lack. The surfaces are everywhere a customer engages with us.

Owned digital first. The website, the store, the checkout flow, the support center. Every page view, scroll, session, internal search, and form abandonment is signal. The mobile app should be instrumented for opens, screens, session length, frequency, in-app purchases, and push notification response. Offline touchpoints layer on top: in-store visits, point-of-sale, loyalty redemptions, in-location app engagement. All of it tells us how she uses our properties and where she spends her time with us.

Social and content next. Instagram, TikTok, X, YouTube, Facebook. Follows, likes, comments, shares, video completion, story views, saves. Content engagement is the most direct read on what she cares about: which topics she reads about, which videos she watches, which content she comes back for, which series she finishes. Email and SMS data should go beyond open and click to the topic and link level: which subjects get her attention, which she ignores.

Then the long tail. Ad platform pixels (Meta, Google, TikTok, X) wired for conversion, retargeting response, and frequency. Connected TV and OTT engagement as cord-cutting continues. Customer service interactions (calls, chats, support tickets) and live participation (sweepstakes entries, surveys, partner activations). Each surface is a customer telling us something we are not currently listening to.

All of this requires modern tracking infrastructure, and in most organizations much of it is broken or partial today. Typical pixel tracking is client-side and increasingly unreliable as browsers restrict cookies and identifiers. The direction is server-side: capture the event on your own infrastructure first, then push it where it needs to go. It is the difference between continuing to lose signal and finally collecting what is already there.
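The server-side pattern is simple to state: our infrastructure records the event first, then copies it downstream. As a toy sketch, with every name and field my own illustration rather than any vendor's API:

```python
import time
import uuid

def capture_event(event: dict, event_log: list, forwarders: list) -> dict:
    """Persist the event on our own infrastructure first, then fan it
    out to downstream tools (analytics, ad platforms). Illustrative only."""
    record = {"event_id": str(uuid.uuid4()), "received_at": time.time(), **event}
    event_log.append(record)        # our store is the source of truth
    for queue in forwarders:
        queue.append(record)        # downstream copies can fail without losing the signal
    return record

# usage: a page view arriving from the website
our_log, analytics_queue = [], []
capture_event({"type": "page_view", "anon_id": "a-123", "path": "/checkout"},
              our_log, [analytics_queue])
```

The order of the two appends is the whole point: if a browser restriction or an ad-platform outage drops the downstream copy, the event still exists in our log.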

Most of this signal is the psychometric layer, the part of who Sarah is that she does not explicitly tell us. Her purchases tell us what she did. Her telemetry tells us what she cares about. Without it, we know what she bought but not why, and we cannot predict what she will respond to next. That gap is most of what stands between a generic message and a personal one.

Third, invest in a real identity-resolution capability that uses deterministic matches first, probabilistic matches second, and produces a confidence score on every link. Matching that relies heavily on fuzzy name and email comparison lands well short of where it needs to be: good enough to be tempting, not good enough to trust. Match accuracy has to climb, and the confidence score is what lets the business decide whether a match is good enough to email, call, or honor a legal opt-out.

Identity resolution has to span both the transactional records and the telemetry. Sarah's anonymous web session has to link to her account record when she logs in. Her social engagement has to link to her profile when she connects an account. Telemetry without identity resolution is noise. Identity without telemetry is a list of people we cannot personalize to.
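The deterministic-first, probabilistic-second shape, with a confidence score on every link, looks something like this toy sketch. The field names and thresholds are my own illustration; real systems use many more keys and tuned cutoffs:

```python
from difflib import SequenceMatcher

EMAIL_OK = 0.90    # confident enough to email (illustrative threshold)
OPT_OUT_OK = 0.60  # broad enough to honor a legal opt-out

def match_confidence(a: dict, b: dict) -> float:
    """Deterministic keys first, fuzzy name second. Returns a score
    the business can threshold per use case. Fields are illustrative."""
    if a.get("email") and a["email"].lower() == (b.get("email") or "").lower():
        return 1.0
    if a.get("phone") and a["phone"] == b.get("phone"):
        return 0.95
    # probabilistic fallback: fuzzy name, capped below any deterministic match
    sim = SequenceMatcher(None, a.get("name", "").lower(),
                          b.get("name", "").lower()).ratio()
    return round(0.8 * sim, 2)
```

The two thresholds are the business decision made explicit: a 0.75 fuzzy-name match may be good enough to suppress a send under an opt-out, but never good enough to start one.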

Consent comes next, and in parallel. Pick one system as the source of truth for what Sarah has agreed to. Make every other system (CRM, marketing platform, commerce platform, app, paid advertising audience syncs) check it before sending anything. Reconcile at send time, not overnight. Every change has to be logged with the source and timestamp, so we can prove what we did and when we did it.

Then expand what consent actually means. Today most organizations capture a binary. She opted in to marketing or she did not. The richer version is layered. Channel consent for email, SMS, push, direct mail, and advertising via PII match. Topic consent for the product lines, content categories, transactional alerts, premium offers, promotions, partner activations she actually wants. Frequency preference for how often she wants to hear from us. Format preference for long-form newsletter or short transactional. The combination gives us a permission set specific enough to power personalization, rather than a single yes/no that forces us to either send too much or send nothing.

Add the telemetry side. The server-side tracking we are building needs a Consent Management Platform that handles cookie consent, identifier capture, and behavioral tracking authorization in a way that meets every applicable framework: CCPA, GDPR for international customers, COPPA for minors, and the state-level privacy laws that keep multiplying. The CMP has to share state with the rest of the consent system, not run in parallel.

Then suppression and frequency. Even where consent is granted, we need rules that prevent over-messaging. Frequency caps across channels so no more than X emails go out per week and no SMS arrives within Y hours of an email. Suppression for in-flight campaigns. Recently-purchased suppression so we do not pitch her something the day after she bought it. Do-not-contact flags from service issues. These rules sit in front of every send and override the decision engine when triggered.

Then the rights side. Every customer has the right to access what we hold on them, the right to correct it, the right to be forgotten, and the right to opt out of sale or sharing for advertising. Each of those is a workflow. A request comes in. The right system fields it. The right teams act on it. The response goes back to the customer within the legal window. We need that workflow built and audited, not improvised when the first request lands.

This work is not glamorous. It is also the legal and trust risk that grows every day it is delayed. Done right, it is also the difference between a message Sarah tolerates and a message she values, because she told us what she wanted and we listened.

Then build out what we know about Sarah. Start by defining what "complete" means by use case. The profile fit to send her a win-back email is one shape. The profile fit for an account rep to call her is another. The profile fit for a partner activation that needs her age range, household composition, and propensity score is a third. We need to map the actions the business wants to take, and for each one specify the minimum attributes required. Without that map, "complete" is a fantasy that always recedes.

Most of the completeness work is plumbing from our own first-party data. Her purchase history at the product, price, and channel level. Her store history, online and in person. Her service spend, premium tier spend, add-on spend. Her app and web engagement signal from the telemetry layer above. Her email and SMS engagement patterns. Her customer service history. Her stated preferences from forms, surveys, and the app profile she sets up. Her loyalty membership and redemption activity. Each of these often lives in a different system and is partially or fully missing from the customer record.

Then the data we cannot generate ourselves and need to bring in from outside. Demographic context: age range, household income tier, household composition, lifestyle attributes. Address validation and geocoding so we know her distance to the nearest location and the region she lives in. Identity graph data that links her to identifiers we do not have, like alternate emails, household members, and device IDs. Modeled propensity scores from third-party providers where we lack the volume or feature richness to build our own. This is the enrichment layer, and it is rented capacity. We pay per record, per refresh. The pricing model means we have to be deliberate about which segments we enrich and how often, not blanket-buy everything for everyone.

Then the derived layer. Attributes we compute ourselves on top of the foundation. Lifecycle stage (prospect, first-time buyer, casual customer, subscriber, lapsed, won-back). Persona segment (family, young professional, business buyer, high-value loyalist). Churn risk. Upgrade propensity. Lifetime value. Predicted next purchase. These are model outputs that update on a defined cadence and live alongside the raw attributes on her record. They are the ones the decision engine will actually use most often.

Underneath all of this sits the schema, the taxonomy, and the refresh logic. The customer 360 has to be a real entity model. Sarah, her household, her account, her loyalty membership, her partner relationships, the interactions she had. Not a flat table of every column we ever collected. Each attribute has a defined refresh cadence. Her email engagement is real-time. Her demographic enrichment is annual. Data quality runs continuously: address validation, email validation, phone validation, deduplication. The taxonomy is governed so lifecycle stage values are standardized, persona segment names are standardized, and interest categories are standardized. The same word means the same thing across the building.

The work raises the personalization-ready population from a tiny fraction to most of the file. That is the unlock. Everything downstream, including the decision engine, the content selection, and the feedback loop, sits on whether we did this part well.

Next, the decision engine. This is the piece most organizations are missing. The engine has two halves. Rules and models.

Rules cover the cases that do not need machine learning. Lifecycle rules: a lapsed subscriber on day 60 of inactivity, in the win-back-eligible segment, gets queued for a win-back offer. Eligibility rules: this offer is for in-market customers only, this discount applies to customers who have not bought in 24 months. Suppression rules: do not contact her if she received an email in the last 72 hours, do not pitch a renewal if she just bought one. Frequency caps. The rules are buildable in months, they are auditable, they are explicable to a marketer or a salesperson, and they handle most of the routine decisions. They are where we start.

Models handle what rules cannot. Propensity scoring: how likely is Sarah to renew, to upgrade, to churn, to buy again, to try an adjacent product line, to convert on a particular offer. Lookalike modeling: who in our unreached file looks like our most engaged customers. Next-best-action models: of the five offers we could make to Sarah right now, which one has the highest predicted lift. Price sensitivity models: at what discount level does her conversion probability cross the threshold. Recommender systems for product and content. Each model is trained on our own data, registered in a model registry, versioned, monitored for drift, and re-trained on a defined cadence. Some of these we own and accumulate (propensity, churn, lookalikes) because the IP compounds. Some we rent until ours catch up.

Underneath the models sits the feature store. The same features (Sarah's recency-frequency-monetary score, her engagement intensity, her household propensity tier) are computed once and reused across every model. Without a feature store, every model team rebuilds the same features differently and the engine becomes inconsistent. With one, the features are governed, documented, and reusable. This is infrastructure most data teams underinvest in until the third or fourth model is in production and the mess becomes obvious.

Then real-time versus batch. Some decisions can be made overnight. Sarah's win-back campaign queue can be built once a week. But some decisions have to be made at the moment of action. When she opens the app, what hero card does she see. When she lands on the site, what offer does the page render. When she abandons a cart, what email fires within minutes. The engine has to support both modes: batch decisioning for scheduled campaigns and real-time decisioning for moment-of-truth interactions, running on the same customer foundation and the same rule and model logic.

Then the experimentation layer. Every meaningful decision the engine makes should be testable. Holdout groups so we can measure incremental lift, not just response rate. A/B tests across variants of an offer or a creative. Multi-armed bandits where we can support the operational complexity. Without an experimentation framework, the engine looks like it is working because metrics are up and to the right. With one, we know which decisions are actually causal.

Then the operator interface. Marketers and partnership teams have to be able to read the engine's decisions, understand why a decision was made, override it when business context demands, and tune rules without an engineering ticket. This is the difference between a decision engine that gets adopted and one that becomes a black box that nobody trusts. The interface matters as much as the math.

And underneath all of it, the decision log. Every decision the engine made, the inputs it had, the action it took, the response that came back. This is the audit trail for compliance, the training data for the next round of models, and the explanation surface when someone asks why Sarah got a specific message.

The engine is software on top of our customer foundation, plus our own intellectual property in the models, plus the operational discipline to run it. The combination is what no platform purchase delivers out of the box.

Alongside it, restructure content for variant supply. Most content functions produce a handful of finished creatives per campaign because that is what their workflow supports. Personalization needs the opposite shape entirely. A library of modular components that can be assembled into a finished message on demand: subject lines by audience, hero images by offer type, body copy by lifecycle stage, calls to action by channel, legal disclosures by promotion. The decision engine asks for the right combination and gets back an approved, ready-to-render asset.

Building that library is part workflow and part metadata. Each asset gets tagged so the engine can find it. Audience, offer type, channel, lifecycle stage, season, partner, language. Brand standards are enforced automatically: logo treatment, voice, color palette, prohibited combinations. Legal disclosures attach automatically based on offer type and channel: CAN-SPAM for promotional email, TCPA for SMS, sweepstakes rules for contests. Asset approval workflows route the right creative to the right reviewer. Asset versioning keeps the library fresh and prevents stale offers from going out.

Then generative AI for variant production. The volume the decision engine needs is impossible to produce at human-writer pace. AI writes the first drafts: ten subject line variants for the same offer, five hero image options at different emotional registers, copy adapted for each persona. Humans review, edit, approve, and tag. This compresses the cost of variant production by an order of magnitude. The model is not AI replacing the content team. It is AI giving the content team the leverage to keep up.

Then cross-channel adaptation. The same offer has to render differently in email, SMS, push, web, app, paid social, and in-store display. Modular components let the engine assemble the right format for each channel without re-producing the creative every time.

Then the performance loop. Which subject lines perform on which audiences. Which hero images drive higher conversion for the family persona. Which CTAs work better in SMS than in email. That data flows back into the variant library so weaker variants retire and stronger patterns get reinforced. The library learns alongside the decision engine.

Partner content is its own track. Co-marketing activations need co-branded creative produced with the partner and reported back on reach and engagement. That workflow has to plug into the same library and decisioning engine, not run in parallel. Otherwise we end up with two content operations, one for owned campaigns and one for partner campaigns, with duplicated tooling and inconsistent governance.

The output is that the decision engine never has to fall back to a generic version, because the specific version it needs already exists.

Finally, close the loop on every send. Activation infrastructure has to be reliable across every channel. Email and SMS go out through the marketing platform. Push notifications through the app stack. Paid social through PII match feeds into Meta, Google, TikTok, and X. Direct mail through a vendor. In-store displays through operations. Account rep outreach through CRM. Each channel needs a clean pipe from the decision engine to the activation point, with the right consent checks, suppression rules, and frequency caps applied in flight.

Then event capture coming back the other direction. Every send produces signal: open, click, conversion, ignore, unsubscribe, bounce, spam complaint. Every paid social impression. Every direct mail piece that drove a verifiable response. Every account rep call and its outcome. Every in-store interaction and the offer it redeemed. The signal has to attach to Sarah's profile and to the decision that produced the send, with timestamps, so we can trace cause and effect.

Then attribution. Which message drove which outcome. This is harder than it sounds in a multi-channel world where the email primed the decision, the retargeting ad reinforced it, and the SMS closed it. We need a measurement framework that includes multi-touch attribution for marketing-influenced revenue, holdout testing for incremental lift, and marketing mix modeling for the bigger picture. Lift, not just open rates. Incrementality, not just response rate. The question we have to answer is whether the campaign actually moved Sarah to do something she would not have done anyway, and that requires honest measurement, not vanity metrics.

Then the model retraining loop. The outcomes flow back to the feature store and the model registry. The propensity model that scored Sarah at 0.7 to renew either was right or was wrong. That outcome retrains the next version of the model. Over a year this turns into compounding improvement. Without it, the models stay frozen on the day they were deployed.

Then sales and service feedback. When an account rep calls Sarah and learns her son outgrew the thing she bought him but she has a younger daughter just getting into an adjacent product, that note has to flow back into the foundation. Customer service touchpoints generate the same kind of signal. The reps are the closest thing we have to a human-in-the-loop sensor on the customer record, and most organizations waste that signal by not capturing it back.

Then the partner side. Co-marketing partners need to see what their activation actually delivered: reach, frequency, engagement, attributed lift on whatever outcome they paid for. The reporting has to be clean, governed, and auditable, because the alternative is selling on unverified numbers that erode the partnership the first time the partner checks our math. Partner reporting runs on the same closed loop that powers our own personalization, just with the data shared back to the partner in a privacy-safe way.

Without the loop, the program is open-ended. We send, we report, we move on. The same things happen again next week. With the loop, every customer we message teaches the system how to message the next customer better, and every year the program is more effective than the one before.


Not all of this happens at once. None of it has to. The cheapest place to start is a collection inventory, going touchpoint by touchpoint and documenting what we are capturing and what we are not. It is a two-week exercise that tells us which downstream pieces are starved and where to spend first. The consent fix is the most urgent because the exposure is real today. Identity resolution and record completeness are the longest runway. Decisioning and content can be built incrementally as the foundation stabilizes. The feedback loop comes last in sequence and first in importance: it is what turns the program from a series of sends into a system that compounds.

If I had to name the weakest link, in a retailer or a bank or a health system, it is the connective tissue between the customer foundation and the activation systems. They have data. They have customer records. They have a marketing platform. What they do not have is identity resolution they trust, consent that reconciles, a decision engine that picks what each customer should hear, and a feedback loop that makes the system smarter. That is where the first dollar should go.

None of this is a platform you can buy. It is six hard things, done in order, kept alive. Do them and Sarah gets a message that sounds like it was written by someone who knows her, because it was. Skip them and she gets a first-name token, the same as everyone else, and she keeps not coming back. The difference between those two outcomes was never the software.

personalization data-strategy customer-data martech ai-strategy