Bandits have taken over marketing decisioning, and that’s a good thing

Marketing decisioning has shifted from rules to AI — fueled by data warehouses and advanced models like reinforcement learning.

Marketers have been using decisioning technology for decades — automated rules that decide which email to send, when to follow up or how to score a lead. What’s new isn’t the idea of automation, but the intelligence behind it. Rule-based systems have given way to AI engines that learn, adapt and personalize in real time.

The real breakthrough wasn’t just AI but the rise of cloud data warehouses. By unifying customer data at scale, they gave AI the raw material it needed to move beyond scripts and make smarter decisions. 

Now, platforms can sit on top of the warehouse and the customer data layer, offering marketers intuitive tools that activate data daily and deliver personalization at the individual level.

The foundation: Decades of marketing decisioning

Marketing decisioning predates the mainstream internet. When Unica launched in 1992, back when the web had only about 6 million users, it pioneered software-driven campaign management, featuring automated workflows and lead management. This was decisioning at its most basic: if-then logic that automated repetitive tasks but lacked the data richness for meaningful personalization.

These early systems faced significant constraints. Data lived in silos: email platforms tracked engagement, CRMs tracked deal progression and web analytics tracked digital behavior. No single system had the complete customer picture required for advanced decisioning. The result was automation that could follow scripts but couldn’t truly learn or adapt.

Platforms could handle basic segmentation (“send discount emails to price-sensitive prospects”) and sequential workflows (“if a prospect downloads a whitepaper, add them to a nurture sequence”). But they struggled with cross-channel orchestration and real-time adaptation.

When customer behavior shifted, marketers had to rebuild segments or create new journey branches manually. The decisioning was automated — but the intelligence was still human, manually revised at every turn.

The data warehouse: Enabling AI decisioning for marketing

The emergence of cloud-based data warehouses like Snowflake, Databricks and BigQuery transformed what was possible for marketing decisioning. For the first time, organizations could store and process large volumes of customer data from multiple sources in a single, unified system. This foundation made AI-driven decisioning feasible at enterprise scale.

Data warehouses solved the core problem that limited earlier systems: data fragmentation. Instead of isolated data points, AI systems could now access complete customer profiles — purchase history, browsing behavior, engagement patterns, demographics, lifecycle stage and preferences — all unified in one place.

They also enabled composable architecture. Rather than being locked into a platform’s limited data model, organizations could combine best-of-breed tools: 

  • CDPs for unification.
  • Data warehouses for storage and computation.
  • Specialized AI decisioning layers for optimization. 

This composability created the technical foundation for the AI techniques that now define modern marketing decisioning.

The three pillars of AI decisioning

Hightouch, in its technical blog series, outlines three interconnected technologies — reinforcement learning, multi-armed bandits and contextual bandits — that work together to deliver personalization at scale.

1. Reinforcement learning: Learning through experience

Reinforcement learning gives AI systems a framework for discovering what works through systematic, ongoing experimentation. Just as marketers build intuition through campaign cycles — design, launch, measure, adjust — reinforcement learning automates that process with a structured feedback loop.

An AI agent selects an action, interacts with the environment (your customers), receives a positive or negative response and chooses the next action based on its experience and desired outcomes. In marketing terms, the agent is the AI system deciding what, when and how to send to each customer. Actions can include:

  • Whether to send a message.
  • Which channel to use.
  • When to send it.
  • Which offer to include.
  • What message type to deliver.

The environment is the customer and their data. Rewards are the outcomes optimized for — purchases, clicks, opens or metrics like lifetime value. Over thousands of interactions, the system develops an increasingly sophisticated understanding of individual preferences while identifying broader patterns across the customer base.
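The agent, action, environment and reward loop described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the action names, the hidden response rates and the `explore_rate` parameter are all invented for the example, and the loop is stateless, which makes it closer to a simple bandit than to a full reinforcement learning system that tracks customer state over time.

```python
import random

# Hypothetical actions and hidden response rates, invented for this sketch.
ACTIONS = ["email", "sms", "push", "no_send"]
TRUE_RESPONSE_RATE = {"email": 0.10, "sms": 0.04, "push": 0.06, "no_send": 0.0}


def customer_responds(action, rng):
    """Simulated environment: reward 1.0 if the customer converts, else 0.0."""
    return 1.0 if rng.random() < TRUE_RESPONSE_RATE[action] else 0.0


def run_feedback_loop(steps=5000, explore_rate=0.1, seed=0):
    """Select an action, observe the reward, update the estimate, repeat."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}  # running estimate of reward per action
    count = {a: 0 for a in ACTIONS}    # how many times each action was tried
    for _ in range(steps):
        # 1. The agent selects an action: usually the current best estimate,
        #    occasionally a random one so that it keeps learning.
        if rng.random() < explore_rate:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: value[a])
        # 2. The environment (the simulated customer) returns a reward.
        reward = customer_responds(action, rng)
        # 3. The agent folds the reward into an incremental average.
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value, count


if __name__ == "__main__":
    estimates, tries = run_feedback_loop()
    for action in ACTIONS:
        print(f"{action}: tried {tries[action]} times, "
              f"estimated reward {estimates[action]:.3f}")
```

In a production system the reward would come from real outcomes such as purchases or clicks rather than a simulated response rate.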

This level of experimentation far exceeds human capacity. While marketers might test a few hypotheses per week across segments, reinforcement learning can test thousands of combinations of messages, timing and channels at the individual level — continuously refining what drives results.

2. Multi-armed bandits: Balancing testing and performance

Reinforcement learning provides the framework for learning, but multi-armed bandits address a key challenge: balancing the need to test new approaches with the need to scale what already works. 

The name comes from the classic casino problem — faced with a row of slot machines (i.e., one-armed bandits) with unknown payouts, how do you maximize winnings with limited resources?

In marketing, each arm is a choice: 

  • Subject lines.
  • Send times.
  • Creative templates.
  • Offers. 

Traditional A/B testing typically splits traffic evenly across these options and holds that split until the test ends, which can take weeks or months and keeps sending traffic to weaker variants the whole time. Multi-armed bandits allocate more traffic to better-performing options while continuing to test new ones, cutting both the time and cost of optimization.

For example, if an email campaign launches with five subject lines, traditional testing might compare them in pairs and declare a winner after days or weeks. Multi-armed bandits test all options simultaneously, then gradually shift traffic toward stronger performers while still exploring alternatives in case their performance changes.
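The traffic-shifting behavior in the five-subject-line example can be sketched with Thompson sampling, one common bandit strategy. The article does not say which strategy any particular platform uses, so treat this as an assumption. The open rates below are made up, and `thompson_sampling` is a hypothetical helper name.

```python
import random


def thompson_sampling(true_open_rates, sends=10000, seed=0):
    """Return how many sends each subject line received.

    Each arm keeps a Beta posterior over its open rate. For every send we
    draw one sample per arm and send the subject line with the highest draw,
    so stronger arms win more often while weaker arms still get tried.
    """
    rng = random.Random(seed)
    n_arms = len(true_open_rates)
    opens = [0] * n_arms      # observed opens per arm
    non_opens = [0] * n_arms  # observed non-opens per arm
    for _ in range(sends):
        samples = [
            rng.betavariate(opens[i] + 1, non_opens[i] + 1)
            for i in range(n_arms)
        ]
        arm = samples.index(max(samples))
        # Simulated recipient: opens with the arm's hidden true rate.
        if rng.random() < true_open_rates[arm]:
            opens[arm] += 1
        else:
            non_opens[arm] += 1
    return [opens[i] + non_opens[i] for i in range(n_arms)]


if __name__ == "__main__":
    # Hypothetical open rates for five subject lines; the last is strongest.
    rates = [0.10, 0.12, 0.11, 0.13, 0.30]
    for i, n in enumerate(thompson_sampling(rates)):
        print(f"subject line {i}: {n} sends")
```

Because every arm keeps some chance of being drawn, the system can notice if a previously weak subject line starts to perform better, although handling rates that drift over time would need extra machinery such as discounting old observations.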

This method extends beyond single variables. Instead of testing subject lines and send times separately, multi-armed bandits can optimize entire campaigns — exploring thousands of combinations of timing, content and offers while dynamically steering toward the best results.
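To see how quickly combinations reach the thousands, the sketch below treats every combination of subject line, send hour, template and offer as one arm. All the option lists are placeholders invented for the example.

```python
from itertools import product


def build_arms(subject_lines, send_hours, templates, offers):
    """Enumerate every campaign combination as a single bandit arm."""
    return list(product(subject_lines, send_hours, templates, offers))


# Placeholder options, invented for illustration.
subject_lines = [f"subject_{i}" for i in range(10)]
send_hours = list(range(8, 22, 2))                  # 8:00 to 20:00, every 2 hours
templates = [f"template_{i}" for i in range(5)]
offers = ["none", "free_shipping", "10_percent_off", "bundle"]

arms = build_arms(subject_lines, send_hours, templates, offers)
# 10 subject lines * 7 send hours * 5 templates * 4 offers = 1400 arms
```

Treating each combination as an independent arm is the simplest design, but every arm then needs its own data before the system can judge it. That is one reason the approach benefits from sharing information across similar options and similar customers.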

3. Contextual bandits: Personalizing every decision

While multi-armed bandits identify the best option for the average customer, contextual bandits personalize decisions using individual data. Instead of asking “what works best for everyone?” they ask “what works best for this person, right now?” by incorporating context such as purchase history, browsing behavior, engagement patterns and demographics.

For example, the AI might evaluate how a customer who recently browsed cordless drills, engages heavily with email and rarely responds to discounts would react to different combinations of content, timing and offers.

Contextual bandits are powerful because of their dual-level learning. When one customer clicks a discount email on Wednesday evening, the system learns about that individual, while also updating its understanding of how similar customers respond — building both group-level and personal insights.
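A contextual bandit can be sketched by keeping separate statistics for each pairing of context and arm. In this simplified version the context is a coarse tuple of customer traits, so every customer who shares those traits contributes to the same estimates. This illustrates the group-level half of the learning described above; truly individual-level learning would need richer features or a model that generalizes across them. The trait names, arm names and click rates are all hypothetical.

```python
import random
from collections import defaultdict

ARMS = ["discount", "new_arrivals", "how_to_guide"]

# Hypothetical hidden click rates per context, invented for this sketch.
TRUE_CLICK_RATE = {
    ("browsed_drills", "ignores_discounts"): {
        "discount": 0.05, "new_arrivals": 0.10, "how_to_guide": 0.35},
    ("browsed_drills", "likes_discounts"): {
        "discount": 0.35, "new_arrivals": 0.10, "how_to_guide": 0.05},
}


class ContextualThompson:
    """Thompson sampling with one Beta posterior per (context, arm) pair."""

    def __init__(self, arms, seed=0):
        self.arms = list(arms)
        self.rng = random.Random(seed)
        self.stats = defaultdict(lambda: [0, 0])  # [clicks, non-clicks]

    def choose(self, context):
        def draw(arm):
            clicks, misses = self.stats[(context, arm)]
            return self.rng.betavariate(clicks + 1, misses + 1)
        return max(self.arms, key=draw)

    def update(self, context, arm, clicked):
        self.stats[(context, arm)][0 if clicked else 1] += 1


def simulate(rounds=4000, seed=0):
    """Return, for each context, how often each arm was chosen."""
    env_rng = random.Random(seed + 1)  # separate stream for the environment
    bandit = ContextualThompson(ARMS, seed=seed)
    contexts = list(TRUE_CLICK_RATE)
    for _ in range(rounds):
        context = env_rng.choice(contexts)  # a customer arrives
        arm = bandit.choose(context)
        clicked = env_rng.random() < TRUE_CLICK_RATE[context][arm]
        bandit.update(context, arm, clicked)
    return {c: {a: sum(bandit.stats[(c, a)]) for a in ARMS} for c in contexts}


if __name__ == "__main__":
    for context, pulls in simulate().items():
        print(context, pulls)
```

The same three arms end up with different winners depending on the context, which is the difference between asking what works for everyone and asking what works for this kind of customer.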

How the components work together

Together, these three technologies create a marketing system that operates very differently from traditional approaches. 

  • Reinforcement learning provides the framework for experimentation.
  • Multi-armed bandits balance exploration with performance.
  • Contextual bandits personalize decisions using complete customer profiles unified in the data warehouse.

The result is AI decisioning that can process hundreds of customer features and thousands of actions while making real-time, individual decisions. Instead of building granular segments and complex journeys, marketers configure AI agents that automatically make decisions for each customer.

This scale is only possible with the computational power of the data warehouse. Without it, reinforcement learning couldn’t run the thousands of experiments needed for individual optimization.

Not every platform uses the same mix of techniques. Vendors may combine reinforcement learning with other machine learning methods, such as predictive models or natural language processing, to achieve similar results.

The future of marketing work in an AI-driven world

As AI systems take over decisions on timing, channels and personalization, marketers’ roles will shift toward higher-level strategy: defining objectives, curating content and ensuring AI operates within brand and ethical boundaries.

The organizations that succeed will be those that pair technical sophistication with human insight into what drives meaningful customer relationships.

Contributing authors are invited to create content for MarTech and are chosen for their expertise and contribution to the martech community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. MarTech is owned by Semrush. Contributor was not asked to make any direct or indirect mentions of Semrush. The opinions they express are their own.


About the author

Ana Mourão
Contributor
Ana Mourão is an Experimental Marketer with extensive experience in helping large, complex B2B2C companies make CRM and digital marketing decisions with incomplete data using an experimentation framework. She is passionate about applying this framework to enable large organizations to make informed and effective CRM and digital marketing decisions, even when data is incomplete. Ana has successfully led the selection and implementation of a customer data platform, established compliance and data governance protocols, and collaborated with data science teams and other key stakeholders to deliver impactful insights and activations. Additionally, she is a lifelong learner and a certified professional in growth leadership, marketing leadership, retention and engagement, negotiation, and web analytics.