4 ways to correct bad data and improve your AI

AI success starts with better data practices. Learn to turn flawed datasets into opportunities for smarter marketing analytics.


As marketing analytics rapidly evolves into an AI-driven field, one major challenge threatens to derail progress: bad data. While AI excels at turning vast amounts of information into actionable insights, its effectiveness depends on well-planned and well-managed datasets.

Bad data leads to poor predictions, bias, flawed insights and unintended outcomes. To address these risks, companies invest heavily in data cleaning, validation and governance, a process that is essential but also time-consuming and complex.

For analysts, prioritizing better measurement and understanding the business context behind their data is critical. That’s why analysts must lead the efforts to optimize data for AI. Here are four strategies to extract insights from flawed datasets while improving data hygiene and planning.

1. Identify corroborating data

It’s often possible to use other data sources to corroborate the metrics you’re trying to measure. For example, I worked with a retailer who considered their inventory data unreliable, which was a major issue. However, point-of-sale (POS) data revealed fast-moving SKUs whose sales had suddenly dropped to zero.

Although the inventory system showed low stock levels (but not depletion), the sales patterns clearly indicated an inventory issue affecting revenue. Using this insight, we adjusted replenishment thresholds and triggers to keep high-demand merchandise in stock, mitigating revenue loss.
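The cross-check described above can be sketched in a few lines of Python. This is a minimal illustration, not the retailer's actual logic: the function name `flag_suspected_stockouts`, the thresholds `fast_mover_min` and `zero_days`, and the sample SKUs are all hypothetical. It flags SKUs that were selling briskly, then recorded several consecutive zero-sales days, while the inventory system still reports stock on hand.

```python
from statistics import mean

def flag_suspected_stockouts(daily_sales, reported_stock,
                             fast_mover_min=5.0, zero_days=3):
    """Return SKUs whose sales stopped abruptly although stock is still reported.

    daily_sales: dict of SKU -> list of units sold per day, oldest first.
    reported_stock: dict of SKU -> units the inventory system claims are on hand.
    Thresholds are illustrative assumptions, not recommended values.
    """
    flagged = []
    for sku, sales in daily_sales.items():
        if len(sales) <= zero_days:
            continue  # not enough history to judge
        history, recent = sales[:-zero_days], sales[-zero_days:]
        was_fast_mover = mean(history) >= fast_mover_min
        sales_stopped = all(units == 0 for units in recent)
        stock_on_record = reported_stock.get(sku, 0) > 0
        if was_fast_mover and sales_stopped and stock_on_record:
            flagged.append(sku)
    return sorted(flagged)

# Hypothetical sample data.
daily_sales = {
    "SKU-A": [12, 15, 11, 14, 0, 0, 0],  # fast mover that suddenly stops
    "SKU-B": [1, 0, 2, 0, 0, 0, 0],      # slow mover; zeros are unremarkable
    "SKU-C": [9, 10, 8, 11, 9, 10, 12],  # healthy fast mover
}
reported_stock = {"SKU-A": 4, "SKU-B": 20, "SKU-C": 35}

suspects = flag_suspected_stockouts(daily_sales, reported_stock)
print(suspects)
```

The POS data is treated here as the more trustworthy source. A flagged SKU is a prompt to investigate the inventory record, not proof that it is wrong.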


2. Investigate the ‘bad reputation’

Sometimes, a dataset earns a bad reputation due to “noisy outliers” that receive disproportionate attention. While noticeable, these errors often represent a small proportion of otherwise accurate data. 

For instance, I worked on household policy data for a personal lines insurer. Some policies were wrongly grouped under the same household, while others that belonged together were split apart. We found that a few issues, such as incorrect or repeated addresses and policies sold by different agents, drove most of the errors. We cleaned the dataset by writing corrective code, turning it into a reliable resource.
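Corrective code of this kind often starts with normalizing the fields that drive the grouping. The sketch below is an assumption about what such a fix might look like, not the code used for the insurer: the record layout, the `ABBREVIATIONS` table and the choice of last name plus normalized address as the household key are all hypothetical. It ignores the selling agent when grouping and collapses repeated records.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real one would be far longer.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "apartment": "apt", "road": "rd"}

def normalize_address(raw):
    """Lowercase, strip punctuation and standardize common abbreviations."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [ABBREVIATIONS.get(token, token) for token in text.split()]
    return " ".join(tokens)

def group_households(policies):
    """Group policy IDs by (last name, normalized address), ignoring the agent."""
    households = defaultdict(set)  # a set drops repeated policy records
    for policy in policies:
        key = (policy["last_name"].strip().lower(),
               normalize_address(policy["address"]))
        households[key].add(policy["policy_id"])
    return {key: sorted(ids) for key, ids in households.items()}

# Hypothetical sample records.
policies = [
    {"policy_id": "P1", "last_name": "Rivera", "address": "12 Oak Street, Apt. 4", "agent": "A17"},
    {"policy_id": "P2", "last_name": "rivera ", "address": "12 oak st apt 4", "agent": "B02"},
    {"policy_id": "P1", "last_name": "Rivera", "address": "12 Oak St., Apt 4", "agent": "A17"},
    {"policy_id": "P3", "last_name": "Chen", "address": "12 Oak St Apt 4", "agent": "A17"},
]

households = group_households(policies)
for key, ids in households.items():
    print(key, ids)
```

A key based on name and address is only one possible rule. It would still merge unrelated people who share both, so any real rule set needs checking against a sample of known households.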

3. Differentiate between zero and null

Missing data can hinder decision-making. So, the first step is determining whether values are genuinely missing or simply recorded as zero. Understanding the logic behind how the data is generated is crucial, as “no activity” (zero) is not the same as “missing information” (null). If the data is truly missing, ask two questions:

  • Are there proxy values or variables that can estimate the missing values? This may involve experimentation with combined variables. 
  • Can the business question still be addressed using the available data? 

In most cases, missing data is more of a hurdle than an insurmountable obstacle.
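To see why the distinction matters, consider a small sketch in which `None` stands for a null and `0` for genuine "no activity". The function name `summarize_metric` and the sample values are hypothetical. The point is only that treating nulls as zeros changes the answer.

```python
from statistics import mean

def summarize_metric(values):
    """Count zeros and nulls separately and compare two ways of averaging."""
    observed = [v for v in values if v is not None]
    nulls_as_zero = [0 if v is None else v for v in values]
    return {
        "zero_count": sum(1 for v in observed if v == 0),
        "null_count": len(values) - len(observed),
        # Average over records that actually carry a measurement.
        "mean_observed": mean(observed) if observed else None,
        # What you would report if nulls were silently read as zero.
        "mean_if_nulls_were_zero": mean(nulls_as_zero) if values else None,
    }

# Hypothetical weekly purchase counts for one customer segment.
weekly_purchases = [4, 0, None, 6, None, 0, 10]

summary = summarize_metric(weekly_purchases)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the two nulls pull the average from 4 down to roughly 2.86 when they are read as zeros, which is the kind of silent distortion that checking the data-generation logic is meant to catch.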


4. Use random error to your advantage

Sometimes, bad data is too time-consuming to fix or outright unfixable. However, if the errors are random rather than systematic, they tend to cancel each other out when averaged over enough observations. This allows meaningful differences between groups or periods to still be measured.

For example, my team worked with web traffic data from two recently merged brands. Each brand had its own analytics platform, which provided slightly different measurements and faced visitor identification issues. 

Since there was no reason to believe one brand’s platform was significantly more flawed than the other, we assumed errors were random. The segmentation factors were similar across both brands, enabling us to analyze segment-level differences effectively. This combined segment-driven strategy saved the company millions.
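The reasoning can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reconstruction of the merged-brand analysis: the segment values, noise level, sample size and the helper `measured_mean` are all hypothetical. Zero-mean random noise is added to every reading, and the measured gap between two segments is compared with the true gap. A second run adds a constant bias to one segment to show what happens when the error is systematic instead.

```python
import random
from statistics import mean

def measured_mean(true_value, n, noise_sd, rng, bias=0.0):
    """Average of n noisy readings of true_value, optionally shifted by a constant bias."""
    return mean([true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)])

rng = random.Random(42)  # fixed seed so the sketch is repeatable

TRUE_A, TRUE_B = 100.0, 110.0  # hypothetical true values for two segments
N, NOISE_SD = 5000, 20.0       # readings per segment and size of the random error

# Random error only: both segments are measured with the same zero-mean noise.
random_diff = (measured_mean(TRUE_B, N, NOISE_SD, rng)
               - measured_mean(TRUE_A, N, NOISE_SD, rng))

# Systematic error: segment B's readings are all shifted by +5.
biased_diff = (measured_mean(TRUE_B, N, NOISE_SD, rng, bias=5.0)
               - measured_mean(TRUE_A, N, NOISE_SD, rng))

print(f"true difference:              {TRUE_B - TRUE_A:.2f}")
print(f"measured, random error only:  {random_diff:.2f}")
print(f"measured, with biased source: {biased_diff:.2f}")
```

With thousands of readings, the random-error estimate lands close to the true difference, while the biased one is off by about the size of the bias. That is why the assumption that neither platform is systematically worse than the other carries so much weight.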

Making the most of flawed data in an AI-driven world

These strategies are not exhaustive, as every data challenge is unique. However, too often, companies abandon flawed datasets prematurely, focusing solely on the lengthy process of fixing the data. These interim strategies demonstrate how valuable insights can still be extracted from imperfect datasets.

At the same time, companies must not feel constrained by their current data. In many cases, generating new, more relevant data can happen quickly, particularly in digital marketing. By using corroborating data, investigating a dataset's bad reputation, distinguishing between zeros and nulls and using random error to their advantage, analysts can unlock the value in flawed datasets and help build a strong foundation for AI-driven success.




Contributing authors are invited to create content for MarTech and are chosen for their expertise and contribution to the martech community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. The opinions they express are their own.


About the author

Shiv Gupta
Contributor
Shiv Gupta helps clients develop data, analytics & digital strategies to drive compelling relationships with customers and employees. Shiv brings over 18 years of data-driven marketing experience at leading brands and consultancies including Exelon, Farmers Insurance, Merkle, Prophet, and Lippincott - Oliver Wyman. Shiv has also led strategic engagements with a diverse portfolio of blue-chip clients such as Anthem Blue Cross, Intel, Guardian Life, Novant Health, Crate & Barrel, and others.

A noted expert on marketing effectiveness and the use of data and technology to advance growth strategies, Shiv’s work has been broadly recognized for its innovative approach to retention and profitable loyalty. He is a regular speaker at conferences and has been interviewed by or published in numerous publications including Financial Times, Ad Age, Target Marketing, and Loyalty Management.

Shiv has deep knowledge and expertise in developing and executing data-driven marketing strategies with Fortune 500 companies. This includes building the first marketing analytics department at Farmers Insurance, where he was recognized as a Frost & Sullivan “Growth Best Practices” business leader. As the principal and CEO of Quantum Sight Marketing, his focus is helping clients navigate the complex landscape of data and technology to achieve clear pathways to growth and profitability.

Shiv has experience in the insurance, healthcare, energy, retail and CPG industries and is an MBA graduate of the University of Chicago Booth School of Business. Currently, Shiv is also a regular contributor to MarTech.
