
Synthetic data: More than just make-believe

Synthetic data may not be real data, but it may have some important real-world use cases in digital marketing.

Marketing Technology on March 1, 2021 at 2:20 pm

Digital marketers work with real data all the time. What the online shopper does tells you a lot about what they want. But as we know, you need to be very careful about personally identifiable information (PII).

You can anonymize online shoppers by taking their names off their records before analyzing the data. Or you can use an algorithm to synthesize observed online behavior, and use that “synthetic data” for your analysis.
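As a rough illustration of the difference between the two approaches, here is a minimal Python sketch. The shopper records, the field names and the functions `anonymize` and `synthesize` are all hypothetical, and fitting an independent normal distribution to each column is only one simple way to synthesize data, not a method described by the people quoted in this article.

```python
import random
import statistics

# Hypothetical shopper records; names and values are invented for illustration.
shoppers = [
    {"name": "Ana", "visits": 3, "spend": 42.0},
    {"name": "Ben", "visits": 5, "spend": 77.5},
    {"name": "Cleo", "visits": 2, "spend": 18.0},
    {"name": "Dev", "visits": 8, "spend": 120.0},
    {"name": "Eli", "visits": 4, "spend": 55.0},
]


def anonymize(records):
    """Drop the direct identifier but keep every real row as observed."""
    return [{k: v for k, v in r.items() if k != "name"} for r in records]


def synthesize(records, n, seed=0):
    """Draw n new rows from a normal distribution fitted to each column.

    Assumption: columns are modeled independently, so any relationship
    between visits and spend in the real data is not preserved.
    """
    rng = random.Random(seed)
    visits = [r["visits"] for r in records]
    spend = [r["spend"] for r in records]
    v_mu, v_sd = statistics.mean(visits), statistics.stdev(visits)
    s_mu, s_sd = statistics.mean(spend), statistics.stdev(spend)
    return [
        {
            "visits": max(0, round(rng.gauss(v_mu, v_sd))),
            "spend": round(max(0.0, rng.gauss(s_mu, s_sd)), 2),
        }
        for _ in range(n)
    ]


anonymized = anonymize(shoppers)          # same rows, minus the name
synthetic = synthesize(shoppers, n=100)   # new rows that match no real shopper
```

The anonymized rows are still real observations, while the synthetic rows only resemble the originals in aggregate. That trade-off is the one the rest of this article discusses.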

That may seem like overkill. Why go through this effort when you have real data at your fingertips? Synthetic data will not be a replacement for real data, but it does have some specific use cases that a digital marketer may find useful.

Synthetic privacy in the real world

“The use case is important,” said Cem Dilmegani, founder of AIMultiple, an AI consultancy based in Germany. In this case, data privacy laws, like Europe’s GDPR, become a factor.

“As a marketer, you need data to run experiments and optimize pricing. This data includes personal data as well,” he said. “Personal data cannot be stored at a certain level.” Synthetic data should bypass the privacy issue by allowing digital marketers to simulate campaigns and outcomes.

“Synthetic data has some limitations when it comes to the representation of real customers and their actual behaviours,” said Maciej Pondel, a researcher and machine learning specialist at Unity Group, a digital commerce firm based in Wroclaw, Poland. “Nevertheless, in situations where there are some restrictions in regular data acquisition possibilities (e.g. GDPR compliance or the limited size of datasets), synthetic data can constitute an excellent representation.”

“[I]n most cases, the anonymization of real data seems better. Anonymized data includes all patterns pulled directly from reality,” Pondel added. “However, when there are sporadic cases or outliers in our data…traditional anonymization methods fail. Even if anonymization provides formal GDPR compliance, companies that don’t use synthetic data to protect outliers can lose their positive image if such anonymized data leaks out.”

Reality is messy — in a good way

Still, there are advantages to working with the real thing. With real-world data, an analyst can “tease out the nuances and hidden patterns not revealed by other techniques,” said Steven Ramirez, CEO of Beyond the Arc, a San Francisco Bay Area firm specializing in CX, strategic communications and data science. Using an algorithm to synthesize the same data, however, “can introduce a fatal flaw” in identifying those patterns of activity, he said.

Predictive modeling relies on multiple data sources, as well as groups of models, Ramirez said. “There is an opportunity to use synthetic data to extend data sets and provide more data where it is sparse.” It is up to the analyst to understand the integrity of each data source.

“[S]ynthetic data will never be as accurate as real data,” Pondel said. “Even if generated based on real patterns, synthetic data always misses the essential ‘reality factor’, which only makes it useful in a limited number of business cases.”

“You magnify problems getting further away from source data,” Dilmegani said. Most algorithms will replicate the distribution in the source data. “Mistakes are replicated in the synthetic data as well.”
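To make that point concrete, here is a small Python sketch under stated assumptions. The source log, the misspelled label "emial" and the helper functions `fit` and `sample` are hypothetical, and sampling labels in proportion to their observed frequency stands in for whatever generator a real project would use.

```python
import random
from collections import Counter

# Hypothetical source log of campaign channels.
# "emial" is a data-entry mistake for "email" in 10% of the records.
source = ["email"] * 50 + ["emial"] * 10 + ["search"] * 40


def fit(values):
    """Estimate the share of each label in the source data."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def sample(dist, n, seed=0):
    """Generate n synthetic labels that follow the fitted shares."""
    rng = random.Random(seed)
    labels = list(dist)
    weights = [dist[label] for label in labels]
    return rng.choices(labels, weights=weights, k=n)


dist = fit(source)
synthetic = sample(dist, 10_000)

# The mistaken label shows up in the synthetic data at roughly its source rate.
source_share = dist["emial"]
synthetic_share = Counter(synthetic)["emial"] / len(synthetic)
```

The generator cannot tell that "emial" is an error, so it reproduces the label at about the same rate. Cleaning the source data first is the only fix.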

Mind the synthetic gap

Machine learning is very data hungry, Dilmegani pointed out. A need may emerge for data marketers to purchase synthetic data in order to have enough data to train an AI application. “This will drive the demand for synthetic data,” Dilmegani said.

For example, one application for synthetic data might be to train the AI that will operate a self-driving car. Synthetic data has also been used for the deep-learning applications needed for image processing, Dilmegani noted, a technique that has been around for almost a decade.

“I am skeptical about the uses of synthetic data,” Ramirez countered. “If you are building a machine learning/artificial intelligence model, it is not a good fit.” This goes to the heart of machine learning as it relates to artificial intelligence. About 60% to 80% of the work of building an AI model goes into acquiring and preparing the data, Ramirez explained. Indeed, this process “is the work.”

“The approach is to apply an algorithm or process to be able to create new data points,” Ramirez continued. “Synthetic data is produced by a process that is also subject to bias. Usually, we think of data as the ultimate source of truth…Often, we talk about letting the data speak,” he said. If the data is manufactured, then what is it saying?

“The smart application of synthetic data in training AI models can also exclude any bias that could be generated from AI models trained on real data,” Pondel said. “Regarding accuracy, in my opinion, synthetic data can be comparable to real data in a few cases.”

Synthetic prediction

Applying synthetic data to digital marketing is going to be an evolution, not a revolution. Applications will be narrow and need-driven. It will become another tool in the toolbox. “At the moment, I recognize simulations and model testing/verification as the most promising area of synthetic data applications,” Pondel said.

Machine learning is data intensive, so the demand for data may drive the use of synthetic data, added Dilmegani. Like many things in machine learning and AI, synthetic data will evolve, Ramirez said. As use cases narrow, digital marketers will get a better sense of when synthetic data is a good fit, and when it is not, he said.

