The future of the martech stack and marketing operations is ‘unstructured’
The next wave of data management disruption is coming, as AI churns out more unstructured data. Is your stack prepared?
Like many of you, I regularly incorporate AI tools to keep up with marketing and martech trends. They provide me with summaries and extract insights that I either agree or disagree with.
While researching unstructured data management trends, I came across this ChatGPT summary of various articles, and I immediately laughed out loud.
“….the upshot is that companies will need to treat unstructured data with the same care as structured data – applying governance and quality checks – because it’s becoming a critical input.”
My reaction was: “Oh #@%*! — the same care?! We’re going to be in trouble!”
If you’re a martech or MOps leader charged with the management and quality of your organization’s data, this isn’t necessarily uplifting news.
The impact and importance of unstructured data management is going to be my theme in 2025 because I believe generative AI will make this a pressing challenge. There is so much to unpack on this single topic. We literally cannot treat it with the “same care,” as that would risk productivity and quality issues, and even greater brand impacts.
What do we mean by unstructured data?
Unstructured data is any type of information or content that cannot be neatly defined and categorized into the traditional rows and columns of the martech and MOps super tools: Excel and Google Sheets.
Unstructured data could include social media posts, email body copy, customer reviews or any form of content. It can be short or long form. It just needs to be information that hasn’t traditionally been forced into a standardized format.
Other than the content category and/or media type and simple tracking metadata, unstructured data doesn’t have a standardized format we can easily confine to drop-downs in CRMs and marketing automation platforms (MAPs).
That doesn’t mean we haven’t tried. A classic customer survey, for example, may have a structured rating scale along with its unstructured free-text responses.
Because the unstructured data doesn’t fit neatly, its day-to-day utility is often minimized relative to other marketing data — even though we know it contains rich insights around customer sentiment, trends or product or service usage insights.
Structured and unstructured data create similar challenges
Put aside the emergence of generative AI for the moment. The foundational challenges of data management that cut across both structured and unstructured data are very similar.
They include:
- Governance: Both types of data lack agreed-upon processes and definitions.
- What is “good enough?” We all make tradeoffs daily to account for today’s issues, and campaigns still get launched. Real and/or artificial deadlines that force us to acknowledge a minimum standard will always exist.
- Pressure to get it done! New team members and fast-paced initiatives make it difficult to ever feel like we are making enough progress before the next wave of clean-up inevitably begins.
These foundational challenges will be tested even further by the influx of newly generated data and content, and by new AI-infused platforms and capabilities that will promise ever more unicorn-like benefits to organizations.
How big is the unstructured data problem?
A frequently cited study by IDC and Box conducted in 2022 found 90% of the data generated by organizations is unstructured. The findings were featured in a whitepaper published in 2023. Note that this timing was in parallel with generative AI starting to hit the mainstream.
We can, therefore, expect the proportion of unstructured data to rise even faster when we throw AI-generated content into the mix, as well as multi-modal (image, audio/voice and video) content.
That means we’re about to be hit with a wave of data management issues that will challenge any martech stack.
Is the unstructured data challenge being recognized?
In 2024, both Salesforce and HubSpot announced acquisitions of platforms specifically targeting unstructured data issues. Keep an eye on how this gets rolled out in new features and/or embedded capabilities later in 2025.
Not surprisingly, Scott Brinker and MarTech Tribe’s MarTech for 2025 report signaled this as an area to watch for 2025, as the team teased out these trends even further.
“The ability to handle unstructured data in the cloud is crucial for unlocking the value of generative AI”
“Generative AI is changing this by providing the ability to absorb and synthesize vast amounts of unstructured content into new creative use cases.”
Titles and personas: A simple, but impactful example
You have likely worked through, or are now working toward, an agreed-upon set of personas to develop targeted, customer-focused content. In most CRM/MAP systems, this results in a set of standard drop-down menus that are co-managed by sales and marketing ops.
In many organizations, this involves a combination of hierarchy-based (C-suite, manager, etc.) and role-based personas. But in reality, we all know a fully developed persona is more than someone’s job title, even though we all start there. People’s job titles are actually completely unstructured and are subject to both their own preferences on LinkedIn and the standards of their business.
To reconcile this, we’ve structured workflows like this: “contains [keywords],” in which we group profiles into “predefined personas.” While considered simple, these choices influence an entire series of one-time or ongoing campaign and content processes. They are the foundation for delivering your content to the right people across web, email and social channels.
Take, for example, someone with “contract” in their job title. Depending on additional context, this could mean a whole series of different roles and responsibilities, from legal and compliance, to sourcing/procurement, to full-time vs. consultant. That’s clearly three different personas with varying impacts on a buying process.
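To make the ambiguity concrete, here is a minimal sketch of a “contains keyword” rule. The persona names and keyword lists are hypothetical and are not taken from any particular CRM or MAP; the point is only that one keyword can legitimately match several personas at once.

```python
# Hypothetical keyword-to-persona rules; names and keywords are illustrative only.
PERSONA_KEYWORDS = {
    "legal_compliance": ["contract", "counsel", "compliance"],
    "sourcing_procurement": ["contract", "procurement", "sourcing"],
    "consultant": ["contract", "consultant", "freelance"],
}


def match_personas(job_title: str) -> list[str]:
    """Return every persona whose keyword list matches the free-text title."""
    title = job_title.lower()
    return [
        persona
        for persona, keywords in PERSONA_KEYWORDS.items()
        if any(keyword in title for keyword in keywords)
    ]


for title in ["Contracts Manager", "VP of Procurement", "Chief Marketing Officer"]:
    # A title that matches more than one persona is ambiguous and needs more context.
    print(title, "->", match_personas(title))
```

In this sketch, “Contracts Manager” matches all three personas, which is exactly the case where a title alone cannot settle the assignment.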
Consider this scenario: a contact is tagged initially into a legal/compliance persona because of the “contract” keyword in their title, but they never engage with content targeting that persona. A further analysis of unstructured data may be able to provide alerts that perhaps the title-based persona was incorrect.
We could also analyze other unstructured data, such as their public profile or internally captured data like email. If the contact was subsequently looped into meetings where transcripts or recordings are available, or if emails with your team started to reference an RFP process, then the sentiment and unstructured data could trigger a workflow that reassesses their categorization and potentially reassigns them to a sourcing/procurement persona. This would be hugely impactful in any account qualification or deal management process.
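A reassessment workflow like the one described could be sketched roughly as follows. Everything here is an assumption for illustration: the `Contact` fields, the signal phrases, the `min_signals` threshold and the persona label are hypothetical, and a real platform would more likely use an NLP or LLM classifier than a fixed phrase list.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical phrases that hint at a procurement role.
PROCUREMENT_SIGNALS = re.compile(
    r"\b(rfp|request for proposal|vendor selection)\b", re.IGNORECASE
)


@dataclass
class Contact:
    name: str
    persona: str
    persona_content_engagements: int = 0  # engagement with persona-targeted content
    texts: list[str] = field(default_factory=list)  # emails, meeting transcripts


def suggest_reassignment(contact: Contact, min_signals: int = 2) -> Optional[str]:
    """Suggest a new persona only when the current one shows no engagement
    and the unstructured text contains enough procurement signals."""
    if contact.persona_content_engagements > 0:
        return None  # the title-based persona appears to be working
    hits = sum(len(PROCUREMENT_SIGNALS.findall(text)) for text in contact.texts)
    return "sourcing_procurement" if hits >= min_signals else None


contact = Contact(
    name="Example Contact",
    persona="legal_compliance",
    texts=[
        "Can you send the RFP timeline?",
        "Our vendor selection committee meets Friday.",
    ],
)
print(suggest_reassignment(contact))
```

The sketch returns a suggestion rather than overwriting the persona, so a person in sales or marketing ops can still review the change before it affects campaigns.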
Opportunities and challenges await
A hopeful example of this embedded capability in MAP/CRM was highlighted in one of my articles last year. I am hopeful AI capabilities will allow us to track a customer’s “tone” to drive better personalization, based on prior interactions rather than predefined drop-downs or templates that are still very broadly defined.
We will need MAP/CRM platforms to help us navigate these challenges, by being even more transparent with their use of embedded AI.
For example: what is really happening when you check the box “Customer Conversations Data” in HubSpot’s new AI capabilities?

If these capabilities are turned on by default, they all appear to help generate even more content. I hope they can also be adjusted to first help us analyze and understand the data, rather than just assuming this happens correctly behind the scenes.
Where to start
You can prepare for the challenges of unstructured data by revisiting your foundational processes to account for its impact.
Governance procedures
Expand your processes and policies to include newly unstructured data — both externally sourced and internally managed.
Specifically, I recommend we start considering a new type of metadata to recognize the original data type (unstructured) and track if it’s subsequently transformed into structured data. We have to track the “data custody chain” similar to a lead source in attribution.
New, AI-infused martech platforms will process that data with natural language processing and LLMs, with the promise of structuring it for you. If we don’t capture the original context, it will be lost forever, given that generative AI is a prediction engine and not deterministic.
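One way to picture this “data custody chain” is as metadata that travels with each record. The sketch below is an assumption about what such metadata might look like; the class and field names (`DataRecord`, `CustodyStep`, `custody_chain`) and the tool name in the example are hypothetical and do not correspond to any existing platform schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CustodyStep:
    action: str        # e.g. "sentiment_classification"
    tool: str          # which model or platform did the structuring
    output_field: str  # the structured field that was produced
    timestamp: str     # when the transformation happened (UTC, ISO 8601)


@dataclass
class DataRecord:
    original_text: str                   # kept verbatim so context is never lost
    source: str                          # e.g. "customer_review", "email"
    original_type: str = "unstructured"  # the new metadata flag suggested above
    custody_chain: list[CustodyStep] = field(default_factory=list)

    def record_transformation(self, action: str, tool: str, output_field: str) -> None:
        """Append one step to the custody chain without touching the original text."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.custody_chain.append(CustodyStep(action, tool, output_field, stamp))


review = DataRecord(original_text="Setup took weeks, but support was great.",
                    source="customer_review")
review.record_transformation("sentiment_classification", "hypothetical-llm", "sentiment")
print(review.original_type, len(review.custody_chain), review.custody_chain[0].output_field)
```

Much like a lead source in attribution, the chain lets you answer later where a structured value came from and which tool produced it.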
Start determining what is ‘good enough’
We need to establish new KPIs that allow us to assess/rate the quality and impact of unstructured data. We can no longer rely on simple measures like completion rates or unique identifiers.
We need to develop new data sampling methods to gauge how AI-infused structuring processes are working. Remember, AI models are not rule-based like our CRM/MAP.
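As a starting point, a sampling check might look like the sketch below: pull a random sample of AI-structured records for human review, then measure how often the reviewer agrees with the AI’s label. The function names, the record shape and the choice of a simple agreement rate as the KPI are all assumptions for illustration, not an established standard.

```python
import random
from typing import Optional


def sample_for_review(records: list, sample_size: int, seed: Optional[int] = None) -> list:
    """Draw a random sample (without replacement) of records for human review."""
    rng = random.Random(seed)  # a fixed seed makes the audit sample repeatable
    return rng.sample(records, min(sample_size, len(records)))


def agreement_rate(reviewed: list[tuple[str, str]]) -> float:
    """Share of (ai_label, human_label) pairs where the reviewer agreed with the AI."""
    if not reviewed:
        raise ValueError("no reviewed records to score")
    matches = sum(1 for ai_label, human_label in reviewed if ai_label == human_label)
    return matches / len(reviewed)


# Hypothetical AI-structured records awaiting a spot check.
records = [{"id": i, "ai_persona": "legal_compliance"} for i in range(10)]
audit_sample = sample_for_review(records, sample_size=3, seed=42)
print(len(audit_sample))

# Hypothetical review results: (AI label, human label).
reviewed = [("legal", "legal"), ("legal", "procurement"),
            ("procurement", "procurement"), ("consultant", "consultant")]
print(agreement_rate(reviewed))
```

Tracking this rate over time gives a rough signal of whether the structuring process is drifting, which a completion rate on a required field would never show.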
Get it done
We will have to make tradeoffs within our original roadmaps, which focused on tackling mostly structured data. We will need to reallocate those resources and investments to start addressing unstructured data challenges. This will include new cross-functional efforts to work with customer service and product teams that own platforms outside of the martech stack for reviews, sentiment data, research, etc.
Notably, we should also start with an investment in AI literacy training for marketing and data management teams. Understanding why an AI hallucination happens will be critical to this next wave of data quality.
If we don’t invest in these efforts, the impact of unstructured data will be compounded at an exponential pace, as it is already mixing in with our existing data quality challenges. And the impact of that will NOT be the same problem we had before.
Contributing authors are invited to create content for MarTech and are chosen for their expertise and contribution to the martech community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. The opinions they express are their own.