Dirty Little Data Secrets: 5 Potential Inaccuracies In Your Social Media Reports
Remember that time long, long ago (in a land far away) when the term “web analytics” hadn’t even been coined? Fast-forward to 2013: web analytics data is the tip of the iceberg, “big data” is readily available, and many new tools on the market do the work of collecting data, storing it, manipulating it and reporting on it.
No doubt about it, Data-Land is a much sweeter place than No-Data-Land, and as marketers, we’re lucky to have arrived here. Data provides insights that enable us to market smarter, to optimize faster, and to return provable results.
And yet, a dirty little secret about Data-Land keeps the marketing world up at night (especially the night before we present our data)…
At least some of this data is probably wrong.
And that’s ok – if you know where the holes are. Understanding where data falls short will help you field questions and challenges and maybe even fix at least some of the inaccuracies before others notice them.
Here are 5 examples of potentially inaccurate numbers you may be using in your social media reports and dashboards:
Visits To Your Website From Social Channels
Let’s start with something simple – or seemingly simple. How much traffic did Twitter, Facebook, or YouTube drive to your website? In theory, you should be able to find the answer quickly and easily in Google Analytics (GA). But be careful! You may have already noticed that social traffic numbers from GA’s Social Reports (Google Analytics > Traffic Sources > Social > Network Referrals) seem to undercount the traffic that was truly generated by social sources.
One of the reasons for this is that the numbers in GA Social Reports don’t always include all of the traffic generated from links you send out with custom UTM parameters. For example, a link tagged like this (an illustrative URL) may not get counted in the number reported in Social Reports:

http://www.example.com/?utm_source=facebook&utm_medium=social&utm_campaign=spring-launch
This happens because GA matches social referrals based on the domain name (facebook.com), so the source in your URL, “facebook,” is not categorized as a social media site by GA. (In other words, GA counts the visit as a visit from “facebook,” but doesn’t see “facebook” as a social site.)
You can get a better count from GA by going to Traffic Sources > Referrals and searching for the name of a social site, for example, “twitter.” But even this won’t provide you with a full count, because the t.co referral won’t show up. (Twitter automatically wraps links shared in tweets in its t.co shortener, so those visits are referred by t.co rather than twitter.com.)
To get more accurate data from GA on social visits, try setting up an advanced filter using a RegEx (here is more background on why these issues exist in GA and how to set up the RegEx filter).
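As a rough sketch of what such a filter accomplishes, here is a minimal Python example that buckets visit sources as social using a single regular expression. The pattern and the sample source strings are hypothetical, not GA’s actual filter syntax – the point is that one pattern can catch the bare domain, the custom UTM value, and the t.co wrapper at once:

```python
import re

# Hypothetical pattern: matches domain referrals (facebook.com, t.co)
# as well as bare custom utm_source values ("facebook", "twitter").
# Extend the alternation for the networks you actually use.
SOCIAL_SOURCE_RE = re.compile(
    r"(facebook|twitter|t\.co|youtube|linkedin)", re.IGNORECASE
)

def is_social(source):
    """Return True if a visit's source string looks like a social network."""
    return bool(SOCIAL_SOURCE_RE.search(source))

# A few sample sources, as an analytics tool might record them:
sources = ["facebook.com", "facebook", "t.co", "google", "newsletter"]
social_only = [s for s in sources if is_social(s)]
```

With this approach, “facebook.com,” “facebook,” and “t.co” all land in the social bucket, while “google” and “newsletter” do not.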
Socially Assisted Conversions
Attribution (the practice of crediting a source like social media with an “assist” if social touched a prospect somewhere along the way to becoming a customer) is a handy concept, especially for social media marketers.
Unfortunately, social media “assists” may also be significantly undercounted by GA for the same reason discussed above: only the visits or conversions attributed to “facebook.com” are credited as being referred by a social media site. Other sources that should be categorized as social, like a URL with the custom parameter “utm_source=facebook” or a t.co referral, don’t get counted as a social media assist.
This is not just a GA problem. The number of leads and socially assisted sales reported by tools that place code on your site for tracking (for example, many marketing automation tools) often under-report for the same reason.
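To make the undercount concrete, here is an illustrative Python sketch (the conversion paths and source names are invented): a naive counter that only credits the bare domain misses assists that a source-normalizing counter catches.

```python
# Map the many ways a social visit can be recorded to one canonical channel.
# These mappings are illustrative, not any tool's actual configuration.
CANONICAL = {
    "facebook.com": "facebook",
    "facebook": "facebook",      # custom utm_source value
    "t.co": "twitter",           # Twitter's link-wrapper domain
    "twitter.com": "twitter",
}

def naive_social_assists(conversion_paths):
    """Credits only the bare domain -- the way the undercount happens."""
    return sum(1 for path in conversion_paths if "facebook.com" in path)

def normalized_social_assists(conversion_paths):
    """Credits any touchpoint that maps to a canonical social channel."""
    return sum(
        1 for path in conversion_paths
        if any(src in CANONICAL for src in path)
    )

# Each inner list is one customer's sequence of visit sources.
paths = [
    ["google", "facebook", "direct"],   # social touch via a UTM value
    ["t.co", "direct"],                 # social touch via t.co
    ["google", "direct"],               # no social touch
]
```

Here the naive counter reports zero social assists while the normalized counter correctly finds two – the same gap you may see between a tracking tool’s report and reality.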
Clicks On Shortened URLs
If you are including clicks on links in your social reports, and the source of your data is a URL shortener (either a stand-alone shortener like Bit.ly or one built into your SMMS), you may be reporting numbers that are inflated by bots and crawlers.
Some of these clicks are filtered out in reports generated by the URL shortener you are using. However, if you compare the number of clicks reported by the URL shortener to visits data from Google Analytics, you’re likely to see a significant difference (this is probably the primary reason behind the discrepancies in Twitter traffic noted recently in this post).
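One simple sanity check is to compute the ratio of shortener-reported clicks to analytics-reported visits for the same links. The numbers below are invented for illustration; a ratio well above 1.0 suggests bot and crawler inflation:

```python
def inflation_ratio(shortener_clicks, analytics_visits):
    """Clicks per recorded visit; values well above 1.0 suggest bot traffic."""
    if analytics_visits == 0:
        return float("inf")
    return shortener_clicks / analytics_visits

# Hypothetical monthly totals for the same set of shortened links:
ratio = inflation_ratio(shortener_clicks=1450, analytics_visits=980)
```

A ratio near 1.5, as in this made-up example, would mean roughly a third of reported “clicks” never showed up as human visits – worth knowing before those clicks land in a dashboard.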
Monthly Facebook Metrics
Facebook is a key channel for many brands, so your reports and dashboards likely include some metrics that come from Facebook Insights. Here is a problem to watch out for: Facebook provides many metrics in daily, weekly, and 28-day increments – not on a monthly basis (the exception is post-level metrics, which are reported as lifetime values).
At least some of your reports and dashboards are probably based on monthly metrics. If so, you have two choices:
- Use the Facebook 28-day number: the problem with this is that the 2-3 days you are missing could skew your total dramatically.
- Calculate a 30-day metric: this is perfectly legitimate for some metrics, but not for others. Facebook metrics that are based on unique counts cannot be turned into a 30-day number (without using an estimate). More on this here.
The key is simply to know what you are reporting on, which isn’t a problem if you are pulling data manually from Facebook Insights. If you are getting your Facebook data from a tool, make sure you understand how each metric is calculated, and pay close attention to any unique metric (People Talking About This, for example) that is reported as a monthly number.
Here is one more thing to watch out for with Facebook data: delivery of Facebook data is often delayed by about two days, so running a monthly report on the first of each month will generally produce incomplete metric values.
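A simple guard is to check, before pulling a monthly report, that the end of the month is at least two days behind today. This sketch assumes the two-day lag mentioned above; confirm the actual delay against your own data source:

```python
from datetime import date, timedelta

def month_is_ready(month_end, today, lag_days=2):
    """True if the month's data should be fully delivered by `today`,
    given a delivery lag of `lag_days` days after each date's data."""
    return today >= month_end + timedelta(days=lag_days + 1)

# January data pulled on Feb 1 is likely incomplete; Feb 3 is safer.
ready_feb1 = month_is_ready(date(2013, 1, 31), date(2013, 2, 1))
ready_feb3 = month_is_ready(date(2013, 1, 31), date(2013, 2, 3))
```

Scheduling the monthly pull a few days into the new month avoids reporting on a month whose last days haven’t arrived yet.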
Share Of Voice
Few, if any, social listening tools capture every single mention of your brand or keyword, especially on Twitter (you wouldn’t like the price tag if they did). What does this mean? Take the metric “share of voice” as an example. If you aren’t capturing 100% of the conversations that occurred, then “share of voice” represents only the conversations you captured. It is not an accurate measure of the share of total conversations about a particular topic or keyword.
This doesn’t mean that “share of voice” is a metric to avoid. Just be clear about what you are measuring (and presenting).
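To make the caveat concrete, here is a minimal Python sketch with invented mention counts. Note that the denominator is only what the listening tool captured, not the total conversation:

```python
# Hypothetical mention counts captured by a listening tool in one month.
captured = {"our_brand": 320, "competitor_a": 480, "competitor_b": 200}

def share_of_voice(counts, brand):
    """Brand's fraction of the mentions the listening tool captured --
    NOT its fraction of all conversations that actually happened."""
    return counts[brand] / sum(counts.values())

sov = share_of_voice(captured, "our_brand")   # share of *captured* mentions
```

If the tool captures, say, 60% of real conversations, and that capture rate differs across brands or keywords, the reported share can drift well away from the true share – hence the “be clear about what you are measuring” advice.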
Living in Data-Land (the land of plenty) has created an expectation that data will be available, and that’s probably a pretty good bet. But whether that data is accurate or not is an entirely different matter. The best way to avoid bad data is to:
- Be an educated consumer about the data you are using
- Track trends over time closely
This will help you spot changes that don’t make sense and establish an expected range of month-to-month variation.
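One lightweight way to put that advice into practice is to compute an expected range from past months and flag any new value that falls outside it. This is an illustrative sketch with invented numbers; the two-standard-deviation threshold is an assumption you should tune to your own data:

```python
from statistics import mean, stdev

def expected_range(history, k=2.0):
    """Mean +/- k sample standard deviations of the historical values."""
    m, s = mean(history), stdev(history)
    return (m - k * s, m + k * s)

# Hypothetical monthly visit totals from social sources:
monthly_visits = [10200, 9800, 10500, 10100, 9900]
low, high = expected_range(monthly_visits)

new_month = 16000
is_suspicious = not (low <= new_month <= high)   # worth investigating
```

A flagged month isn’t necessarily wrong – it may be a genuine spike – but it is exactly the kind of number to double-check against its source before it goes into a report.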