How to gain insights from a diversity of data sources

March 30, 2023 | Jens Van de Capelle, Senior Digital Marketing Consultant and Thomas Danniau, Solution Lead

Have you ever wondered why the data in your CRM system does not match the data in your analytics solution? Or why Google Analytics reports fewer conversions for your Facebook campaigns than the Meta ads reports do? These are questions we have encountered numerous times.

In this blog post, we will explain why these differences occur and provide tips to get the best insights even if you are missing data in your analytics solution (spoiler alert: you are).

Here is an example of what you might see when comparing a CRM system and two different analytics solutions: 

  • Google Analytics: 800 transactions (80K)
  • CRM system: 1000 transactions (100K)
  • Matomo Analytics: 700 transactions (70K) 
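
To put the gap in perspective, each tool's count can be expressed as a share of the CRM total. The sketch below uses the transaction counts from the list above; the variable and function names are illustrative only.

```python
# Transaction counts from the example above; the CRM is treated as the reference.
crm_transactions = 1000
tool_transactions = {"Google Analytics": 800, "Matomo Analytics": 700}

def coverage(tracked: int, reference: int) -> float:
    """Share of reference (CRM) transactions that a tool recorded."""
    return tracked / reference

for tool, tracked in tool_transactions.items():
    share = coverage(tracked, crm_transactions)
    missing = crm_transactions - tracked
    print(f"{tool}: {share:.0%} of CRM transactions, {missing} missing")
```

In this example Google Analytics records 80% of the CRM transactions and Matomo Analytics 70%.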

This is a pretty big difference and it could be caused by three things: tracking consent, technological differences, and attribution. 

Let's dive in!  

Tracking consent

First, there is the matter of tracking consent: in the EU and EEA, tracking and advertising tools need consent to track orders, while your CRM system (or wherever you process your orders or leads) does not. You need to process and record your orders so that you can fulfil them. So, if some of your users do not give consent for tracking, that already accounts for part of the difference.

In the end, your CRM or order management system is the most complete dataset you have. These are the orders that were registered and paid for. But most of the time you will not know where users came from and what they did on your website before ordering, making it difficult to get insights or draw conclusions.

This is one of the reasons why analytics tools were introduced in the first place: so that we can get better insights to inform our decisions.

Technological differences

Different analytics solutions use different technologies.

For instance, Google Analytics' tracking code is built differently from Matomo's. This means that they collect data in different ways, and on top of that they also process data differently (blocking bots, deduplicating transactions with the same ID or not, ...).
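
As a minimal sketch of how one processing choice changes the totals, the example below counts the same purchase hits with and without deduplication on the transaction ID. The hits and field names are hypothetical, and the code does not represent the actual processing of Google Analytics or Matomo.

```python
# Hypothetical purchase hits as a tool might receive them. "T-1002" was sent twice,
# for example because the confirmation page was reloaded.
purchase_hits = [
    {"transaction_id": "T-1001", "value": 100.0},
    {"transaction_id": "T-1002", "value": 100.0},
    {"transaction_id": "T-1002", "value": 100.0},
    {"transaction_id": "T-1003", "value": 100.0},
]

def count_transactions(hits, deduplicate):
    """Count purchase hits, optionally keeping one hit per transaction ID."""
    if not deduplicate:
        return len(hits)
    return len({hit["transaction_id"] for hit in hits})

raw_count = count_transactions(purchase_hits, deduplicate=False)     # every hit counts
deduped_count = count_transactions(purchase_hits, deduplicate=True)  # one per ID
print(raw_count, deduped_count)
```

Two tools receiving identical hits would report four and three transactions respectively, purely because of this setting.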

Many companies also still work with client-side tracking, which means that the tracking code can easily be blocked by browser plugins. This is part of the reason you will probably see other numbers when comparing client-side solutions to a server-side implementation.

Even if it is the same analytics solution (for example GA4), technologically it is still a different solution if you are using server-side vs. client-side.

Attribution

Another reason you will find differences when comparing tracking solutions is the attribution model that they are using. These different attribution models are often a source of confusion in marketing teams, especially when looking at different traffic sources and comparing the web analytics data to the advertising platform's reports.

Let us use the example of looking at the Google Analytics conversions attributed to your Meta ads campaign vs. the numbers in the Meta ads report. In most cases you will see a lot more conversions in Meta ads than in Google Analytics.

Advertising platforms like to claim as many conversions as they can so that they can "prove" the value of the ads on their platform. So, they use an attribution model that favors them by reporting conversions that happen within a certain time frame after an ad interaction or impression.

It is recommended to analyze this data in acquisition (push) campaigns as well. However, if retargeting audiences are included (and believe me, if you do not explicitly exclude them in Meta ads, they are) then obviously there will be conversions that occur after an impression or ad interaction, as some customers do not make an immediate purchase. 

In such cases, Meta/Google algorithms predict a higher chance of conversion for this user and serve them ads. When these customers eventually convert (even if they were not paying attention to the ad), Meta and Google are more than happy to claim that conversion and attribute it to your running campaign on their platform.  

Data models of Universal Analytics vs Google Analytics 4

Advertising platforms like Meta and Google fail to consider other touchpoints that occur between their platform and the final conversion, whereas Google Analytics analyzes all website traffic and the different traffic sources. That is a very important difference you should be aware of.

In terms of attribution models, Universal Analytics uses a last non-direct click attribution model in its reports (except in the multi-channel funnels reports), while GA4 uses data-driven attribution by default. The latter model does take into account different channels and gives credit to a touchpoint's contribution to a conversion by calculating the probability that a conversion would have occurred with or without this touchpoint in the user journey, based on historical data. 
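
The difference between the two ideas can be illustrated with a toy example. The sketch below applies last non-direct click attribution to a handful of hypothetical journeys and, separately, compares the conversion rate of journeys with and without a given channel. The second function is only a crude illustration of the "with or without this touchpoint" principle; it is an assumption for illustration and not GA4's actual data-driven model. All channel names and journeys are made up.

```python
from collections import Counter

# Hypothetical journeys: ordered channel touchpoints and whether the user converted.
journeys = [
    (["meta_ads", "email", "direct"], True),
    (["google_ads", "email"], True),
    (["meta_ads"], False),
    (["organic", "direct"], True),
    (["google_ads"], False),
    (["organic"], False),
]

def last_non_direct_click(path):
    """Return the last channel that is not 'direct'; fall back to 'direct'."""
    for channel in reversed(path):
        if channel != "direct":
            return channel
    return "direct"

def conversion_rate(outcomes):
    """Share of journeys that converted (0.0 for an empty list)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def with_without_lift(journeys, channel):
    """Conversion rate of journeys containing `channel` minus the rate without it."""
    with_channel = [conv for path, conv in journeys if channel in path]
    without_channel = [conv for path, conv in journeys if channel not in path]
    return conversion_rate(with_channel) - conversion_rate(without_channel)

# Credit per channel under last non-direct click, counting converted journeys only.
last_click_credit = Counter(
    last_non_direct_click(path) for path, converted in journeys if converted
)
print(dict(last_click_credit))
print(with_without_lift(journeys, "email"))
```

In this toy data, last non-direct click gives Meta ads no credit at all, even though it opened one of the converting journeys. That is the kind of blind spot a model that weighs every touchpoint tries to reduce.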

However, Google Analytics does not account for touchpoints that occur outside of the website, such as video views or ad impressions. Therefore, the user journey view is still incomplete. 

Getting more accurate insights

Now that we know why these differences in numbers occur, let us look at how we can get the most accurate insights from this variety of data sources we are working with. 

First, it is important to know how to attribute conversions across different channels. Using GA4's data-driven attribution model is certainly an improvement over relying solely on last click attribution or only analyzing reports from advertising platforms. Nevertheless, as mentioned, this data is still incomplete since it fails to capture touchpoints that happened off the website.

It is important to acknowledge that achieving a complete 100% view of the customer journey before conversion is unlikely due to certain limitations. For example, there are certain touchpoints (print, out of home, ...) that are untraceable and people use several devices/browsers without always being logged in, making it impossible to connect every dot.

Nonetheless, we can improve our view by combining different sources such as web analytics data, Meta ads, Google Ads, DV360, Microsoft ads, and marketing automation tools in a marketing data warehouse.

Based on this combined data set, we can create our own data-driven attribution model that acknowledges conversions connected to known marketing touchpoints. However, what about conversions in our CRM that we cannot trace, for example because users did not give tracking consent? These can sometimes be a relatively big share of your conversions.

Ideally, we can also include these into our campaign evaluation to avoid undervaluing them. We can apply the same conversion and value distribution to attribute these "unknown" conversions based on the insights gained from the "known" dataset.

While this approach is still imperfect, it is better than ignoring 20% of your conversions/revenue. This is also similar to Google Analytics' "Consent Mode", only it happens outside of GA, considering more data sources.
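
The proportional redistribution described above can be sketched in a few lines. The per-channel split below is hypothetical; only the totals (800 tracked versus 1000 in the CRM) come from the earlier example, and the function name is illustrative.

```python
# Hypothetical attributed conversions per channel from the "known" dataset.
# They add up to the 800 tracked transactions from the earlier example.
known_conversions = {"meta_ads": 200, "google_ads": 320, "email": 120, "organic": 160}
crm_total = 1000  # all conversions registered in the CRM

def redistribute_unknown(known, total):
    """Spread untracked conversions over channels in proportion to known credit."""
    known_total = sum(known.values())
    unknown = total - known_total
    return {
        channel: count + unknown * count / known_total
        for channel, count in known.items()
    }

adjusted = redistribute_unknown(known_conversions, crm_total)
print(adjusted)
```

Each channel keeps its relative share, while the adjusted totals add up to the CRM figure. The same function can be applied to revenue instead of conversion counts.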

To further enhance our data quality, we can work on actively improving our "identification rate". The more users can be identified via registration, form submission, or login, the more complete your data insights will become.
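
One possible way to monitor this is sketched below. It assumes the identification rate is defined as the share of sessions that can be tied to a known user; that definition, the session records, and the field names are assumptions for illustration.

```python
# Hypothetical session records; user_id is None when the visitor was not identified.
sessions = [
    {"session_id": "s1", "user_id": "u-17"},
    {"session_id": "s2", "user_id": None},
    {"session_id": "s3", "user_id": "u-42"},
    {"session_id": "s4", "user_id": None},
    {"session_id": "s5", "user_id": "u-17"},
]

def identification_rate(sessions):
    """Share of sessions tied to a known user (assumed definition)."""
    if not sessions:
        return 0.0
    identified = sum(1 for s in sessions if s["user_id"] is not None)
    return identified / len(sessions)

rate = identification_rate(sessions)
print(f"Identification rate: {rate:.0%}")
```

Tracked over time, a figure like this shows whether registrations, form submissions, and logins are actually making more of your traffic identifiable.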

By investing in the identification and understanding of your users, you can reduce your reliance on third parties. This is particularly important because advertising platforms (like Meta & Google ads) are becoming black boxes.

With targeting becoming increasingly AI-driven, understanding cross-platform behavior will become even more challenging if you only rely on data from big tech companies. Shifting from a third-party to a first-party based advertising strategy can help you overcome this obstacle. Therefore, we believe that the "identification rate" should be a KPI that data and marketing teams actively work on.

In conclusion, e-commerce data sources can show different results due to a variety of factors, like tracking consent, technical differences, and attribution models. To get the most accurate insights it is important to take into account different acquisition channels, acknowledge the limitations of each data source, and combine data from multiple sources.

In the long run, we strongly recommend working on improving your "identification rate" to lower your dependence on third-party platforms. This approach will enable you to make better-informed decisions based on a more comprehensive picture of your e-commerce data. Furthermore, you will be able to continue doing so even when Google and Meta decide to give you just a little less insight once again.
