Determining the impact of an advertising campaign is hard.
Determining the impact when you don't have total control over who does and does not see an advertisement is even harder. That is exactly the case when working on Out Of Home (OOH) campaigns.
We're not talking about impressions, audiences, or CPM - all of those are useful and valuable on their own. We are talking about a verifiable impact on your business.
More clicks, more store visits, more purchases.
On Random Selection
Most good studies start with two groups - an exposed group and an unexposed group. This allows a data scientist to compare how the exposed group behaves vs how the unexposed group behaves. In simple terms: if you saw the advertisement, did you act differently? The key to doing this successfully is ensuring that the selection is random.
To understand what we mean by random, let's use an example. Say you've bought a set of billboards in Manhattan. You "expose" the majority of people working or living in Manhattan over a long time period.
You could theoretically (and incorrectly) find an unexposed group by finding a set of people that work or live in New Jersey. This does not work because you have an inherent selection bias; the act of placing ads in a specific place means that you have selected a subset of the population to expose.
In the same way, if you have 100 vehicles driving around the city and have rented 100 billboards, you can't randomly assign people to see the ad. Furthermore, if you also have long-running TV, radio, and online advertising campaigns, how do you separate the value of the campaign you just started from the value of the campaigns that were already running? This is the question of causality.
Building A Baseline Model
The starting point of determining causality is building a baseline model. You want to find data that predicts the outcome you are trying to measure but which is not the outcome you are trying to measure. In our example, we are going to do a case study using store visits.
It turns out that the number of visits to a particular category of stores is fairly consistent over time. We take this information, as well as other custom-built features, to model the number of visits to that store. Once the model is accurate enough, we can say that we have a baseline model.
So what do we do with that baseline model? We use it to simulate what happens in an environment where everything else is held constant. The number of visits to a particular store becomes a mathematical function of total visits to that category of stores, total visits to all stores, and several other proprietary features that we have built using our expertise working with GPS data.
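As a rough illustration of this idea, the sketch below fits a baseline with a single predictor (total visits to the category) using ordinary least squares on pre-campaign days. The real model uses more inputs, including proprietary features, so the one-feature form, the function names `fit_baseline` and `predict`, and all of the numbers here are assumptions made purely for illustration.

```python
from statistics import mean


def fit_baseline(category_visits, store_visits):
    """Least-squares fit of: store_visits = intercept + slope * category_visits."""
    x_bar = mean(category_visits)
    y_bar = mean(store_visits)
    sxx = sum((x - x_bar) ** 2 for x in category_visits)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(category_visits, store_visits))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, category_visits):
    """Expected store visits if nothing but category traffic had changed."""
    intercept, slope = model
    return [intercept + slope * x for x in category_visits]


# Hypothetical pre-campaign days: the store draws 5% of category visits.
pre_category = [1000, 1100, 950, 1200, 1050]
pre_store = [50.0, 55.0, 47.5, 60.0, 52.5]

model = fit_baseline(pre_category, pre_store)

# During the campaign, feed in the observed category visits to get the
# "everything else held constant" expectation for the store.
expected = predict(model, [2000])
```

The point of the design is that the predictor (category traffic) is not itself the outcome being measured, so it can still be observed during the campaign and used to project what store visits would have been without the ads.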
We focus on a metric we call "Percent of Category". Because we are working with population-level statistics, the meaningful metric we look at is best described as the percentage of visits to your store(s), vs all visits to stores in your category. A category could be Computer Hardware, Grocery, or Cafe. If your "Percent of Category" is 5%, that means that 5% of all store visits in your category are to your stores. Stores here could be one store or many stores.
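The metric itself is simple arithmetic. A minimal sketch follows; the function name `percent_of_category` and the visit counts are hypothetical.

```python
def percent_of_category(store_visits, category_visits):
    """Share of all category visits that went to your store(s), in percent."""
    if category_visits <= 0:
        raise ValueError("category_visits must be positive")
    return 100.0 * store_visits / category_visits


# Hypothetical day: 50 visits to your stores out of 1,000 in the category.
share = percent_of_category(50, 1000)
```

Here `share` is 5.0, matching the 5% example above. `store_visits` can be the total across one store or many.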
Even better, we have already indexed the vast majority of companies, and they are ready to go in our system. Little effort is required from the client for us to directly measure business impact.
Calculating Daily Boosts in Visitation
The next step is to determine two things.
- Does the model perform accurately before my campaign starts?
- What are the point effects (i.e. how much is the visitation rate increasing on each day of my campaign)?
Once we have this in hand, we can decide whether to go back and improve our model or move to the next steps. Our model is already properly tuned, and we can see that in the results. Before the campaign starts, the model's prediction is above the ground truth just as often as it is below. In fact, the difference nets out over the pre-campaign period to almost 0. We can easily see the behavior change after the start of the campaign, with a boost to store visits on the vast majority of days.
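The two checks described above can be sketched as follows. Point effects are taken here as observed minus predicted for each day, which is one common definition and an assumption on our part; the function name and all values are hypothetical, expressed as Percent of Category.

```python
from statistics import mean


def point_effects(actual, predicted):
    """Daily difference between the observed value and the baseline prediction."""
    return [a - p for a, p in zip(actual, predicted)]


# Hypothetical "Percent of Category" values.
pre_actual = [5.1, 4.9, 5.0, 5.2, 4.8]
pre_predicted = [5.0, 5.0, 5.0, 5.0, 5.0]
campaign_actual = [5.4, 5.3, 5.5, 4.9, 5.6]
campaign_predicted = [5.0, 5.0, 5.0, 5.0, 5.0]

pre_effects = point_effects(pre_actual, pre_predicted)
campaign_effects = point_effects(campaign_actual, campaign_predicted)

# Check 1: before the campaign the errors should net out to roughly zero.
pre_bias = mean(pre_effects)

# Check 2: during the campaign, count the days with a positive effect.
days_above = sum(1 for e in campaign_effects if e > 0)
```

If `pre_bias` is far from zero, the baseline is systematically over- or under-predicting and should be improved before any campaign effect is read from it.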
Accumulating Performance Gains
Each day we accumulate the gains in visitation rate from the 250,000-person population in this Atlanta study. Keep in mind that we are looking at billions of detailed GPS data points and tens of millions of visits. Our sample can never include the entire population, but it covers a large portion of the U.S.
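Accumulating the gains amounts to a running sum of the daily point effects. A minimal sketch, with hypothetical daily values:

```python
from itertools import accumulate

# Hypothetical daily point effects (observed minus predicted) during a campaign.
daily_effects = [0.4, 0.3, 0.5, -0.1, 0.6]

# Running total: entry i is the gain accumulated from day 0 through day i.
cumulative_gain = list(accumulate(daily_effects))

total_gain = cumulative_gain[-1]
```

A single bad day (the fourth value here) dips the curve slightly, but the running total makes the overall trend easier to read than the noisy daily values.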
Making Sense of The Data
Hopefully, you now have a good idea of what we are doing, but we haven't yet given a real hard determination of the success of the campaign.
In this specific campaign, we boosted store visits by +6.8% with a 95% confidence interval of [3.2%, 10.5%]. The probability of obtaining such a result by chance is very small (in statistical terms less than 0.01%), so we can conclude that the advertising campaign was indeed successful.
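To show how a lift figure and an interval of this kind can be derived from observed and predicted values, here is a minimal sketch. It uses a percentile bootstrap over days, which is a stand-in technique chosen for simplicity and not necessarily the method behind the numbers above; it also ignores day-to-day correlation. The function names and data are hypothetical.

```python
import random


def relative_lift(actual, predicted):
    """Percent increase of observed visits over the baseline prediction."""
    return 100.0 * (sum(actual) - sum(predicted)) / sum(predicted)


def bootstrap_interval(actual, predicted, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the lift, resampling campaign days."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(actual)
    lifts = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        lifts.append(relative_lift([actual[i] for i in idx],
                                   [predicted[i] for i in idx]))
    lifts.sort()
    lower = lifts[int((alpha / 2) * n_resamples)]
    upper = lifts[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical campaign days: observed visits vs. baseline prediction.
actual = [104, 98, 110, 107, 101, 112, 95, 109]
predicted = [100] * 8

lift = relative_lift(actual, predicted)
lower, upper = bootstrap_interval(actual, predicted)
```

An interval that sits entirely above zero is what supports the claim that the boost is unlikely to be due to chance.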
There is another step we'd like to take and attribute this to actual dollars, but we don't have that data from our client for this campaign.
If you are interested in learning how far your ad-spend goes, please get in touch. Calculating ROI on advertising is difficult but not impossible, and we'd love to work with you to help you understand what you are getting back.
If you would like to learn more, please get in touch!
Jesse Moore - Head of Data Science - firstname.lastname@example.org