Marketing Mix Modeling is Here To Stay
Marketing Mix Modeling (MMM), a statistical modeling technique that helps companies estimate the impact of different marketing tactics on sales and optimize marketing budgets and mix allocation, has been around for over 70 years.
Over the years, it has become quite popular with marketing departments of Fortune 1000 companies in the Consumer Products, Retail, and Pharmaceuticals industries, partly due to the availability of robust point-of-sale and market-level syndicated data in these industries. Consulting and information services firms have made substantial profits carrying out MMM projects.
While there's been a gross imbalance in client fees vs. effort or difficulty, the MMM analytical exercise can be compelling and impactful – despite its relative simplicity. For most companies, advertising and promotions are leading drivers of operating expenses, and having a good sense of what you're getting in return for your marketing spending is crucial.
The death of (3rd party) digital IDs
As recently as a couple of years ago, Marketing Effectiveness and Optimization were all about consumer-level deterministic attribution, with many in the industry preaching about the demise of Marketing Mix Modeling. Fast forward two years, and thanks to GDPR, CCPA, and the increased policy sentiment about consumer privacy, 3rd party digital IDs are going away, and all major tech firms will have phased them out by 2023.
Today it's all about consumer choice and customer experience. Marketers will still be able to leverage zero-party and first-party data given consumer consent within their walled-off environments. Multi-touch attribution (MTA) and consumer-level marketing effectiveness mechanisms will be significantly more limited than in years past.
Enter our old friend, MMM. Unlike MTA, Marketing Mix Modeling is based on aggregated point-of-sale and syndicated data rather than consumer-level data. With MMM, we are typically interested in the effects of our marketing mediums (Facebook, TV, Radio, digital advertising) at the market, region, or national level; thus, we do not need consumer-level data.
Modeling techniques: part science, part art
It's vital to remember that MMM is not the ideal technique for things like sales or revenue prediction. There are more sophisticated statistical techniques for demand forecasting that still allow for model explainability if needed (e.g., when your stakeholders are executives).
Typical models: bread and butter statistical regression
In the age of AI, most of us dare not talk about multivariate regression, but that's what 90% of Marketing Mix Modeling cases are. Sure, there are some variable transformations, and you need to control for things like multi-collinearity, advertising decays, lag effects, and advertising effect saturation. But the underlying modeling technique is simple, elegant, easy to understand, and most importantly, actionable.
Pay attention to adstock, saturation, and multi-collinearity
Most of us can recall an advertisement long after we've seen it. It's almost May, and I'm still influenced by Super Bowl weekend to switch from DoorDash to Uber Eats due to that funny commercial. This carryover effect of advertising from one period to the next, which decays over time, is best modeled using an adstock transformation of the advertising variables (TV or Radio GRPs, Facebook impressions, magazine subscribers, etc.).
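As a minimal sketch of the idea, the simplest adstock form is a geometric decay, where each week keeps a fixed fraction of the previous week's accumulated effect. The function name, the decay value of 0.5, and the GRP numbers below are illustrative assumptions, not figures from a real campaign; in practice the decay rate is a parameter you tune per medium.

```python
def geometric_adstock(exposure, decay):
    """Return adstocked values: adstock[t] = exposure[t] + decay * adstock[t-1]."""
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    adstocked = []
    carryover = 0.0
    for value in exposure:
        # This week's effect is this week's exposure plus the decayed carryover.
        carryover = value + decay * carryover
        adstocked.append(carryover)
    return adstocked


# Hypothetical example: a single burst of 100 GRPs followed by three dark weeks.
weekly_grps = [100.0, 0.0, 0.0, 0.0]
adstocked_grps = geometric_adstock(weekly_grps, decay=0.5)
print(adstocked_grps)  # [100.0, 50.0, 25.0, 12.5]
```

The one-week burst keeps contributing in later weeks, at half the strength each time, which is the carryover behavior described above.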
Advertising saturation is another vital part of MMM and reflects the tendency of advertising dollars to yield diminishing returns after a while. As an example, as we initially pump more money into TV advertising, the impact on incremental sales accelerates. After a certain level of spending, however, it reaches a saturation point, and the incremental effect of additional TV advertising begins to decline (sales still grow, but at a diminishing rate). You can capture these diminishing impacts in your model via advertising variable transformations that mimic these threshold and saturation phenomena (usually done through a suitable S-curve transform).
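One common choice of S-curve is the Hill function, sketched below under stated assumptions: the function name, the half-saturation point of 100, and the shape value of 2 are hypothetical and would normally be tuned alongside the rest of the model. With a shape above 1 the curve is S-shaped; with a shape of 1 or below it is concave from the start.

```python
def hill_saturation(exposure, half_saturation, shape):
    """Map exposure to a response in [0, 1).

    half_saturation is the exposure at which the response reaches 0.5;
    shape controls how pronounced the S-curve is.
    """
    powered = exposure ** shape
    return powered / (powered + half_saturation ** shape)


# Hypothetical weekly spend levels (same units as half_saturation).
spend_levels = [0.0, 50.0, 100.0, 200.0, 400.0]
response = [
    hill_saturation(s, half_saturation=100.0, shape=2.0) for s in spend_levels
]
for spend, value in zip(spend_levels, response):
    print(f"spend={spend:6.1f}  response={value:.3f}")
```

In this sketch the response per extra unit of spend rises between 0 and 100, then falls: doubling spend from 200 to 400 adds far less than doubling from 50 to 100.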
Multi-collinearity is public enemy #1 in multivariate regression, particularly in the Sales, Marketing, or Pricing domains. Independent variables are often highly correlated, which makes coefficient estimates unstable and encourages over-fitting: competitive pricing correlated with our pricing, competitor sales with our distribution, promotional activity with seasonality, and so forth. We need a suitable mechanism to control our variables' multi-collinearity, and regularization techniques provide one. Lasso, Ridge, and ElasticNet regression are the most popular solutions. Out of these three options, I recommend ElasticNet, which combines the modeling benefits of Lasso (shrinking some coefficients exactly to zero, which eliminates those predictors) and Ridge (shrinking coefficients toward zero, but never exactly to zero).
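To make the Lasso and Ridge parts concrete, here is a minimal pure-Python sketch of ElasticNet fitted by coordinate descent. It is an illustration under stated assumptions, not a production implementation: the toy data, the variable names, and the alpha and l1_ratio values are hypothetical, the columns are assumed to be already centered and scaled, and there is no intercept. In practice you would more likely use a library implementation such as scikit-learn's ElasticNet, whose alpha and l1_ratio parameterization this sketch mirrors.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; the source of Lasso's exact zeros."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, sweeps=500):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    l1 = alpha * l1_ratio
    l2 = alpha * (1.0 - l1_ratio)
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            # Covariance of column j with the residual that excludes column j.
            rho = sum(v * (r + v * w[j]) for v, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, l1) / (z + l2)
            # Keep the residual in sync with the updated coefficient.
            residual = [r - v * (new_w - w[j]) for v, r in zip(col, residual)]
            w[j] = new_w
    return w


# Hypothetical toy data: two perfectly collinear predictors and one weak one.
our_price = [-1.0, -1.0, 1.0, 1.0]
competitor_price = list(our_price)      # moves in lockstep with our price
weak_signal = [-1.0, 1.0, -1.0, 1.0]    # barely related to sales
X = [list(row) for row in zip(our_price, competitor_price, weak_signal)]
y = [2.0 * a + 0.01 * c for a, c in zip(our_price, weak_signal)]

weights = elastic_net(X, y, alpha=0.1, l1_ratio=0.5)
print(weights)
```

On this toy data the Ridge part splits the effect evenly between the two collinear price variables instead of assigning it arbitrarily to one of them, while the Lasso part sets the weak predictor's coefficient to exactly zero.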
Important to harmonize critical internal and external data
You are trying to model the impact of marketing mediums on sales dollars, units, or gross profit $, and you can quickly get them from your point-of-sale data. For the independent variables (i.e., the metrics that will influence your sales), the following exposure metrics get good, actionable results. Some of these will come from your advertising agencies, syndicated data providers like Nielsen/IRI/NPD, or public sources.
Media activity: impressions for digital advertising, Gross Rating Points for TV and Radio, and subscribers or number of readers/viewers for all other media. If you don't have available exposure metrics, you can use marketing spending as a proxy, although your model will be less accurate.
Promotional activity: price discounts, feature and display advertising, and other in-store shopper marketing.
Seasonality: in addition to decomposing your dependent variable into key seasonal components (e.g., trend, weekly or monthly seasonality), I recommend including key holiday variables.
Competitive activity: competitor sales, key promotions, price discounts, competitive advertising.
Other: any other metrics that have a substantial impact on sales, such as distribution points, macroeconomic factors like unemployment rates and inflation, weather (usually expressed as a deviation from historical averages), etc.
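As a rough sketch of how the seasonality and holiday variables in the list above might be built for weekly data, the snippet below flags weeks that contain a holiday and adds a simple annual sine/cosine pair. The holiday list, the function name, and the feature names are hypothetical; a real model would use the holidays that matter for your category and market.

```python
import math
from datetime import date, timedelta

# Hypothetical holiday list; replace with the dates relevant to your business.
HOLIDAYS = {date(2023, 7, 4), date(2023, 12, 25)}


def week_features(week_start):
    """Build seasonality features for the 7-day week beginning on week_start."""
    days = [week_start + timedelta(days=i) for i in range(7)]
    # Position within the year, expressed as an angle for smooth annual cycles.
    angle = 2.0 * math.pi * week_start.timetuple().tm_yday / 365.25
    return {
        "holiday_week": int(any(day in HOLIDAYS for day in days)),
        "season_sin": math.sin(angle),
        "season_cos": math.cos(angle),
    }


july_4_week = week_features(date(2023, 7, 3))   # Monday of the July 4 week
ordinary_week = week_features(date(2023, 7, 10))
print(july_4_week)
print(ordinary_week)
```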
For Marketing Mix Modeling, I generally recommend weekly aggregated data, harmonized across the different data sources (e.g., POS, syndicated, agency data, public sources) for at least two years – thus, at least 104 weeks of data (but 156 weeks is ideal). Given geographic preferences and localized advertising, try to get as granular as possible with your modeling context and collect and aggregate your data at the market level (e.g., city, MSA, sub-region). Once you go too broad, such as region (e.g., 10+ states grouped) or national, your model may be too biased, and the results and ensuing marketing optimization recommendations too shaky.
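A minimal sketch of the harmonization step, assuming each source arrives as daily (market, date, value) records: roll every source up to the same market and week key, then join on that key. The record layout, the market name, and the numbers are hypothetical, and the choice of Monday as the week start is an assumption you should align with your syndicated data provider's calendar.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the week containing day."""
    return day - timedelta(days=day.weekday())


def aggregate_weekly(records):
    """Sum (market, day, value) records into {(market, week_start): total}."""
    totals = defaultdict(float)
    for market, day, value in records:
        totals[(market, week_start(day))] += value
    return dict(totals)


# Hypothetical daily POS sales and agency-reported impressions.
daily_sales = [
    ("Chicago", date(2023, 1, 2), 100.0),   # Monday
    ("Chicago", date(2023, 1, 5), 150.0),   # Thursday, same week
    ("Chicago", date(2023, 1, 9), 120.0),   # following Monday
]
daily_impressions = [
    ("Chicago", date(2023, 1, 3), 5000.0),
    ("Chicago", date(2023, 1, 4), 7000.0),
]

weekly_sales = aggregate_weekly(daily_sales)
weekly_impressions = aggregate_weekly(daily_impressions)

# Join on (market, week); weeks with no media activity get 0 impressions.
panel = {
    key: (weekly_sales[key], weekly_impressions.get(key, 0.0))
    for key in sorted(weekly_sales)
}
for (market, week), (sales, impressions) in panel.items():
    print(market, week.isoformat(), sales, impressions)
```

Keeping the market in the key preserves the market-level granularity recommended above; aggregating further to region or national level would only require dropping or remapping that part of the key.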
Limitations
There are two significant drawbacks to Marketing Mix Modeling.
For one, MMM does a poor job of measuring the impact on long-term brand equity. Therefore, we should not make marketing investment decisions based on MMM results alone.
The second disadvantage is that MMM favors time-specific marketing mediums like TV, Radio, or digital advertising over non-time-specific ads, so effects tend to be overstated for a medium like TV and understated for something like an advertisement in an airline magazine.
The democratization of analytics opens doors
Historically, you could only get decent Marketing Effectiveness & Optimization studies done if you were ready to wait 12-16 weeks and shell out at least $100K. The advent of open-source software, the democratization of advanced analytics techniques, and data science research investments by tech giants have made sophisticated analytics solutions accessible to the masses.
While most small to mid-size companies (including many advertising agencies) still lack the human or tech capital to do things like Marketing Mix Modeling, the cost and effort to perform actionable MMM in-house have decreased significantly.
Meta's Robyn project is a solid open-source toolkit I encourage everyone to check out. Robyn is a semi-automated MMM package that can handle model building, including hyperparameter tuning, descriptive analytics (e.g., the share of ad spend vs. impact), and marketing spend optimization. It is currently available for R, and the team is working on a Python wrapper. I've tried it and have been delighted with the results.
With careful feature engineering and proper adstock and saturation transformations, it's a good solution for most mid-size firms that want to optimize their marketing spend and get quick wins.
For more solid MMM content (more for practitioners and data scientists, but very practical nonetheless), check out Dr. Robert Kubler on Medium.