Chapter 5: Market Selection and Synthetic Control Methods

The foundation of reliable Matched Market Testing lies in selecting appropriate test and control markets, then creating accurate synthetic controls. This technical deep-dive covers the statistical methods and quality assurance processes that ensure robust experimental design.
Strategic Market Selection Approach
The first step in our MMT process involves carefully selecting test and control markets to ensure reliable measurement. We select medium-to-large-sized test markets that are representative of the broader market while excluding major metropolitan areas like New York, Los Angeles, and Chicago to avoid disrupting significant revenue streams.
Our market selection process combines statistical rigor with practical business considerations.
Statistical Correlation Analysis
Core Methodology
We analyze historical time-series data for key performance indicators (KPIs) such as sales or conversions across all potential markets. Using the Pearson correlation coefficient, we quantify relationships between markets over a pre-test period. We require a positive correlation of at least 0.5 between the test market and each potential control market, and prefer values above 0.7.
This ensures control markets have historically trended similarly to test markets, creating a reliable comparative baseline.
Correlation Requirements
Main KPI Correlation: At least 0.5 between test and control markets is required; 0.7 or higher is preferred for highest confidence
Correlation Stability: Consistent correlation across multiple time windows
Log-Correlation: Often examined to account for size differences between markets
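The stability and log-correlation checks can be illustrated with a short base R sketch. The data is simulated, and the market sizes, the 60-day window length, and all variable names are assumptions made for illustration, not part of our production process.

```r
# Simulated daily conversions for two hypothetical markets (illustrative data only)
set.seed(42)
n_days <- 180
trend <- 100 + cumsum(rnorm(n_days))              # shared underlying trend
test_kpi <- 5 * trend + rnorm(n_days, sd = 5)     # larger market
control_kpi <- 1 * trend + rnorm(n_days, sd = 2)  # smaller market

# Overall Pearson correlation on the raw and log scales
raw_cor <- cor(test_kpi, control_kpi, method = "pearson")
log_cor <- cor(log(test_kpi), log(control_kpi), method = "pearson")

# Correlation within consecutive 60-day windows to check stability
window_id <- ceiling(seq_len(n_days) / 60)
window_cors <- sapply(split(seq_len(n_days), window_id), function(idx) {
  cor(test_kpi[idx], control_kpi[idx], method = "pearson")
})

# A candidate control passes only if every window clears the 0.5 minimum
passes_minimum <- all(window_cors >= 0.5)

print(round(c(raw = raw_cor, log = log_cor), 3))
print(round(window_cors, 3))
print(passes_minimum)
```

A control market that clears the threshold over the full pre-test period but fails in one window is a sign that the relationship is driven by a single stretch of the history and may not hold during the test.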
Technical Implementation
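# Example correlation analysis using R
library(tidyverse)
# Assumes historical_data has columns date, market, and conversions,
# and that test_market holds the test market's name as a string.
# Reshape to one column per market, then calculate pairwise Pearson correlations
market_correlations <- historical_data %>%
  select(date, market, conversions) %>%
  pivot_wider(names_from = market, values_from = conversions) %>%
  select(-date) %>%
  cor(use = "complete.obs", method = "pearson")
# Keep markets with >0.7 correlation to the test market, excluding the test market itself
test_row <- market_correlations[test_market, ]
suitable_controls <- names(test_row)[test_row > 0.7 & names(test_row) != test_market]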
Comprehensive Matching Criteria
Beyond statistical correlation, we evaluate markets across multiple dimensions:
Demographic Alignment
Population characteristics, including age, income, and education levels, should differ by no more than 10% between test and control markets.
Economic Conditions
- Similar unemployment rates
- Comparable cost of living indices
- Consistent economic growth patterns
- Regional economic stability
Market Size Considerations
- Control markets should collectively represent similar volume characteristics to test markets
- Test Market Coverage: Typically 10-15% of total business volume for optimal balance
- Minimum Volume Thresholds: At least 10 conversions per day or $500 daily revenue per test market
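The volume and coverage rules above can be expressed as a simple screen. This is a minimal sketch in base R; the market names, volumes, and the business-wide total are invented for illustration.

```r
# Hypothetical market summary; all figures are made up for illustration
markets <- data.frame(
  market = c("A", "B", "C", "D"),
  daily_conversions = c(25, 8, 40, 12),
  daily_revenue = c(1200, 300, 2100, 650),
  stringsAsFactors = FALSE
)
total_daily_conversions <- 600  # assumed business-wide daily total

# A market is eligible if it meets either minimum volume threshold
markets$meets_volume <- markets$daily_conversions >= 10 |
  markets$daily_revenue >= 500

# Share of total volume covered by a proposed set of test markets
proposed_test <- c("A", "C")
coverage <- sum(markets$daily_conversions[markets$market %in% proposed_test]) /
  total_daily_conversions

# Target range from the guideline above: 10-15% of total business volume
coverage_ok <- coverage >= 0.10 && coverage <= 0.15

print(markets)
print(round(coverage, 3))
print(coverage_ok)
```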
Competitive Landscape
- Similar competitive presence across markets
- Comparable promotional activity levels
- Consistent market maturity stages
- Similar brand awareness levels
Geographic Independence
Sufficient separation to prevent spillover effects between test and control regions, ensuring true isolation of marketing effects.
Quality Thresholds and Exclusions
Data Quality Requirements
Historical Data: 12-24 months of stable historical performance data required
Completeness: <5% missing data points across measurement period
Consistency: Stable reporting methodologies across test and control markets
Outlier Detection: Identification and treatment of anomalous data points that could skew results
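The completeness and outlier checks might look like the following base R sketch. The series is simulated. The outlier rule (distance from the median in units of the median absolute deviation) and its cutoff of 5 are assumptions chosen for illustration; the text above does not prescribe a specific detection method.

```r
# Simulated daily series with gaps and one anomaly (illustrative data only)
set.seed(1)
kpi <- round(rnorm(365, mean = 100, sd = 10))
kpi[c(10, 50, 200)] <- NA   # three missing days
kpi[120] <- 400             # an injected anomalous spike

# Completeness: the share of missing observations must stay below 5%
missing_share <- mean(is.na(kpi))
complete_enough <- missing_share < 0.05

# Outlier flag: distance from the median in units of the scaled MAD.
# The cutoff of 5 is an arbitrary assumption, not a threshold from the text.
center <- median(kpi, na.rm = TRUE)
spread <- mad(kpi, na.rm = TRUE)
is_outlier <- !is.na(kpi) & abs(kpi - center) / spread > 5

print(round(missing_share, 4))
print(which(is_outlier))
```

Flagged points still need a treatment decision (for example removal, capping, or imputation), which depends on whether the anomaly reflects a reporting error or a real event.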
Market Exclusions
Reset Periods: 90-day waiting period for markets previously used in testing
Operational Feasibility: Markets must allow precise geographic targeting within advertising platforms
Major Metro Exclusions: Avoid NYC, LA, Chicago to protect significant revenue streams
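As an illustration of the reset-period rule, the sketch below filters candidate markets by the number of days since their last test ended. The market names, dates, and column names are hypothetical.

```r
# Hypothetical candidate list; NA means the market has never been tested
candidates <- data.frame(
  market = c("A", "B", "C"),
  last_test_end = as.Date(c("2024-01-15", NA, "2024-05-20")),
  stringsAsFactors = FALSE
)
planned_start <- as.Date("2024-06-30")  # assumed start date of the new test

# Days elapsed between the end of the previous test and the planned start
days_since <- as.numeric(planned_start - candidates$last_test_end)

# Eligible if never tested, or if the 90-day reset period has elapsed
candidates$eligible <- is.na(days_since) | days_since >= 90

print(candidates)
```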
Creating the Counterfactual: Synthetic Controls
Advanced Synthetic Control Methodology
Rather than using simple averages of control markets, we employ synthetic control methods to create more accurate counterfactuals. This approach recognizes that some control markets track the test market more closely than others, so they should not all contribute equally to the baseline.
The Optimization Process
We solve an optimization problem that finds the control-market weights minimizing the Root Mean Squared Prediction Error (RMSPE) between the test market and the weighted control group over the pre-intervention period. In the standard synthetic control formulation, the weights are non-negative and sum to one.
Example: Instead of equally weighting three control markets (33% each), our algorithm might determine that 60% Market A + 30% Market B + 10% Market C creates the most accurate replica of the test market’s historical patterns.
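To make the optimization concrete, here is a minimal base R sketch that estimates non-negative, sum-to-one weights by minimizing pre-period RMSPE. It is not how tidysynth solves the problem internally: the softmax reparameterization, the use of `optim`, and the simulated data are all assumptions chosen to keep the example self-contained.

```r
# Simulated pre-period KPI for three hypothetical control markets
set.seed(7)
n_pre <- 120
controls <- cbind(
  A = 100 + cumsum(rnorm(n_pre)),
  B = 80 + cumsum(rnorm(n_pre)),
  C = 60 + cumsum(rnorm(n_pre))
)
# Test market built from known weights plus noise, so the fit can be sanity-checked
test_kpi <- as.vector(controls %*% c(0.6, 0.3, 0.1)) + rnorm(n_pre, sd = 0.5)

# Pre-period RMSPE for a given weight vector
rmspe <- function(w) sqrt(mean((test_kpi - as.vector(controls %*% w))^2))

# Softmax maps unconstrained parameters to weights that are >= 0 and sum to 1
to_weights <- function(theta) exp(theta) / sum(exp(theta))

# Start from equal weights (theta = 0) and minimize RMSPE
fit <- optim(par = rep(0, ncol(controls)),
             fn = function(theta) rmspe(to_weights(theta)),
             method = "BFGS")
weights <- setNames(to_weights(fit$par), colnames(controls))

print(round(weights, 2))
print(c(equal = rmspe(rep(1 / 3, 3)), optimized = rmspe(weights)))
```

Comparing the two printed RMSPE values shows how much the optimized weighting improves on equal weighting for this particular simulated history.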
Technical Implementation with tidysynth
Using the open-source tidysynth R package, we create weighted combinations that produce superior “business-as-usual” baselines.
library(tidysynth)
# Create synthetic control.
# Assumes pre_period is the vector of pre-intervention time points and
# intervention_date is the first time point of the test.
synthetic_control <- historical_data %>%
  synthetic_control(outcome = conversions,
                    unit = market,
                    time = date,
                    i_unit = "test_market",
                    i_time = intervention_date) %>%
  generate_predictor(time_window = pre_period,
                     conversions = mean(conversions, na.rm = TRUE)) %>%
  generate_weights(optimization_window = pre_period) %>%
  generate_control()
This methodology accounts for nuanced relationships between markets and provides more accurate counterfactuals for what would have happened in test markets without intervention.
Quality Assurance Metrics
Synthetic Control Quality Scores
Causal Impact (CI) Score: <0.7 indicates reliable predictive capability
Pre-Test RMSPE: Lower values indicate better synthetic control fit
Weight Distribution: Balanced allocation across multiple control markets preferred to avoid over-reliance on any single control
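Two of these scores are straightforward to compute by hand. The sketch below uses made-up weights and pre-period values; the relative RMSPE and the "effective number of controls" (the inverse of the sum of squared weights) are assumed diagnostics added for illustration, not metrics defined above.

```r
# Hypothetical weights and pre-period series (illustrative values only)
weights <- c(A = 0.60, B = 0.30, C = 0.10)
test_kpi <- c(102, 98, 105, 110, 107)
synthetic_kpi <- c(100, 99, 104, 111, 109)

# Pre-test RMSPE, also expressed relative to the mean KPI level
pre_rmspe <- sqrt(mean((test_kpi - synthetic_kpi)^2))
relative_rmspe <- pre_rmspe / mean(test_kpi)

# Weight concentration: the largest single weight and the effective number
# of controls (inverse Herfindahl index). Both are assumed diagnostics.
max_weight <- max(weights)
effective_controls <- 1 / sum(weights^2)

print(round(c(rmspe = pre_rmspe, relative = relative_rmspe), 4))
print(round(c(max_weight = max_weight, effective = effective_controls), 2))
```

An effective number of controls close to 1 means the synthetic control is essentially a single market, which is the over-reliance the weight distribution check is meant to catch.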
Market Size and Power Metrics
Control Pool Size: Minimum 10-15 potential control markets for robust synthetic control creation
Geographic Distribution: Ensure test markets don’t over-represent specific regions
Volume Balance: Control markets should collectively match test market characteristics
Pre-Test Validation
Pre-Test Fit: Measures how closely the synthetic control matches test market historical trends
Weight Distribution: Ensures balanced allocation across multiple markets
Stability Testing: Validates consistent weights across different time periods and calculation methods
Adaptive Recalibration: Allows weight adjustments if market dynamics change significantly between test design and execution
Synthetic Control Advantages
This synthetic control approach transforms multiple imperfect control markets into a single, highly accurate counterfactual that captures the complex dynamics influencing test market performance.
Benefits Over Simple Averaging
- Higher Accuracy: Optimized weights create better historical fit
- Reduced Variance: More stable counterfactual predictions
- Flexibility: Adapts to unique market characteristics
- Transparency: Clear mathematical foundation for market weighting
Quality Validation Process
- Historical Fit Assessment: Evaluate how well synthetic control replicates test market pre-period
- Weight Reasonableness: Ensure no single market dominates synthetic control
- Stability Testing: Validate consistent performance across different time windows
- Placebo Testing: Apply methodology to non-test periods to validate approach
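Placebo testing can be sketched as follows: weights are estimated on an early part of the pre-period, then the method is applied to a later window in which no intervention took place. The data is simulated, the two-market setup and the day-101 cutoff are assumptions, and the one-dimensional `optimize` call is a simplification that only works with two controls.

```r
# Simulated pre-period data with no intervention anywhere (illustrative only)
set.seed(11)
n_days <- 150
controls <- cbind(A = 100 + cumsum(rnorm(n_days)),
                  B = 80 + cumsum(rnorm(n_days)))
test_kpi <- as.vector(controls %*% c(0.7, 0.3)) + rnorm(n_days, sd = 0.5)

fit_idx <- 1:100        # window used to estimate weights
placebo_idx <- 101:150  # pretend the test started on day 101

# Estimate a single weight w for market A (B gets 1 - w), constrained to [0, 1]
loss <- function(w) {
  pred <- as.vector(controls[fit_idx, ] %*% c(w, 1 - w))
  sqrt(mean((test_kpi[fit_idx] - pred)^2))
}
w <- optimize(loss, interval = c(0, 1))$minimum
weights <- c(A = w, B = 1 - w)

# Placebo "lift": should be near zero because nothing actually changed
synthetic <- as.vector(controls[placebo_idx, ] %*% weights)
placebo_lift <- mean(test_kpi[placebo_idx] - synthetic) / mean(synthetic)

print(round(weights, 3))
print(round(placebo_lift, 4))
```

A placebo lift that is far from zero suggests the synthetic control would also report spurious effects during a real test.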
Implementation Best Practices
Market Selection Workflow
- Historical Data Collection: Gather 12-24 months of market-level performance data
- Correlation Analysis: Calculate Pearson correlations between all market pairs
- Multi-Criteria Filtering: Apply demographic, economic, and operational filters
- Synthetic Control Creation: Use tidysynth to optimize control market weights
- Quality Validation: Assess synthetic control fit and stability metrics
- Final Selection: Choose test and control markets meeting all quality thresholds
Tools and Resources
Primary Tool: tidysynth R package for synthetic control implementation
GitHub Repository: edunford/tidysynth
Additional Resources: CausalImpact R package for post-test analysis validation
This rigorous market selection and synthetic control process ensures that MMT results provide reliable, actionable insights for marketing optimization and scaling decisions.
Next Steps: Learn about Platform-Specific MMT Implementation