Glossary
Backtest Overfitting
Selection of strategy parameters that fit historical noise rather than persistent edge. The single biggest reason backtests outperform live trading.
Sentivue Capital · 5 min read
Backtest overfitting is the selection of strategy parameters or rules that fit historical noise rather than a persistent market edge. It is the dominant reason that retail and even institutional backtests fail to replicate in live trading.
How it happens
Overfitting compounds through three vectors:
- Parameter search. Trying many parameter combinations and picking the best. With enough trials, something always looks great in-sample.
- Rule iteration. Adding entry/exit conditions until the equity curve smooths out. Each added rule is another implicit parameter.
- Data dredging. Running the same backtest on many instruments, time ranges, or market regimes and reporting only what worked.
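The probability of finding a high-Sharpe rule by chance grows with the number of trials. Bailey & López de Prado (2014) address this with the "deflated Sharpe ratio," which adjusts an observed Sharpe ratio for the number of trials attempted and for non-normal returns.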
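The effect of trial count can be illustrated with a small simulation. The sketch below is not the deflated Sharpe ratio itself; it only generates strategies with zero true edge and reports the best in-sample Sharpe among them. The function name `best_sharpe_from_noise`, the 1% daily volatility, and the 252-day year are illustrative assumptions.

```python
import numpy as np

def best_sharpe_from_noise(n_trials, n_days=252, seed=0):
    """Best annualized Sharpe among n_trials strategies with no real edge."""
    rng = np.random.default_rng(seed)
    # Assumption: daily returns are pure noise, mean 0, 1% daily volatility.
    returns = rng.normal(0.0, 0.01, size=(n_trials, n_days))
    # Annualized Sharpe per strategy (risk-free rate taken as zero).
    sharpe = returns.mean(axis=1) / returns.std(axis=1, ddof=1) * np.sqrt(252)
    return float(sharpe.max())

# The best "strategy" tends to look better as more are tried,
# even though none of them has any edge.
for n in (1, 10, 100, 1000):
    print(n, round(best_sharpe_from_noise(n), 2))
```

Every series here is noise by construction, so any high Sharpe in the output is purely a product of selection.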
Detection
- Walk-forward performance (for example, total return or Sharpe) more than 50% below in-sample → likely overfit.
- Sensitivity test: perturb each parameter ±20%. Robust strategies degrade smoothly. Overfit strategies fall off a cliff.
- Out-of-sample Sharpe << in-sample Sharpe. Standard tell.
- Strategy uses more conditions than the claimed economic rationale can justify.
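The ±20% sensitivity test from the list above can be sketched as follows. This is a minimal illustration, assuming a user-supplied `score_fn` that maps a parameter dict to a performance score such as Sharpe; `sensitivity`, `smooth`, and `cliff` are hypothetical names, and the two toy scoring functions stand in for real backtests.

```python
def sensitivity(score_fn, params, rel=0.20):
    """Worst relative score drop when each parameter moves by +/- rel.

    Assumes numeric parameters and a nonzero base score. Integer
    parameters (e.g. lookback in bars) would need rounding in practice.
    """
    base = score_fn(params)
    drops = {}
    for name, value in params.items():
        worst = base
        for factor in (1.0 - rel, 1.0 + rel):
            bumped = dict(params)          # copy so other params stay fixed
            bumped[name] = value * factor
            worst = min(worst, score_fn(bumped))
        drops[name] = (base - worst) / abs(base)
    return drops

# Toy stand-ins for a backtest: one degrades smoothly, one falls off a cliff.
def smooth(p):
    return 1.0 - 0.001 * (p["lookback"] - 50) ** 2

def cliff(p):
    return 2.0 if abs(p["lookback"] - 50) < 1 else 0.0

smooth_report = sensitivity(smooth, {"lookback": 50})
cliff_report = sensitivity(cliff, {"lookback": 50})
print(smooth_report, cliff_report)
```

A small drop suggests a broad plateau around the chosen value; a drop near 100% suggests the result depends on one exact setting.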
Mitigation
- Walk-forward optimization — necessary but not sufficient.
- Hold out a final test set never seen during research. Test once. If it fails, the strategy fails — do not re-optimize.
- Pre-register hypotheses. State the rule and parameters before looking at data.
- Keep the search space small. Fewer parameters; clear economic rationale for each.
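One way to combine walk-forward windows with a final untouched holdout is sketched below. The function names, window sizes, and the choice of rolling (rather than expanding) training windows are all illustrative assumptions; the indices refer to positions in a time-ordered series.

```python
def split_research_and_holdout(n_obs, holdout_size):
    """Reserve the most recent holdout_size observations as a final test set."""
    cut = n_obs - holdout_size
    return range(0, cut), range(cut, n_obs)

def walk_forward_windows(research, train_size, test_size):
    """Yield (train, test) index ranges that roll forward through research data."""
    start = research.start
    while start + train_size + test_size <= research.stop:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # step forward by one test window

research, holdout = split_research_and_holdout(n_obs=1000, holdout_size=200)
windows = list(walk_forward_windows(research, train_size=400, test_size=100))

for train, test in windows:
    print("fit on", train, "evaluate on", test)
print("final holdout (use once):", holdout)
```

All optimization and iteration happen inside `research`; `holdout` is evaluated a single time at the end, and a failure there is not followed by re-optimization.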