Slippage Modeling: The Difference Between Paper and Live P&L
Slippage is the largest single gap between backtest performance and live performance for most retail systematic strategies. Three escalating levels of realism — and why the conservative side is almost always the right call.
"My backtest shows a Sharpe of 1.8. Live performance has been Sharpe 0.3. What changed?"
Most of the time, nothing changed. The backtest didn't model slippage realistically. The "1.8 Sharpe" was a paper number; the live "0.3" is closer to the actual edge minus actual costs.
Modeling slippage well doesn't make a bad strategy good. It does prevent you from deploying a strategy that looks good only because you underestimated costs.
Where slippage comes from
- Spread. The bid-ask gap. You cross half (or all) of it on every trade. For liquid equity-index futures: 1 tick. For thinly traded micro-caps: dozens of ticks.
- Market impact. Your order moves the price. The larger your order relative to average daily volume (ADV), the more it moves.
- Latency. The time between signal generation and order arrival. In high-frequency strategies, the price has often moved away from the signal level by the time the order is in the book.
- Adverse selection. Counterparties who only fill your quote when the price is moving against you.
- Liquidity gaps. News prints, opens, closes — moments when liquidity vanishes and remaining quotes are wide.
Three levels of realism
Level 1: Constant slippage
A fixed cost per trade, in basis points or per-share/contract. Acceptable for low-frequency, large-cap strategies where the spread dominates and market impact is negligible.
slippage_bps = 5 # for liquid US large-cap equity, 5bp per round trip
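A minimal sketch of how a constant charge might be applied, assuming each trade's gross return is expressed as a fraction of notional. The function name `apply_constant_slippage` and the sample returns are illustrative and do not come from any particular backtesting library.

```python
def apply_constant_slippage(gross_returns, round_trip_bps=5.0):
    """Subtract a fixed round-trip cost (in basis points) from each trade's gross return."""
    cost = round_trip_bps / 10_000.0  # 1 bp = 0.0001
    return [r - cost for r in gross_returns]


# Made-up per-trade gross returns, for illustration only.
gross = [0.012, -0.004, 0.007]
net = apply_constant_slippage(gross, round_trip_bps=5.0)
```

Every trade pays the same cost regardless of size or market conditions, which is exactly the simplification that limits this level to low-frequency, large-cap use.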
Level 2: Volume-relative slippage
Slippage scales with order size relative to typical volume.
slippage_bps = base_spread + k × (order_size / ADV)^α
α is typically 0.5 (square-root market impact, well-documented in the empirical literature). k is a calibration constant per instrument or per liquidity tier.
This is the right level for daily-to-weekly horizon strategies trading reasonable size.
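The formula above can be sketched directly in Python. The function name and the sample inputs are hypothetical; in particular `k = 50` is a placeholder, not a calibrated value, and would need to be fitted per instrument or liquidity tier.

```python
def volume_relative_slippage_bps(order_size, adv, base_spread_bps, k, alpha=0.5):
    """Slippage in bps: base spread plus k * (order_size / ADV) ** alpha."""
    if adv <= 0:
        raise ValueError("adv must be positive")
    if order_size < 0:
        raise ValueError("order_size must be non-negative")
    participation = order_size / adv  # fraction of average daily volume
    return base_spread_bps + k * participation ** alpha


# Hypothetical inputs: 1% and 4% of ADV, 2 bps base spread, k = 50.
small = volume_relative_slippage_bps(10_000, 1_000_000, base_spread_bps=2.0, k=50.0)
large = volume_relative_slippage_bps(40_000, 1_000_000, base_spread_bps=2.0, k=50.0)
```

With α = 0.5, quadrupling the order size only doubles the impact term: here it goes from about 5 bps to about 10 bps on top of the 2 bps base spread.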
Level 3: Microstructure-aware
Models the order book directly: queue position, cancel/refill dynamics, hidden liquidity, exchange-specific behaviors. Required for higher-frequency or market-making strategies. Almost never appropriate for retail systematic strategies, which generally operate at daily-or-slower horizons.
The 2× rule
Whatever slippage estimate you use (calibrated, naive, or somewhere between), run the backtest evaluation at 2× that estimate before deploying. If a strategy's edge survives 2× realistic slippage, it has margin for execution surprises. If it doesn't, it doesn't deserve live capital.
This rule is the closest thing to free protection in systematic research. The strategies it filters out are the strategies that would have failed live anyway.
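One way the 2× check might be run, as a rough sketch: recompute the Sharpe ratio with the slippage estimate scaled by 0×, 1× and 2×. The helper names, the zero risk-free rate, the per-period `turnover` input and the sample returns are all assumptions made for illustration.

```python
import math
import statistics


def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate taken as zero)."""
    return statistics.fmean(returns) / statistics.stdev(returns) * math.sqrt(periods_per_year)


def net_of_slippage(gross_returns, turnover, slippage_bps, multiplier=1.0):
    """Charge slippage_bps * multiplier on the fraction of the book traded each period."""
    cost = slippage_bps * multiplier / 10_000.0
    return [g - t * cost for g, t in zip(gross_returns, turnover)]


def stress_report(gross_returns, turnover, slippage_bps):
    """Sharpe at 0x, 1x and 2x the slippage estimate, keyed by multiplier."""
    return {
        m: annualized_sharpe(net_of_slippage(gross_returns, turnover, slippage_bps, m))
        for m in (0.0, 1.0, 2.0)
    }


# Made-up daily returns with the full book traded every day, for illustration only.
gross = [0.004, -0.002, 0.003, -0.001, 0.002, 0.001, -0.003, 0.005]
turnover = [1.0] * len(gross)
report = stress_report(gross, turnover, slippage_bps=5.0)
```

The number to judge the strategy on is `report[2.0]`; what counts as "surviving" (a minimum Sharpe, a minimum net return) is a threshold you would set yourself.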
Where backtests systematically lie
- Mid-price fills. The most common backtest error. Filling at the mid omits the half-spread paid on each fill and so systematically understates slippage. Use bid for sells, ask for buys, or model the spread explicitly.
- Stop-loss assumptions. Backtests fill stops at the trigger price. Live fills land where they land, often meaningfully worse, especially in fast moves and gaps.
- News-event fills. Liquidity vanishes during major prints. Backtests on mid prices around FOMC are wildly optimistic.
- Limit-order fills. Did your limit actually fill, or were you further back in the queue than you thought? Backtests usually assume a fill if the price touched; reality is harsher.
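Three of these corrections can be sketched as more conservative fill rules for a bar-based backtest. The function names, the one-tick stop penalty and the "trade through, not touch" limit rule are illustrative assumptions, not a standard API. `stop_fill` assumes the caller has already established that the stop triggered in the bar.

```python
def market_fill(side, bid, ask):
    """Fill marketable orders at the touch: buys pay the ask, sells hit the bid."""
    if side == "buy":
        return ask
    if side == "sell":
        return bid
    raise ValueError("side must be 'buy' or 'sell'")


def stop_fill(side, stop_price, bar_open, slip_ticks=1, tick_size=0.01):
    """Fill a triggered stop at the worse of stop price and bar open, plus extra ticks.

    A sell stop that gaps through fills at the lower open; a buy stop at the higher open.
    """
    slip = slip_ticks * tick_size
    if side == "sell":
        return min(stop_price, bar_open) - slip
    if side == "buy":
        return max(stop_price, bar_open) + slip
    raise ValueError("side must be 'buy' or 'sell'")


def limit_filled(side, limit_price, bar_low, bar_high):
    """Conservative rule: count a fill only if price trades through the limit, not on a touch."""
    if side == "buy":
        return bar_low < limit_price
    if side == "sell":
        return bar_high > limit_price
    raise ValueError("side must be 'buy' or 'sell'")
```

For example, `stop_fill("sell", 100.0, bar_open=98.5)` fills below the open of the gap bar rather than at the 100.0 trigger, and `limit_filled("buy", 100.0, bar_low=100.0, bar_high=100.5)` returns `False` because the price only touched the limit.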
Practical takeaways
- Always model spread explicitly. Mid-price fills are the largest single source of paper-to-live divergence.
- Require the edge to survive 2× your realistic slippage estimate before deploying.
- Stops fill in the direction of the move that triggered them. Model that asymmetry directly.
- The shorter the holding period, the more slippage matters. A daily strategy can absorb the costs; a 30-minute strategy probably can't.
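The holding-period point comes down to simple arithmetic: the same per-trade cost is paid many more times per year. A rough sketch, assuming a 5 bp round-trip cost, 252 trading days, a 6.5-hour session (thirteen 30-minute slots) and the whole book turned over on every trade; the function name is hypothetical and the cost is summed rather than compounded.

```python
def annual_cost_drag(round_trip_bps, round_trips_per_year):
    """Simple (non-compounded) annual cost as a fraction of capital."""
    return round_trip_bps / 10_000.0 * round_trips_per_year


TRADING_DAYS = 252

# One round trip per day versus thirteen 30-minute round trips per day.
daily = annual_cost_drag(5.0, TRADING_DAYS)
intraday = annual_cost_drag(5.0, 13 * TRADING_DAYS)
```

Under these assumptions the 30-minute strategy pays thirteen times the annual cost of the daily one, so its gross edge has to be correspondingly larger just to break even.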