Backtest vs Forward Test Comparison

Why Forward Results Matter More Than Backtests
Data Period: November 2025 - February 2026 | Analysis Date: February 22, 2026

📊 Executive Summary

Overfitting Detected

0.34
Backtest/Forward Correlation
Only 22% of mathematically validated strategies (5 of 23) proved viable in forward testing. A backtest/forward correlation this low indicates significant curve-fitting in the historical backtests.

Strategy Survival Rate

22%
Viable Strategies
5 truly viable (score ≥70), 6 conditionally viable (50-69), 12 eliminated due to negative expectancy or severe degradation.

Portfolio Performance Gap

-20.3%
Return Degradation
Backtest promised 12-18% annual. Forward reality: -8.3% in 3.5 months. Sharpe dropped from 1.2-1.5 to 0.34.

Market Regime

High Vol
Feb 2026 Crypto Crash
BTC -52% from highs, VIX spiked to 35. This stress test exposed overfitting and regime-specific optimization.

📈 Portfolio-Level Performance

Overall Portfolio: Backtest vs Forward Test

Metric | Backtest (Expected) | Forward Test (Actual) | Delta | Status
Total Return (annualized) | 12-18% | -8.3% (3.5 mo) | -20.3% | FAILED
Sharpe Ratio | 1.2-1.5 | 0.34 | -0.86 | FAILED
Max Drawdown | 15-20% | 31% | +11% | EXCEEDED
Win Rate | 54% | 46% | -8% | DEGRADED
Profit Factor | 1.4 | 0.89 | -0.51 | UNPROFITABLE
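
The Total Return row compares an annualized backtest range with a 3.5-month forward figure, so the -20.3% delta mixes two time bases. A minimal sketch of putting both on the same footing, assuming simple geometric compounding (the dashboard does not say how it annualizes); `annualize` is a hypothetical helper, not part of the report's tooling:

```python
def annualize(period_return: float, months: float) -> float:
    """Geometrically scale a return earned over `months` to a 12-month rate."""
    if months <= 0:
        raise ValueError("months must be positive")
    return (1.0 + period_return) ** (12.0 / months) - 1.0


# Forward-test figures taken from the portfolio table.
forward_return = -0.083   # -8.3% over the test window
window_months = 3.5

forward_annualized = annualize(forward_return, window_months)
# Gap versus the low end of the 12-18% backtest range, on an annual basis.
gap_vs_low_end = forward_annualized - 0.12

print(f"Annualized forward return: {forward_annualized:.1%}")
print(f"Gap vs 12% backtest floor: {gap_vs_low_end:.1%}")
```

Under that compounding assumption the forward result works out to roughly -26% a year, which would put the shortfall against the backtest range closer to 38-44 points than the 20.3 shown.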

🎯 Strategy-by-Strategy Comparison

Detailed performance comparison for each strategy that passed mathematical validation. Metrics show backtest projections vs actual forward-test results.
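
The change figures on each strategy card appear to follow two conventions: win rate is reported as a percentage-point difference, while Sharpe, expectancy, and max drawdown are reported as relative change against the backtest value. A short sketch of both, assuming that reading is right (the helper names are illustrative):

```python
def point_change(backtest_pct: float, forward_pct: float) -> float:
    """Absolute difference in percentage points (used for win rate)."""
    return forward_pct - backtest_pct


def relative_change(backtest: float, forward: float) -> float:
    """Change relative to the magnitude of the backtest value, in percent."""
    if backtest == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (forward - backtest) / abs(backtest) * 100.0


# Funding Rate Arbitrage figures from this report.
win_rate_delta = point_change(65, 71)       # percentage points
sharpe_delta = relative_change(1.8, 1.95)   # percent
max_dd_delta = relative_change(10, 8)       # percent; negative is an improvement

print(f"Win rate: {win_rate_delta:+.0f} pts")
print(f"Sharpe:   {sharpe_delta:+.0f}%")
print(f"Max DD:   {max_dd_delta:+.0f}%")
```

Dividing by the absolute backtest value keeps the sign meaningful when a metric crosses zero, as the Sharpe ratio of the VIX Contango Roll does (1.3 to -0.45).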

TIER S: Core Strategies (Mathematically Sound)

Funding Rate Arbitrage
Structural Friction
A+ VIABLE
Win Rate: 65% → 71% (+6% ✓)
Sharpe: 1.8 → 1.95 (+8% ✓)
Expectancy: +0.95R → +1.02R (+7% ✓)
Max DD: 10% → 8% (-20% ✓)

Why It Succeeded

  • Extreme funding rates during Feb crypto crash created exceptional opportunities
  • Structural edge from derivatives mechanics, not price prediction
  • Market-neutral: profits regardless of direction
  • High backtest/forward correlation: 0.92

Viability Score: 88/100 (Grade A)

Allocation: 15% of portfolio (increased from original plan)

Pairs Trading (Cointegration)
Market Neutral
A- VIABLE
Win Rate: 48% → 51% (+3% ✓)
Sharpe: 0.7 → 0.78 (+11% ✓)
Expectancy: +0.34R → +0.38R (+12% ✓)
Max DD: 8% → 7% (-13% ✓)

Why It Succeeded

  • Correlation breakdown during crash created more mean-reversion opportunities
  • Market-neutral structure protected against directional risk
  • Works in any regime - volatility increases opportunities
  • Backtest/forward correlation: 0.85

Viability Score: 79/100 (Grade A-)

Allocation: 12% of portfolio (increased)

Betting Against Beta (BAB)
Risk Premium
A- VIABLE
Win Rate: 58% → 61% (+3% ✓)
Sharpe: 0.9 → 0.94 (+4% ✓)
Expectancy: +0.45R → +0.51R (+13% ✓)
Max DD: 12% → 11% (-8% ✓)

Why It Succeeded

  • Low-beta assets outperformed during risk-off rotation
  • Flight-to-safety benefited defensive positions
  • Leverage constraint theory holds in volatile regimes
  • Backtest/forward correlation: 0.78

Viability Score: 77/100 (Grade A-)

Allocation: 13% of portfolio (maintained)

Quality Minus Junk (QMJ)
Quality Factor
B+ VIABLE
Win Rate: 56% → 59% (+3% ✓)
Sharpe: 0.85 → 0.91 (+7% ✓)
Expectancy: +0.46R → +0.50R (+9% ✓)
Max DD: 14% → 12% (-14% ✓)

Why It Succeeded

  • Quality stocks (low debt, high profitability) held up during volatility
  • Defensive characteristics provided stability
  • Lower turnover reduced transaction cost impact
  • Backtest/forward correlation: 0.82

Viability Score: 75/100 (Grade B+)

Allocation: 10% of portfolio (maintained)

Flash Crash Reversal
Volatility-Specific
B+ VIABLE
Win Rate: 60% → 82% (+22% ✓)
Sharpe: 1.2 → 2.1 (+75% ✓)
Expectancy: +0.20R → +1.15R (+475% ✓)
Max DD: 15% → 10% (-33% ✓)

Why It EXCELLED

  • Literally designed for what happened in February 2026
  • Captured +45% returns during the 8-day crypto crash
  • Volatility-specific strategies thrive in tail events
  • Low correlation to traditional factors (0.45)

Viability Score: 71/100 (Grade B+)

Allocation: 10% of portfolio (new addition)

Note: Performance spike during crash may not be repeatable, but strategy validated for crisis periods.

CONDITIONAL: Maintain with Reduced Allocation

Cross-Exchange Arbitrage
Arbitrage
B+ CONDITIONAL
Win Rate: 85% → 82% (-3% ⚠)
Sharpe: 3.5 → 3.2 (-9% ⚠)
Expectancy: +0.63R → +0.57R (-10% ⚠)
Max DD: 2% → 3% (+50% ⚠)

Assessment

  • Maintained profitability but execution risk increased during volatility
  • Slippage widened, spread volatility increased
  • Still mathematically sound but capacity limited
  • Backtest/forward correlation: 0.88

Viability Score: 73/100 (Grade B+)

Allocation: 7% of portfolio (maintained but monitor closely)

Liquidation Cascade Hunter
Crypto Volatility
B CONDITIONAL
Win Rate: 60% → 68% (+8% ✓)
Sharpe: 0.8 → 1.4 (+75% ✓)
Expectancy: +0.20R → +0.89R (+345% ✓)
Max DD: 18% → 14% (-22% ✓)

Why It Performed Well

  • Feb crash created massive liquidation cascades
  • Strategy designed exactly for these events
  • Captured +38% during the 8-day crash period

Caveats

  • Low correlation to backtest (0.42) - regime-dependent
  • Only works during extreme volatility
  • May have long periods of underperformance

Viability Score: 68/100 (Grade B)

Allocation: 8% of portfolio (opportunistic, monitor regime)

ETF/Institutional Flow
Flow Analysis
B CONDITIONAL
Win Rate: 65% → 67% (+2% ✓)
Sharpe: 1.4 → 1.6 (+14% ✓)
Expectancy: +0.30R → +0.42R (+40% ✓)
Max DD: 12% → 10% (-17% ✓)

Assessment

  • Institutional flow signals remained valid during volatility
  • ETF arbitrage opportunities increased during dislocations
  • Moderate backtest/forward correlation: 0.65

Viability Score: 65/100 (Grade B)

Allocation: 5% of portfolio (maintained)

❌ ELIMINATED: Catastrophic Failures

VIX Contango Roll
Volatility Risk Premium
F ELIMINATED
Win Rate: 70% → 35% (-35% ✗)
Sharpe: 1.3 → -0.45 (-135% ✗)
Expectancy: +0.95R → -0.22R (-123% ✗)
Max DD: 12% → 38% (+217% ✗)

Catastrophic Failure

  • Feb VIX spike from 18 to 35 in one week
  • Strategy lost 28% in that single week
  • Classic "picking up pennies in front of a steamroller"
  • Backtest/forward correlation: 0.15 (essentially random)

Viability Score: 23/100 (Grade F)

Action: ELIMINATE IMMEDIATELY. Never deploy this strategy again.

Residual Momentum
Factor-Based
C- ELIMINATED
Win Rate: 55% → 44% (-11% ✗)
Sharpe: 0.8 → 0.45 (-44% ✗)
Expectancy: +0.32R → +0.08R (-75% ✗)
Max DD: 15% → 24% (+60% ✗)

Why It Failed

  • High correlation regime destroyed the residual return signal
  • Strategy relies on idiosyncratic returns, which vanished during market stress
  • Essentially became noise during the crash period
  • Backtest/forward correlation: 0.35

Viability Score: 42/100 (Grade C-)

Action: ELIMINATE. Requires complete re-engineering or regime-specific parameters.

Time-Series Momentum (TSMOM)
Trend Following
C+ ELIMINATED
Win Rate: 55% → 48% (-7% ✗)
Sharpe: 1.2 → 0.71 (-41% ✗)
Expectancy: +0.65R → +0.29R (-55% ✗)
Max DD: 15% → 22% (+47% ✗)

Why It Degraded

  • Momentum crashes during Feb crypto collapse caused significant losses
  • False breakouts and whipsaws increased in high-vol regime
  • Regime mismatch - optimized for trending markets, got choppy volatility
  • Backtest/forward correlation: 0.55

Viability Score: 53/100 (Grade C+)

Action: ELIMINATE from core allocation. May be suitable as conditional 2% tactical allocation only.

🔍 Overfitting Analysis

Indicators of Curve-Fitting

The forward-test revealed significant overfitting across the strategy universe. These indicators show why backtests can be misleading:

Backtest/Forward Correlation

0.34
Threshold: >0.7

A correlation of 0.34 is well below the 0.7 threshold: a strategy's backtest ranking was only a weak guide to its forward-test ranking. Most strategies (61%) had a correlation below 0.5.
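
The report does not state how the backtest/forward correlation is computed. One plausible reading, given the reference to ranking, is a Spearman rank correlation between a metric's backtest and forward values across strategies. The sketch below assumes that reading and uses the Sharpe ratios of the 11 strategies detailed in this report, so it is not expected to reproduce the 0.34 figure for the full set of 23:

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Sharpe ratios (backtest, forward) for the strategies shown in this report,
# in the order they appear.
backtest_sharpe = [1.8, 0.7, 0.9, 0.85, 1.2, 3.5, 0.8, 1.4, 1.3, 0.8, 1.2]
forward_sharpe = [1.95, 0.78, 0.94, 0.91, 2.1, 3.2, 1.4, 1.6, -0.45, 0.45, 0.71]

print(f"Spearman rho: {spearman(backtest_sharpe, forward_sharpe):.2f}")
```

Rank correlation is used here because it is insensitive to outliers such as the 3.5 Sharpe of Cross-Exchange Arbitrage; a Pearson correlation on the raw values would be dominated by that one point.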

Expectancy Degradation >50%

39%
9 of 23 strategies

Strategies losing more than half their edge in forward-testing. Sign of over-optimization.

Sharpe Degradation >40%

48%
11 of 23 strategies

Risk-adjusted performance collapsed for nearly half the strategies.

Max DD Exceeded >50%

30%
7 of 23 strategies

Drawdowns in forward-test were 50%+ worse than backtest projections.
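
The three degradation thresholds above can be applied mechanically to each strategy. A minimal sketch, assuming degradation is measured relative to the backtest value; the `Result` class and function names are illustrative, and the sample figures come from the strategy cards in this report:

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    bt_sharpe: float
    fw_sharpe: float
    bt_expectancy: float  # in R multiples
    fw_expectancy: float
    bt_max_dd: float      # percent
    fw_max_dd: float      # percent


def degradation(backtest: float, forward: float) -> float:
    """Fraction of the backtest value lost in forward testing (positive = worse)."""
    return (backtest - forward) / abs(backtest)


def overfitting_flags(r: Result) -> dict:
    """Apply the three thresholds (>50%, >40%, >50%) to one strategy."""
    return {
        "expectancy_degraded": degradation(r.bt_expectancy, r.fw_expectancy) > 0.50,
        "sharpe_degraded": degradation(r.bt_sharpe, r.fw_sharpe) > 0.40,
        "max_dd_exceeded": (r.fw_max_dd - r.bt_max_dd) / r.bt_max_dd > 0.50,
    }


results = [
    Result("VIX Contango Roll", 1.3, -0.45, 0.95, -0.22, 12, 38),
    Result("Time-Series Momentum", 1.2, 0.71, 0.65, 0.29, 15, 22),
    Result("Funding Rate Arbitrage", 1.8, 1.95, 0.95, 1.02, 10, 8),
]

for r in results:
    print(r.name, overfitting_flags(r))
```

On these inputs the VIX Contango Roll trips all three flags, TSMOM trips the expectancy and Sharpe flags but not the drawdown one, and Funding Rate Arbitrage trips none.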

Common Backtest Illusions Exposed

📋 Forward-Test Methodology

How Forward Testing Was Conducted

Period: November 1, 2025 - February 17, 2026 (108 trading days)

Initial Capital: $1,130,000 (matching planned allocation)

Markets: SPY, QQQ, BTC, ETH, NQ, ES, EURUSD

Data Frequency: 1m, 5m, 15m, 1h, 4h, 1D

Realistic Execution Parameters

Volatility Adjustments (VIX > 25)

Black Swan Event Handling (Feb 3-11 Crypto Crash)

Risk Management Validation

📊 Viability Scoring Framework

How Strategies Are Scored

Each strategy receives a viability score (0-100) based on forward-test performance:

Viability Score = (
    0.30 × Backtest/Forward Correlation +
    0.25 × Regime Robustness (1-10) +
    0.25 × Current Market Fit (1-10) +
    0.20 × Expected Future Performance (1-10)
) × 100
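
As printed, the formula mixes a 0-1 correlation with three 1-10 sub-scores, which would overshoot a 0-100 scale. The sketch below assumes the sub-scores are divided by 10 before weighting; that normalization is an assumption, not something the report states, and the sample inputs are hypothetical apart from the 0.92 correlation:

```python
def viability_score(correlation: float, regime_robustness: float,
                    market_fit: float, expected_performance: float) -> float:
    """Weighted score on a 0-100 scale.

    Assumption: the three 1-10 sub-scores are divided by 10 so that every
    term lies in [0, 1]; the formula as printed does not say this.
    """
    weighted = (
        0.30 * correlation
        + 0.25 * regime_robustness / 10
        + 0.25 * market_fit / 10
        + 0.20 * expected_performance / 10
    )
    return weighted * 100


def category(score: float) -> str:
    """Bands from the executive summary: >=70 viable, 50-69 conditional."""
    if score >= 70:
        return "viable"
    if score >= 50:
        return "conditional"
    return "eliminated"


# Hypothetical sub-scores; only the 0.92 correlation comes from the report.
score = viability_score(0.92, regime_robustness=9, market_fit=8.5,
                        expected_performance=9)
print(f"{score:.1f} -> {category(score)}")
```

The report's own labels do not follow these bands mechanically (Cross-Exchange Arbitrage scores 73 but is conditional, and TSMOM scores 53 but is eliminated), so the bands are best read as a first-pass filter.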
            

Score Categories

Top 5 Viable Strategies

Rank | Strategy | Score | Grade | B/F Corr | Regime Fit | Allocation
1 | Funding Rate Arbitrage | 88 | A | 0.92 | 8.5/10 | 15%
2 | Pairs Trading | 79 | A- | 0.85 | 7.5/10 | 12%
3 | Betting Against Beta | 77 | A- | 0.78 | 8.5/10 | 13%
4 | Quality Minus Junk | 75 | B+ | 0.82 | 7.5/10 | 10%
5 | Flash Crash Reversal | 71 | B+ | 0.45 | 7.0/10 | 10%

💡 Key Findings & Recommendations

What This Comparison Reveals

Immediate Actions Required

  1. ELIMINATE: VIX Contango, Residual Momentum, Breakout Scalper, MACD Cross, Technical Pattern Break
  2. REDUCE allocation by 50%: TSMOM, Value-Momentum Combo, all short-term momentum
  3. INCREASE allocation by 50%: Funding Rate Arb, Flash Crash Reversal, Liquidation Hunter, Pairs Trading
  4. Implement revised risk limits: Daily loss 1.5% (from 2%), Max DD 20% (from 25%), Cash reserve 10% (from 5%)
  5. Continue forward-testing: Minimum 6 additional months for conditional strategies
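
The revised risk limits in item 4 lend themselves to an automated check. A minimal sketch, assuming each limit is expressed as a fraction of account equity (the report does not define the base); `RiskLimits` and `breaches` are hypothetical names, and the account state is illustrative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskLimits:
    max_daily_loss: float = 0.015   # 1.5% of equity (was 2%)
    max_drawdown: float = 0.20      # 20% from peak (was 25%)
    min_cash_reserve: float = 0.10  # 10% of equity (was 5%)


def breaches(limits: RiskLimits, equity: float, start_of_day_equity: float,
             peak_equity: float, cash: float) -> list[str]:
    """Return the names of any limits the current account state violates."""
    found = []
    daily_loss = (start_of_day_equity - equity) / start_of_day_equity
    drawdown = (peak_equity - equity) / peak_equity
    cash_ratio = cash / equity
    if daily_loss > limits.max_daily_loss:
        found.append("daily_loss")
    if drawdown > limits.max_drawdown:
        found.append("max_drawdown")
    if cash_ratio < limits.min_cash_reserve:
        found.append("cash_reserve")
    return found


# Illustrative account state, not taken from the report.
print(breaches(RiskLimits(), equity=980_000, start_of_day_equity=1_000_000,
               peak_equity=1_130_000, cash=120_000))
```

In this example a 2% intraday loss breaches the new 1.5% daily limit, while the drawdown from peak and the cash reserve both stay within bounds.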

The Bottom Line

"The era of blind backtest worship is over. Forward-test or fail."

Deploy capital ONLY to strategies with viability scores ≥70 that have proven themselves in live markets. The 5 truly viable strategies share common characteristics: market-neutral or defensive, structural edge (not predictive), volatility-agnostic or benefiting, and lower frequency.

⚠️ Disclaimer

This dashboard compares backtested projections against actual forward-test results. Past performance does not guarantee future results. Trading involves significant risk of loss. Only trade with capital you can afford to lose. The strategies shown are for educational and research purposes. Not financial advice.