Backtest Methodology

BestFolio aggregates published tactical asset allocation strategies and backtests them under a consistent, transparent framework. This page explains exactly how our backtests are constructed so you can evaluate the results with full context.

1. Data Sources

  • Price data: Sourced from Yahoo Finance via yfinance, using adjusted close prices that account for stock splits and dividend distributions.
  • Total return: All returns assume dividends are reinvested at the time of distribution.
  • Frequency: Daily prices are fetched and resampled to monthly frequency for signal computation. Daily granularity is retained for drawdown and NAV calculations.
  • Refresh cycle: Price data is refreshed daily via an automated pipeline. Signals are recomputed at the end of each month.
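
The daily-to-monthly resampling step can be illustrated with a minimal sketch. It is not the production pipeline: the function names and the input shape (a dict of date to adjusted close) are assumptions made for this example, and it uses only the Python standard library rather than yfinance.

```python
from datetime import date

def month_end_closes(daily):
    """Keep the last available trading day's adjusted close in each calendar month.

    `daily` maps datetime.date -> adjusted close (hypothetical input shape).
    """
    last_in_month = {}
    for day in sorted(daily):
        # Later dates overwrite earlier ones, so the last trading day wins.
        last_in_month[(day.year, day.month)] = (day, daily[day])
    return [last_in_month[key] for key in sorted(last_in_month)]

def monthly_returns(month_ends):
    """Simple month-over-month returns from the month-end series."""
    prices = [price for _, price in month_ends]
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]

# Illustrative prices only, not real market data.
daily = {
    date(2024, 1, 30): 100.0,
    date(2024, 1, 31): 102.0,
    date(2024, 2, 28): 101.0,
    date(2024, 2, 29): 104.04,
}
ends = month_end_closes(daily)
print(ends)
print(monthly_returns(ends))
```

Note that a month still in progress would contribute its most recent day rather than a true month-end, so a real pipeline would need to exclude or flag the incomplete final month.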

2. Backtest Mechanics

  • Rebalancing: Monthly, on the last trading day of each calendar month.
  • Signal timing: Signals are computed using end-of-month closing prices. The resulting allocation is applied to the following month. For example, a signal dated February 28 uses complete February data and determines what you hold during March.
  • No look-ahead bias: Signals use only data that was available at the time of computation. No future data leaks into past decisions.
  • Starting NAV: Normalized to $100 at the beginning of each backtest.
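
The one-month lag between signal and holding period is what keeps look-ahead bias out. The sketch below shows the idea under simplifying assumptions: `run_backtest` and its input shapes are hypothetical names for this example, and transaction costs are ignored here.

```python
def run_backtest(monthly_returns, signals, start_nav=100.0):
    """Apply each month-end signal to the following month's returns.

    monthly_returns[t] maps ticker -> return earned during month t.
    signals[t] maps ticker -> target weight computed at the close of month t.
    """
    nav = [start_nav]
    for t in range(1, len(monthly_returns)):
        # Only the signal from the previous month-end is known when month t starts.
        weights = signals[t - 1]
        port_ret = sum(
            w * monthly_returns[t].get(ticker, 0.0) for ticker, w in weights.items()
        )
        nav.append(nav[-1] * (1.0 + port_ret))
    return nav

# Illustrative numbers only.
returns = [
    {"SPY": 0.05, "TLT": -0.01},
    {"SPY": -0.02, "TLT": 0.01},
    {"SPY": 0.03, "TLT": 0.00},
]
signals = [{"TLT": 1.0}, {"SPY": 1.0}, {"SPY": 1.0}]
nav = run_backtest(returns, signals)
print(nav)
```

Indexing `signals[t]` instead of `signals[t - 1]` would let each month's allocation see that same month's data, which is exactly the leak the lag is meant to prevent.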

3. Transaction Costs & Slippage

All backtests model transaction costs. Costs are not set to zero: they are deducted from portfolio returns at every rebalance.

  • Base cost: 10 basis points (0.10%) per one-way trade, applied proportionally to portfolio turnover at each rebalance. For example, a full 100% portfolio rotation incurs 0.10% in costs; a partial 30% rotation incurs 0.03%.
  • Stress-adjusted slippage: During periods of elevated volatility, spreads widen and execution quality degrades. Our engine scales transaction costs up to a maximum of 3× the base rate when recent 20-day realized volatility exceeds the baseline (~16% annualized). This approximates the higher friction you would experience during market stress (e.g., March 2020, Q4 2018).
  • Turnover tracking: Every backtest reports total transaction costs, annual turnover, and average trades per year in the Summary Statistics panel. You can evaluate friction impact directly.
  • What is not included: Taxes, margin costs, fund expense ratios (already reflected in ETF NAV), and behavioral factors (delayed rebalancing, panic selling) are not modeled. These vary by investor and jurisdiction.

4. Survivorship & Selection Bias

  • Source: All strategies are sourced from published academic papers, practitioner research, or well-documented public methodologies.
  • Faithful implementation: We implement each strategy as described by its original author(s). No proprietary optimization or curve-fitting is applied on top of published rules.
  • Selection transparency: We deliberately select well-known, publicly documented strategies rather than cherry-picking private or unpublished ones. This still introduces a form of selection bias: we are showing strategies that gained attention, which may correlate with strong historical performance.

5. Synthetic / Proxy Data (Fallback Chain)

Many ETFs have limited history. To produce longer backtests, we use proxy data for periods before an ETF existed:

  • Fallback chain: Each ticker has a documented hierarchy of proxies. When the primary ETF has no data for a given date, the engine walks down the chain until it finds a valid source. Price continuity is maintained by scaling at each join point.
  • Transparency: The full fallback chain for every ticker in a strategy is visible on the strategy detail page under the “Price Build Log” section.
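
Scaling at a join point can be sketched as below. This is one plausible way to do it, assuming the proxy is rescaled so that it matches the primary on the primary's first date. The function names, the dict-of-prices shape, and the prices themselves are hypothetical.

```python
def splice(primary, proxy):
    """Extend `primary` backward with `proxy`, rescaled to match at the join.

    Both inputs map a sortable date key -> price.
    """
    join = min(primary)  # first date on which the primary has data
    if join not in proxy:
        raise ValueError("proxy must have a price on the primary's first date")
    scale = primary[join] / proxy[join]
    # Rescaling changes the proxy's price level but not its period returns.
    spliced = {d: p * scale for d, p in proxy.items() if d < join}
    spliced.update(primary)
    return spliced

def build_series(chain):
    """Walk a fallback chain ordered from primary ETF to deepest proxy."""
    series = dict(chain[0])
    for proxy in chain[1:]:
        series = splice(series, proxy)
    return series

# Illustrative prices only, keyed by ISO year-month strings.
etf = {"1993-02": 40.0, "1993-03": 42.0}
fund = {"1993-01": 19.0, "1993-02": 20.0, "1993-03": 21.0}
series = build_series([etf, fund])
print(sorted(series.items()))
```

Because only the level is rescaled, the proxy's return from 1993-01 to 1993-02 is unchanged in the spliced series, which is what keeps the price history continuous across the join.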

Key substitution chains (100+ tickers covered):

  • S&P 500: SPY (1993) → VFINX (1980)
  • Nasdaq 100: QQQ (1999) → ^NDX index (1985)
  • US Small Cap: IJR (2000) → NAESX (1990); IWM (2000) → ^RUT index (1987)
  • US Mid Cap: IJH (2000) → VIMSX (1998) → ^MID index (1981)
  • US REITs: VNQ (2004) → VGSIX (1996) → FRESX (1986)
  • Int'l Developed: EFA (2001) → VGTSX (1996) → PRTIX (1989)
  • Emerging Markets: EEM (2003) → VEIEX (1994) → FEMKX (1990)
  • Long Treasuries: TLT (2002) → VUSTX (1986)
  • Intermediate Treasuries: IEF (2002) → VFITX (1991)
  • Short Treasuries: SHY (2002) → VFISX (1991)
  • T-Bills: BIL (2007) → ^IRX yield (1970)
  • TIPS: TIP (2003) → VIPSX (2000) → PRTNX (1997) → VFITX (1991)
  • Gold: GLD (2004) → GC=F futures (2000) → London fixing (1968)
  • Commodities: DBC (2006) → PCRIX (2002) → ^SPGSCI (1990)
  • Aggregate Bonds: AGG (2003) → VBMFX (1986)
  • High Yield: HYG (2007) → VWEHX (1980)
  • Managed Futures: KMLM (2020) → RYMFX (2007) → AQR TSMOM (1985)
  • Leveraged ETFs: Synthetic daily-leveraged returns from underlying, with expense ratio and borrowing cost deductions
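
For the synthetic leveraged series, a common simplified construction is sketched below. The formula and every parameter value (leverage, expense ratio, borrowing rate, 252 trading days) are assumptions for illustration; the deductions actually applied by the engine are not specified on this page.

```python
def synthetic_leveraged_returns(
    underlying_daily,
    leverage=2.0,
    expense_ratio=0.0095,   # hypothetical annual fee
    borrow_rate=0.03,       # hypothetical annual financing rate
    days_per_year=252,
):
    """Daily returns of a fund that resets to `leverage` times exposure each day."""
    daily_expense = expense_ratio / days_per_year
    # Financing is paid on the borrowed portion, i.e. (leverage - 1) of NAV.
    daily_borrow = (leverage - 1.0) * borrow_rate / days_per_year
    return [leverage * r - daily_expense - daily_borrow for r in underlying_daily]

def compound(returns, start=100.0):
    """Compound a list of period returns into an ending value."""
    value = start
    for r in returns:
        value *= 1.0 + r
    return value

# Illustrative underlying returns: up 10%, then down 10%.
underlying = [0.10, -0.10]
gross = synthetic_leveraged_returns(underlying, expense_ratio=0.0, borrow_rate=0.0)
net = synthetic_leveraged_returns(underlying)
print(compound(underlying), compound(gross), compound(net))
```

Even before costs, the daily reset means the 2× series loses more than twice what the underlying loses over this two-day round trip, which is why the series is built from daily rather than monthly returns.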

6. Walk-Forward Validation

  • Availability: Offered for strategies whose rules include tunable parameters.
  • Method: A rolling window approach: train on N months of history, test on the next M months, then slide the window forward and repeat.
  • Purpose: Guards against overfitting to in-sample data by checking whether a strategy’s edge persists out-of-sample.
  • Identification: Strategies with walk-forward validation results are marked accordingly on their detail pages.
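
The rolling window scheme can be sketched as index ranges over a monthly series. This is a minimal illustration: the function name is hypothetical, and sliding forward by M months (so test periods do not overlap) is an assumption, since the step size is not stated above.

```python
def walk_forward_windows(n_months, train_len, test_len, step=None):
    """Return (train, test) index ranges for a rolling walk-forward split."""
    step = test_len if step is None else step  # assumption: slide by one test block
    windows = []
    start = 0
    while start + train_len + test_len <= n_months:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        windows.append((train, test))
        start += step
    return windows

# 10 months of history, train on 4 (N), test on the next 2 (M).
windows = walk_forward_windows(n_months=10, train_len=4, test_len=2)
for train, test in windows:
    print(list(train), list(test))
```

Parameters would be fitted on each `train` range and evaluated only on the matching `test` range, so every reported out-of-sample month comes after the data used to tune it.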

7. Metrics Computation

All metrics are computed from the monthly NAV series unless otherwise noted.

  • CAGR: Compound Annual Growth Rate. Annualized geometric return derived from the full NAV series.
  • Sharpe Ratio: (Mean monthly excess return × 12) / (Std of monthly returns × √12), which simplifies to (mean / std) × √12. Measures risk-adjusted return per unit of total volatility.
  • Sortino Ratio: Same as Sharpe but substitutes downside deviation (volatility of negative returns only) for standard deviation. Penalizes downside risk without penalizing upside volatility.
  • Max Drawdown: Largest peak-to-trough decline in NAV over the full backtest period. Expressed as a percentage.
  • SWR: Safe Withdrawal Rate. The maximum constant, inflation-adjusted annual withdrawal rate that survives every rolling 30-year window in the backtest.
  • PWR: Perpetual Withdrawal Rate. The maximum withdrawal rate that preserves real (inflation-adjusted) capital over every rolling 30-year window.
  • UPI: Ulcer Performance Index. CAGR divided by the Ulcer Index (a measure of drawdown depth and duration). Higher values indicate better risk-adjusted performance with emphasis on drawdown pain.
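
Several of these metrics can be sketched from a monthly NAV list. The code below is illustrative rather than the engine's implementation. Assumptions for this example: a constant monthly risk-free rate defaulting to zero, sample standard deviation for Sharpe, downside deviation computed as the root mean square of negative excess returns, and the Ulcer Index taken as the root mean square of drawdowns in decimal form.

```python
import math

def monthly_returns(nav):
    return [b / a - 1.0 for a, b in zip(nav, nav[1:])]

def cagr(nav, periods_per_year=12):
    years = (len(nav) - 1) / periods_per_year
    return (nav[-1] / nav[0]) ** (1.0 / years) - 1.0

def sharpe(nav, monthly_rf=0.0):
    excess = [r - monthly_rf for r in monthly_returns(nav)]
    mean = sum(excess) / len(excess)
    std = math.sqrt(sum((r - mean) ** 2 for r in excess) / (len(excess) - 1))
    return mean / std * math.sqrt(12)  # same as (mean * 12) / (std * sqrt(12))

def sortino(nav, monthly_rf=0.0):
    excess = [r - monthly_rf for r in monthly_returns(nav)]
    mean = sum(excess) / len(excess)
    # Assumed convention: positive months count as zero deviation.
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in excess) / len(excess))
    return mean / downside * math.sqrt(12)

def drawdowns(nav):
    """Fractional decline from the running peak at each point."""
    peak, out = nav[0], []
    for value in nav:
        peak = max(peak, value)
        out.append(1.0 - value / peak)
    return out

def max_drawdown(nav):
    return max(drawdowns(nav))

def ulcer_index(nav):
    dd = drawdowns(nav)
    return math.sqrt(sum(d * d for d in dd) / len(dd))

def upi(nav):
    # Undefined (division by zero) for a series that never draws down.
    return cagr(nav) / ulcer_index(nav)

# Illustrative NAV path: +10%, -10%, +10%, +10%.
nav = [100.0, 110.0, 99.0, 108.9, 119.79]
print(cagr(nav), sharpe(nav), sortino(nav), max_drawdown(nav), upi(nav))
```

SWR and PWR are omitted here because they require an inflation series and a search over withdrawal rates across rolling 30-year windows.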

8. Limitations & Disclaimers

  • Hypothetical results: All backtested performance is hypothetical. It does not represent actual trading and was not achieved with real capital.
  • No guarantee: Past performance, whether backtested or live, does not guarantee future results.
  • Remaining frictions: Transaction costs and stress-adjusted slippage are modeled (see Section 3), and fund expense ratios are already reflected in ETF NAV. Taxes, margin costs, and behavioral factors (panic selling, delayed rebalancing) are not captured.
  • Execution assumption: Monthly rebalancing assumes execution at the closing price on the last trading day. Intraday or next-day execution will produce different results.
  • No universal winner: No strategy is guaranteed to outperform a simple buy-and-hold approach in all market environments. Tactical allocation involves active decisions that can underperform passive benchmarks for extended periods.

Questions about our methodology? Check the FAQ or contact us by email.
