Backtest Methodology
BestFolio aggregates published tactical asset allocation strategies and backtests them under a consistent, transparent framework. This page explains exactly how our backtests are constructed so you can evaluate the results with full context.
1. Data Sources
- Price data: Sourced from Yahoo Finance via yfinance, using adjusted close prices that account for stock splits and dividend distributions.
- Total return: All returns assume dividends are reinvested at the time of distribution.
- Frequency: Daily prices are fetched and resampled to monthly frequency for signal computation. Daily granularity is retained for drawdown and NAV calculations.
- Refresh cycle: Price data is refreshed daily via an automated pipeline. Signals are recomputed at the end of each month.
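The daily-to-monthly step described above can be sketched as follows. This is a minimal illustration in plain Python on made-up prices, not the pipeline's actual code; the function name `to_month_end` and the sample values are hypothetical. It assumes "resampled to monthly" means keeping the last available daily close in each calendar month.

```python
from datetime import date

def to_month_end(daily):
    """Keep the last available daily close of each calendar month.

    `daily` is a list of (date, adjusted_close) pairs sorted by date.
    """
    monthly = {}
    for day, price in daily:
        # Later days in the same month overwrite earlier ones,
        # so the final entry per month is the month-end close.
        monthly[(day.year, day.month)] = (day, price)
    return [monthly[key] for key in sorted(monthly)]

# Hypothetical adjusted closes, for illustration only.
daily_prices = [
    (date(2024, 1, 30), 100.0),
    (date(2024, 1, 31), 101.0),
    (date(2024, 2, 28), 103.0),
    (date(2024, 2, 29), 102.0),
    (date(2024, 3, 1), 104.0),
]
monthly_prices = to_month_end(daily_prices)
```

Note that the final entry here belongs to an incomplete month (March 1). A signal should only be computed once the month has closed, which is why signals are recomputed at month-end.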
2. Backtest Mechanics
- Rebalancing: Monthly, on the last trading day of each calendar month.
- Signal timing: Signals are computed using end-of-month closing prices. The resulting allocation is applied to the following month. For example, a signal dated February 28 uses complete February data and determines what you hold during March.
- No look-ahead bias: Signals use only data that was available at the time of computation. No future data leaks into past decisions.
- Starting NAV: Normalized to $100 at the beginning of each backtest.
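The timing rule above (a month-end signal determines the following month's holding) can be sketched as a one-month lag between signal and return. This is an illustrative sketch under simplifying assumptions: a single asset, a weight between 0 and 1, cash earning zero, and no transaction costs. The name `run_backtest` and the sample numbers are hypothetical.

```python
def run_backtest(monthly_returns, signals, start_nav=100.0):
    """Apply each month-end signal to the FOLLOWING month's return.

    monthly_returns[i] is the asset's return during month i.
    signals[i] is the weight (0.0 to 1.0) decided at the close of month i.
    """
    nav = [start_nav]  # NAV normalized to 100 at the start
    for i in range(1, len(monthly_returns)):
        weight = signals[i - 1]  # decided one month earlier: no look-ahead
        nav.append(nav[-1] * (1.0 + weight * monthly_returns[i]))
    return nav

# Hypothetical returns for January through April.
returns = [0.02, -0.05, 0.03, 0.01]
# Weights decided at the close of each of those months.
signals = [1.0, 0.0, 1.0, 1.0]
nav = run_backtest(returns, signals)
```

The January return is never earned because no signal exists before the first month-end. Using `signals[i]` instead of `signals[i - 1]` would be the classic look-ahead bug: the weight would depend on data from the very month it is applied to.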
3. Transaction Costs & Slippage
All backtests include realistic transaction cost modeling. Costs are not set to zero — they are deducted from portfolio returns at every rebalance.
- Base cost: 10 basis points (0.10%) per one-way trade, applied proportionally to portfolio turnover at each rebalance. For example, a full 100% portfolio rotation incurs 0.10% in costs; a partial 30% rotation incurs 0.03%.
- Stress-adjusted slippage: During periods of elevated volatility, spreads widen and execution quality degrades. Our engine dynamically scales transaction costs by up to 3× the base rate when recent 20-day realized volatility exceeds the baseline (~16% annualized). This means backtests already account for the higher friction you would experience during market stress (e.g., March 2020, Q4 2018).
- Turnover tracking: Every backtest reports total transaction costs, annual turnover, and average trades per year in the Summary Statistics panel. You can evaluate friction impact directly.
- What is not included: Taxes, margin costs, and behavioral factors (delayed rebalancing, panic selling) are not modeled, since these vary by investor and jurisdiction. Fund expense ratios are not deducted separately because they are already reflected in ETF NAV.
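The cost arithmetic above can be sketched as follows. The base rate, the ~16% baseline, and the 3× cap come from the text. The exact scaling function is not specified on this page, so the linear ramp below is an assumption, as is measuring turnover as half the sum of absolute weight changes (chosen because it reproduces the 0.10% and 0.03% examples). All names are hypothetical.

```python
BASE_COST = 0.0010      # 10 bps per unit of turnover (from the text)
BASELINE_VOL = 0.16     # ~16% annualized baseline (from the text)
MAX_MULTIPLIER = 3.0    # cap of 3x the base rate (from the text)

def stress_multiplier(realized_vol):
    # ASSUMPTION: costs scale linearly with the ratio of realized to
    # baseline volatility, clipped to the range [1, 3].
    return min(MAX_MULTIPLIER, max(1.0, realized_vol / BASELINE_VOL))

def rebalance_cost(old_weights, new_weights, realized_vol):
    """Cost of one rebalance, as a fraction of portfolio value."""
    tickers = set(old_weights) | set(new_weights)
    # ASSUMPTION: turnover is half the sum of absolute weight changes,
    # so a full rotation from one asset to another counts as 100%.
    turnover = 0.5 * sum(
        abs(new_weights.get(t, 0.0) - old_weights.get(t, 0.0)) for t in tickers
    )
    return turnover * BASE_COST * stress_multiplier(realized_vol)

full_rotation = rebalance_cost({"SPY": 1.0}, {"TLT": 1.0}, realized_vol=0.12)
partial_rotation = rebalance_cost({"SPY": 1.0}, {"SPY": 0.7, "TLT": 0.3}, realized_vol=0.12)
stressed_rotation = rebalance_cost({"SPY": 1.0}, {"TLT": 1.0}, realized_vol=0.64)
```

In calm conditions the full rotation costs 0.10% and the 30% rotation costs 0.03%, matching the examples above. With realized volatility at four times the baseline, the multiplier hits its cap and the full rotation costs 0.30%.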
4. Survivorship & Selection Bias
- Source: All strategies are sourced from published academic papers, practitioner research, or well-documented public methodologies.
- Faithful implementation: We implement each strategy as described by its original author(s). No proprietary optimization or curve-fitting is applied on top of published rules.
- Selection transparency: We deliberately select well-known, publicly documented strategies rather than cherry-picking private or unpublished ones. This introduces a form of selection bias — we are showing strategies that gained attention, which may correlate with strong historical performance.
5. Synthetic / Proxy Data (Fallback Chain)
Many ETFs have limited history. To produce longer backtests, we use proxy data for periods before an ETF existed:
- Fallback chain: Each ticker has a documented hierarchy of proxies. When the primary ETF has no data for a given date, the engine walks down the chain until it finds a valid source. Price continuity is maintained by scaling at each join point.
- Transparency: The full fallback chain for every ticker in a strategy is visible on the strategy detail page under the “Price Build Log” section.
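The splice at each join point can be sketched as below. This is one plausible way to maintain price continuity, not necessarily the engine's exact method: it assumes the proxy has a price on the first date of the newer series and rescales the proxy's earlier history so the two agree on that date. The function names and prices are hypothetical.

```python
def splice(primary, proxy):
    """Extend `primary` backward with `proxy`, scaled to match at the join.

    Both arguments are dicts mapping a sortable date key to a price.
    """
    join = min(primary)  # first date on which the primary has data
    if join not in proxy:
        # ASSUMPTION: the two series overlap on the join date.
        raise ValueError("proxy must have a price on the join date")
    scale = primary[join] / proxy[join]
    # Scaling every proxy price by one constant leaves its returns unchanged.
    combined = {d: p * scale for d, p in proxy.items() if d < join}
    combined.update(primary)
    return combined

def build_series(chain):
    """Walk a chain ordered from the primary ETF to the oldest proxy."""
    series = dict(chain[0])
    for proxy in chain[1:]:
        series = splice(series, proxy)
    return series

# Hypothetical prices, for illustration only.
etf = {"2000-03": 50.0, "2000-04": 55.0}
fund = {"2000-01": 10.0, "2000-02": 11.0, "2000-03": 12.5}
history = build_series([etf, fund])
```

Here the proxy is multiplied by 4 (50.0 / 12.5), so its 10% gain from January to February is preserved in the combined series while the price level lines up with the ETF at the join.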
Key substitution chains (100+ tickers covered):
- S&P 500: SPY (1993) → VFINX (1980)
- Nasdaq 100: QQQ (1999) → ^NDX index (1985)
- US Small Cap: IJR (2000) → NAESX (1990); IWM (2000) → ^RUT index (1987)
- US Mid Cap: IJH (2000) → VIMSX (1998) → ^MID index (1981)
- US REITs: VNQ (2004) → VGSIX (1996) → FRESX (1986)
- Int'l Developed: EFA (2001) → VGTSX (1996) → PRTIX (1989)
- Emerging Markets: EEM (2003) → VEIEX (1994) → FEMKX (1990)
- Long Treasuries: TLT (2002) → VUSTX (1986)
- Intermediate Treasuries: IEF (2002) → VFITX (1991)
- Short Treasuries: SHY (2002) → VFISX (1991)
- T-Bills: BIL (2007) → ^IRX yield (1970)
- TIPS: TIP (2003) → VIPSX (2000) → PRTNX (1997) → VFITX (1991)
- Gold: GLD (2004) → GC=F futures (2000) → London fixing (1968)
- Commodities: DBC (2006) → PCRIX (2002) → ^SPGSCI (1990)
- Aggregate Bonds: AGG (2003) → VBMFX (1986)
- High Yield: HYG (2007) → VWEHX (1980)
- Managed Futures: KMLM (2020) → RYMFX (2007) → AQR TSMOM (1985)
- Leveraged ETFs: Synthetic daily-leveraged returns from underlying, with expense ratio and borrowing cost deductions
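For the synthetic leveraged series, a common way to model a daily-reset leveraged fund is sketched below. The page does not give the exact formula, so this is an assumption: the underlying's daily return is multiplied by the leverage factor, then a daily share of the annual expense ratio and of the borrowing cost on the borrowed portion (leverage minus one) is subtracted. The function name, the 252-day year, and the sample rates are hypothetical.

```python
def leveraged_daily_return(underlying_return, leverage,
                           expense_ratio, borrow_rate, trading_days=252):
    """One day's return of a synthetic daily-reset leveraged fund."""
    daily_expense = expense_ratio / trading_days
    # ASSUMPTION: financing is charged only on the borrowed portion.
    daily_borrow = (leverage - 1.0) * borrow_rate / trading_days
    return leverage * underlying_return - daily_expense - daily_borrow

def compound(daily_returns, start_nav=100.0):
    """Compound a list of daily returns into an ending NAV."""
    nav = start_nav
    for r in daily_returns:
        nav *= 1.0 + r
    return nav

# Hypothetical inputs: +1% day, 2x leverage, 0.252% fee, 5.04% borrow rate.
gross = leveraged_daily_return(0.01, 2.0, 0.0, 0.0)
net = leveraged_daily_return(0.01, 2.0, 0.00252, 0.0504)
# Two offsetting days in the underlying (+10%, then -10%) at 2x, no costs.
path_nav = compound([leveraged_daily_return(r, 2.0, 0.0, 0.0) for r in (0.10, -0.10)])
```

The last line illustrates why daily compounding matters: the underlying ends 1% down after the two days, but the 2× series ends 4% down (100 × 1.2 × 0.8 = 96), not 2% down.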
6. Walk-Forward Validation
- Availability: Offered for strategies whose rules include tunable parameters.
- Method: A rolling window approach — train on N months of history, test on the next M months, then slide the window forward and repeat.
- Purpose: Guards against overfitting to in-sample data by checking that a strategy’s edge persists out-of-sample.
- Identification: Strategies with walk-forward validation results are marked accordingly on their detail pages.
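The rolling-window scheme can be sketched as a generator of train and test index ranges. This is an illustrative sketch, assuming the window slides forward by the test length M so that test periods do not overlap; the page does not state the step size. The function name is hypothetical.

```python
def walk_forward_windows(n_months, train, test):
    """Return (train_range, test_range) pairs of month indices.

    ASSUMPTION: the window advances by `test` months each step.
    """
    windows = []
    start = 0
    while start + train + test <= n_months:
        train_range = range(start, start + train)
        test_range = range(start + train, start + train + test)
        windows.append((train_range, test_range))
        start += test
    return windows

# 10 months of history, train on 4, test on the next 2.
windows = walk_forward_windows(n_months=10, train=4, test=2)
for train_range, test_range in windows:
    print(list(train_range), "->", list(test_range))
```

Parameters are fitted only on each training range and then evaluated on the test range that follows it, so every test month is scored by parameters chosen without seeing it.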
7. Metrics Computation
All metrics are computed from the monthly NAV series unless otherwise noted.
| Metric | Definition |
|---|---|
| CAGR | Compound Annual Growth Rate. Annualized geometric return derived from the full NAV series. |
| Sharpe Ratio | (Mean monthly excess return × 12) / (Std of monthly returns × √12), which simplifies to (mean / std) × √12. Measures risk-adjusted return per unit of total volatility. |
| Sortino Ratio | Same as Sharpe but substitutes downside deviation (volatility of negative returns only) for standard deviation. Penalizes downside risk without penalizing upside volatility. |
| Max Drawdown | Largest peak-to-trough decline in NAV over the full backtest period. Expressed as a percentage. |
| SWR | Safe Withdrawal Rate. The maximum constant, inflation-adjusted annual withdrawal rate that survives every rolling 30-year window in the backtest. |
| PWR | Perpetual Withdrawal Rate. The maximum withdrawal rate that preserves real (inflation-adjusted) capital over every rolling 30-year window. |
| UPI | Ulcer Performance Index. CAGR divided by the Ulcer Index (a measure of drawdown depth and duration). Higher values indicate better risk-adjusted performance with emphasis on drawdown pain. |
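Several of the definitions in the table can be sketched in plain Python. This is a minimal illustration, not the production code. It assumes a monthly NAV list, a constant monthly risk-free rate, the sample standard deviation, downside deviation measured as the root mean square of negative excess returns, and the Ulcer Index as the root mean square of drawdowns. All function names and the sample NAV are hypothetical.

```python
import math
import statistics

def monthly_returns(nav):
    return [nav[i] / nav[i - 1] - 1.0 for i in range(1, len(nav))]

def cagr(nav, periods_per_year=12):
    years = (len(nav) - 1) / periods_per_year
    return (nav[-1] / nav[0]) ** (1.0 / years) - 1.0

def sharpe(returns, monthly_rf=0.0):
    excess = [r - monthly_rf for r in returns]
    # (mean x 12) / (std x sqrt(12)) simplifies to (mean / std) x sqrt(12).
    return statistics.mean(excess) / statistics.stdev(returns) * math.sqrt(12)

def sortino(returns, monthly_rf=0.0):
    excess = [r - monthly_rf for r in returns]
    # ASSUMPTION: downside deviation = RMS of negative excess returns.
    downside = math.sqrt(sum(min(0.0, r) ** 2 for r in excess) / len(excess))
    if downside == 0.0:
        return float("inf")  # no losing months in the sample
    return statistics.mean(excess) / downside * math.sqrt(12)

def drawdowns(nav):
    """Fractional decline from the running peak at each point."""
    peak, out = nav[0], []
    for value in nav:
        peak = max(peak, value)
        out.append(value / peak - 1.0)
    return out

def max_drawdown(nav):
    return min(drawdowns(nav))

def ulcer_index(nav):
    dd = drawdowns(nav)
    return math.sqrt(sum(d * d for d in dd) / len(dd))

def upi(nav):
    # Follows the table's definition: CAGR divided by the Ulcer Index.
    return cagr(nav) / ulcer_index(nav)

# Hypothetical monthly NAV series, for illustration only.
nav = [100.0, 110.0, 99.0, 108.9, 121.0]
rets = monthly_returns(nav)
```

On this short series the maximum drawdown is about -10% (110 down to 99). The CAGR figure is inflated by annualizing only four months of data, which is one reason such metrics are meaningful only over long backtests.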
8. Limitations & Disclaimers
- Hypothetical results: All backtested performance is hypothetical. It does not represent actual trading and was not achieved with real capital.
- No guarantee: Past performance — whether backtested or live — does not guarantee future results.
- Remaining frictions: While transaction costs and stress-adjusted slippage are modeled (see Section 3), taxes, margin costs, and behavioral factors (panic selling, delayed rebalancing) are not captured. Fund expense ratios enter only through ETF prices and are not deducted separately.
- Execution assumption: Monthly rebalancing assumes execution at the closing price on the last trading day. Intraday or next-day execution will produce different results.
- No universal winner: No strategy is guaranteed to outperform a simple buy-and-hold approach in all market environments. Tactical allocation involves active decisions that can underperform passive benchmarks for extended periods.
Questions about our methodology? Check the FAQ or reach out at [email protected].