Asset-Class Research Index

Multi-AI consensus research runs that produce sourced, backtested, and cross-tested strategy candidates per asset class. See tools/research/orchestrator.py and updates/2026-05-11-research-orchestrator-design.md for the 5-pass protocol.
Index generated: May 30, 2026 3:06 AM EDT  •  All run timestamps shown below are in US Eastern Time (EST/EDT); the run_* directory IDs are in UTC.
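
Because the run_* directory IDs are UTC and the displayed timestamps are Eastern, the two can be reconciled with a small conversion. The following is a minimal sketch, assuming the ID layout seen in this index (run_YYYY-MM-DDTHH-MM-SSZ); the function names run_id_to_eastern and format_like_index are hypothetical and not taken from the orchestrator.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs system tz data

EASTERN = ZoneInfo("America/New_York")

def run_id_to_eastern(run_id: str) -> datetime:
    """Parse a UTC run ID such as 'run_2026-05-30T07-05-27Z' into Eastern Time."""
    stamp = run_id.removeprefix("run_")
    utc = datetime.strptime(stamp, "%Y-%m-%dT%H-%M-%SZ").replace(tzinfo=timezone.utc)
    return utc.astimezone(EASTERN)

def format_like_index(dt: datetime) -> str:
    """Render in the style used on this page (month abbreviation is an assumption)."""
    hour12 = dt.hour % 12 or 12
    ampm = "AM" if dt.hour < 12 else "PM"
    return f"{dt.strftime('%b')} {dt.day}, {dt.year} {hour12}:{dt.minute:02d} {ampm} {dt.tzname()}"

print(format_like_index(run_id_to_eastern("run_2026-05-30T07-05-27Z")))
```

Using a named zone rather than a fixed UTC-4 offset keeps the EST/EDT switch correct for runs outside daylight saving time.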

How to read this page

Each run is one full 5-pass orchestrator cycle for one asset class: P1 literature → P2 candidate distillation → P3 backtest → P4 cross-test on out-of-sample fold → P5 synthesis. Click view → to drill into the per-pass artifacts (p1_literature.json, p2_candidates.json, …, p5_synthesis.md).
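
As a rough illustration of the pass ordering and per-pass artifacts described above, the sketch below lists the five passes and checks a run directory for the artifact files this page names explicitly. Only p1_literature.json, p2_candidates.json, and p5_synthesis.md are taken from the text; the P3 and P4 file names are elided on this page and are deliberately left out. The function name and directory layout are assumptions, not the orchestrator's actual interface.

```python
from pathlib import Path

# Pass order as described on this page.
PASSES = [
    ("P1", "literature"),
    ("P2", "candidate distillation"),
    ("P3", "backtest"),
    ("P4", "cross-test on out-of-sample fold"),
    ("P5", "synthesis"),
]

# Only the artifact names spelled out on this page; P3/P4 names are not given.
KNOWN_ARTIFACTS = {
    "P1": "p1_literature.json",
    "P2": "p2_candidates.json",
    "P5": "p5_synthesis.md",
}

def missing_known_artifacts(run_dir: Path) -> list[str]:
    """Return the known artifact file names that are absent from run_dir."""
    return [name for name in KNOWN_ARTIFACTS.values() if not (run_dir / name).is_file()]
```

A run whose P5 artifact is missing has not finished the 5-pass cycle.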

Columns: Candidates = strategies P2 produced; Backtested = passed structural sanity to enter P3; Independent = survived P4 cross-test on a held-out fold; Citations verified/total = how many P1 citations resolved to a real URL; Cost = LLM spend for that run.

Verdict legend:
  • GO: at least one candidate survived all 5 passes with Tier-2 floor stats (PF≥1.5, WR≥50%, n≥100).
  • MIXED: survivors exist but fail the Tier-2 floor on at least one axis; needs human triage.
  • NO_EDGE: no candidate cleared cross-test.
  • PENDING: in flight.
  • ABORTED: crashed before completing the 5 passes.
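
The legend can be read as a small decision rule. The sketch below is one possible reading, not the orchestrator's actual logic: it computes profit factor (gross profit divided by gross loss) and win rate from a list of per-trade P&L values, then applies the Tier-2 floor from the legend. The class and function names are hypothetical, PENDING and ABORTED are run states and are not modeled, and the 20% MDD limit mentioned in some lessons is omitted because the legend does not list it.

```python
import math
from dataclasses import dataclass

PF_FLOOR = 1.5       # Tier-2 floor values from the verdict legend
WR_FLOOR_PCT = 50.0
N_FLOOR = 100

@dataclass
class CandidateStats:
    profit_factor: float
    win_rate_pct: float
    n_trades: int
    survived_cross_test: bool  # hypothetical flag for the P4 outcome

def stats_from_pnl(pnl: list[float], survived_cross_test: bool) -> CandidateStats:
    """Derive PF, WR, and n from per-trade profit/loss values."""
    gross_profit = sum(p for p in pnl if p > 0)
    gross_loss = -sum(p for p in pnl if p < 0)
    if gross_loss > 0:
        pf = gross_profit / gross_loss
    else:
        pf = math.inf if gross_profit > 0 else 0.0
    wins = sum(1 for p in pnl if p > 0)
    wr = 100.0 * wins / len(pnl) if pnl else 0.0
    return CandidateStats(pf, wr, len(pnl), survived_cross_test)

def meets_tier2_floor(c: CandidateStats) -> bool:
    return (c.profit_factor >= PF_FLOOR
            and c.win_rate_pct >= WR_FLOOR_PCT
            and c.n_trades >= N_FLOOR)

def verdict(candidates: list[CandidateStats]) -> str:
    """GO / MIXED / NO_EDGE for a completed run, per the legend."""
    survivors = [c for c in candidates if c.survived_cross_test]
    if not survivors:
        return "NO_EDGE"
    if any(meets_tier2_floor(c) for c in survivors):
        return "GO"
    return "MIXED"
```

Under this reading, a surviving candidate with PF=1.51, WR=33.3%, n=9 (the bond example on this page) yields MIXED: it clears the PF floor but misses on both WR and n.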

bond

📚 Lessons Learned (Bond Research)
  • PF=0.0, WR=0.0%, n=5 - no profitability and too few trades; cross-test shows ind
  • PF=0.0, WR=0.0%, n=5 - no edge; simplified SMA proxy likely mis-represents inten
  • PF=0.0, WR=0.0%, n=5 - no statistical edge; regime filter and risk-parity sizing
  • PF=0.0, WR=0.0%, n=5 - flat performance; simplified signal likely mis-aligned wi
  • PF=1.51 meets Tier-2 PF but WR=33.3% and n=9 are far below requirements; cross-t

  • Key takeaway: Of 12 runs, 7 returned NO_EDGE and 5 returned MIXED. Bond strategies produce many theoretical candidates (14 in most runs) yet almost no clear edges, likely because bond ETFs have lower turnover and tighter spreads, which limits momentum/carry signal strength.
Run | Verdict | Candidates | Backtested | Independent | Citations verified/total | Cost | Page
run_2026-05-30T07-05-27Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-30T07-05-21Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-16T06-51-00Z (May 16, 2026 2:51 AM EDT) | NO_EDGE | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-16T06-50-48Z (May 16, 2026 2:50 AM EDT) | NO_EDGE | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-34-52Z (May 11, 2026 3:34 PM EDT) | NO_EDGE | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-33-00Z (May 11, 2026 3:33 PM EDT) | NO_EDGE | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-31-34Z (May 11, 2026 3:31 PM EDT) | NO_EDGE | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-30-55Z (May 11, 2026 3:30 PM EDT) | MIXED | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-01-40Z (May 11, 2026 3:01 PM EDT) | MIXED | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-11T18-59-11Z (May 11, 2026 2:59 PM EDT) | MIXED | 14 | 14 | 14 | 8 / 20 | $0.00 | view →
run_2026-05-11T18-56-26Z (May 11, 2026 2:56 PM EDT) | MIXED | 3 | 3 | 3 | 8 / 20 | $0.00 | view →
run_2026-05-11T17-46-50Z (May 11, 2026 1:46 PM EDT) | MIXED | 3 | 3 | 3 | 0 / 3 | $0.00 | view →
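
The per-class verdict counts quoted in each key takeaway can be recomputed from the run table. A minimal sketch follows, with the bond verdicts transcribed by hand from the table above (newest run first); the variable and function names are illustrative only.

```python
from collections import Counter

# Bond verdicts copied from the run table above, newest first.
bond_verdicts = ["NO_EDGE"] * 7 + ["MIXED"] * 5

def summarize(verdicts: list[str]) -> str:
    """Return a 'VERDICT: count' summary, most common verdict first."""
    counts = Counter(verdicts)
    parts = [f"{name}: {n}" for name, n in counts.most_common()]
    return f"{len(verdicts)} runs ({', '.join(parts)})"

print(summarize(bond_verdicts))
```

Listing the most common verdict first makes it clear which outcome dominated.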

commodity

📚 Lessons Learned (Commodity Research)
  • PF=10.68 but only 4 trades
  • PF=10.68 with n=4 trades; trade count far below Tier-2 floor, making the backtes
  • PF=10.68 but only 4 trades; n too low for robust inference despite attractive me
  • PF=2.25, WR=58.3% but only 12 trades; n<100 violates Tier-2 floor, so edge is no
  • PF=2.66, WR=64.7% yet only 17 trades; insufficient history to validate performan

  • Key takeaway: Of 7 runs, 6 returned NO_EDGE and 1 returned MIXED. Commodity research generated diverse candidates (10 per run), but none met the Tier-2 floor. The driver is very limited trading history: most strategies see only 4 to 17 trades, far below the Tier-2 minimum of n≥100.
Run | Verdict | Candidates | Backtested | Independent | Citations verified/total | Cost | Page
run_2026-05-30T07-06-24Z (May 30, 2026 3:06 AM EDT) | NO_EDGE | 10 | 10 | 10 | 6 / 20 | $0.00 | view →
run_2026-05-30T07-06-21Z (May 30, 2026 3:06 AM EDT) | NO_EDGE | 10 | 10 | 10 | 6 / 20 | $0.00 | view →
run_2026-05-16T06-52-25Z (May 16, 2026 2:52 AM EDT) | NO_EDGE | 10 | 10 | 10 | 6 / 20 | $0.00 | view →
run_2026-05-16T06-52-19Z (May 16, 2026 2:52 AM EDT) | NO_EDGE | 10 | 10 | 10 | 6 / 20 | $0.00 | view →
run_2026-05-11T19-35-10Z (May 11, 2026 3:35 PM EDT) | NO_EDGE | 10 | 10 | 10 | 6 / 20 | $0.00 | view →
run_2026-05-11T19-32-58Z (May 11, 2026 3:32 PM EDT) | NO_EDGE | 10 | 10 | 10 | 6 / 20 | $0.00 | view →
run_2026-05-11T19-18-00Z (May 11, 2026 3:18 PM EDT) | MIXED | 10 | 10 | 10 | 6 / 20 | $0.00 | view →
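
To make the small-sample concern in the commodity lessons concrete, the sketch below computes a Wilson score interval for a win rate. This is a standard statistical tool chosen here for illustration; nothing on this page indicates the orchestrator uses it. The example assumes WR=58.3% over 12 trades corresponds to 7 wins, which is an inference from the rounded figure.

```python
import math

def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Assumed: 7 wins out of 12 trades (consistent with WR=58.3%, n=12).
low, high = wilson_interval(7, 12)
print(f"WR 95% interval: {low:.1%} to {high:.1%}")
```

With 12 trades the interval runs from roughly 32% to 81%, so the data cannot distinguish a win rate above the 50% floor from one well below it.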

crypto

📚 Lessons Learned (Crypto Research)
  • Backtest shows PF=2.86 but only 5 trades, MDD=37.1% >20% and WR=60% with a simpl
  • Same metrics as other candidates
  • PF looks attractive but only 5 trades, MDD exceeds Tier-2 limit and signal trans
  • Metrics identical to others with n=5 and MDD=37.1%; trade count too low for reli
  • Backtest PF=2.86 based on only 5 trades; high drawdown and simplified signal mea

  • Key takeaway: Of 7 runs, 6 returned NO_EDGE and 1 returned MIXED. Crypto strategies consistently hit the trade-count wall (n < 30); higher volatility means more opportunities but also deeper drawdowns that breach the 20% MDD limit.
Run | Verdict | Candidates | Backtested | Independent | Citations verified/total | Cost | Page
run_2026-05-30T07-05-59Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 16 | 16 | 16 | 15 / 20 | $0.00 | view →
run_2026-05-30T07-05-53Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 16 | 16 | 16 | 15 / 20 | $0.00 | view →
run_2026-05-16T06-51-46Z (May 16, 2026 2:51 AM EDT) | NO_EDGE | 16 | 16 | 16 | 15 / 20 | $0.00 | view →
run_2026-05-16T06-51-37Z (May 16, 2026 2:51 AM EDT) | NO_EDGE | 16 | 16 | 16 | 15 / 20 | $0.00 | view →
run_2026-05-11T19-35-02Z (May 11, 2026 3:35 PM EDT) | NO_EDGE | 16 | 16 | 16 | 15 / 20 | $0.00 | view →
run_2026-05-11T19-32-51Z (May 11, 2026 3:32 PM EDT) | NO_EDGE | 16 | 16 | 16 | 15 / 20 | $0.00 | view →
run_2026-05-11T19-17-50Z (May 11, 2026 3:17 PM EDT) | MIXED | 16 | 16 | 16 | 15 / 20 | $0.00 | view →

equity

📚 Lessons Learned (Equity Research)
  • Backtest shows very high PF
  • Identical performance to the momentum ETF

  • Key takeaway: Of 7 runs, 6 returned NO_EDGE and 1 returned MIXED. Equity research shows strong cross-test independence (ρ ≈ 0.95), which suggests the ideas hold out-of-sample, but small sample sizes remain the primary blocker to confident deployment.
Run | Verdict | Candidates | Backtested | Independent | Citations verified/total | Cost | Page
run_2026-05-30T07-05-48Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 5 | 5 | 5 | 14 / 30 | $0.00 | view →
run_2026-05-30T07-05-44Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 5 | 5 | 5 | 14 / 30 | $0.00 | view →
run_2026-05-16T06-51-31Z (May 16, 2026 2:51 AM EDT) | NO_EDGE | 5 | 5 | 5 | 14 / 30 | $0.00 | view →
run_2026-05-16T06-51-22Z (May 16, 2026 2:51 AM EDT) | NO_EDGE | 5 | 5 | 5 | 14 / 30 | $0.00 | view →
run_2026-05-11T19-34-59Z (May 11, 2026 3:34 PM EDT) | NO_EDGE | 5 | 5 | 5 | 14 / 30 | $0.00 | view →
run_2026-05-11T19-32-48Z (May 11, 2026 3:32 PM EDT) | NO_EDGE | 5 | 5 | 5 | 14 / 30 | $0.00 | view →
run_2026-05-11T19-17-46Z (May 11, 2026 3:17 PM EDT) | MIXED | 5 | 5 | 5 | 14 / 30 | $0.00 | view →

etf

📚 Lessons Learned (ETF Research)
  • Backtest shows PF=77.12, WR=80%, MDD=17% - attractive but only 5 trades
  • PF=1.97, WR=61.5%, MDD=19.7% meets the floor on PF/WR/MDD but only 13 trades; st
  • Identical backtest stats to the carry-yield model
  • Backtest shows PF=77.12, WR=80%, MDD=17% but only 5 trades; the strategy's simpl

  • Key takeaway: All 7 runs returned MIXED. Some ETF carry/value ideas survived initial screening, but all lacked the trade count needed to validate robustness.
Run | Verdict | Candidates | Backtested | Independent | Citations verified/total | Cost | Page
run_2026-05-30T07-06-06Z (May 30, 2026 3:06 AM EDT) | MIXED | 5 | 5 | 5 | 6 / 20 | $0.00 | view →
run_2026-05-30T07-06-03Z (May 30, 2026 3:06 AM EDT) | MIXED | 5 | 5 | 5 | 6 / 20 | $0.00 | view →
run_2026-05-16T06-51-59Z (May 16, 2026 2:51 AM EDT) | MIXED | 5 | 5 | 5 | 6 / 20 | $0.00 | view →
run_2026-05-16T06-51-54Z (May 16, 2026 2:51 AM EDT) | MIXED | 5 | 5 | 5 | 6 / 20 | $0.00 | view →
run_2026-05-11T19-35-05Z (May 11, 2026 3:35 PM EDT) | MIXED | 5 | 5 | 5 | 6 / 20 | $0.00 | view →
run_2026-05-11T19-32-53Z (May 11, 2026 3:32 PM EDT) | MIXED | 5 | 5 | 5 | 6 / 20 | $0.00 | view →
run_2026-05-11T19-17-53Z (May 11, 2026 3:17 PM EDT) | MIXED | 5 | 5 | 5 | 6 / 20 | $0.00 | view →

forex

📚 Lessons Learned (Forex Research)
  • Backtest PF=1.61 and WR=56.7% look promising, but only 30 trades were observed a
  • PF=0.7, WR=41.7% and Sharpe negative already fail Tier-2; with just 12 trades an
  • Identical performance to the previous FXA candidate
  • PF=1.61 and WR=56.7% meet the PF/WR thresholds but only 30 trades were recorded;
  • While PF=1.61, WR=56.7% look decent, the strategy has only 30 trades and the ent

  • Key takeaway: Of 7 runs, 6 returned NO_EDGE and 1 returned MIXED. Forex strategies consistently produced cross-test independent signals yet fell short of the Tier-2 floor, suggesting the pair-selection logic works in principle but needs stronger directional filters.
Run | Verdict | Candidates | Backtested | Independent | Citations verified/total | Cost | Page
run_2026-05-30T07-05-39Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 5 | 5 | 5 | 8 / 20 | $0.00 | view →
run_2026-05-30T07-05-34Z (May 30, 2026 3:05 AM EDT) | NO_EDGE | 5 | 5 | 5 | 8 / 20 | $0.00 | view →
run_2026-05-16T06-51-15Z (May 16, 2026 2:51 AM EDT) | NO_EDGE | 5 | 5 | 5 | 8 / 20 | $0.00 | view →
run_2026-05-16T06-51-10Z (May 16, 2026 2:51 AM EDT) | NO_EDGE | 5 | 5 | 5 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-34-55Z (May 11, 2026 3:34 PM EDT) | NO_EDGE | 5 | 5 | 5 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-32-46Z (May 11, 2026 3:32 PM EDT) | NO_EDGE | 5 | 5 | 5 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-14-54Z (May 11, 2026 3:14 PM EDT) | MIXED | 5 | 5 | 5 | 8 / 20 | $0.00 | view →

futures

📚 Lessons Learned (Futures Research)
  • PF 3.5 meets the floor but WR 40% < 50%, MDD 33.5% > 20% and only 10 trades were
  • PF 3.5 passes, yet WR 40% and MDD 33.5% violate Tier-2 thresholds and the backte
  • Despite PF 3.5, WR 40% and MDD 33.5% fall short, and the sample size is only 10
  • PF 3.5 is adequate but WR 40% and MDD 33.5% breach Tier-2 limits; only 10 trades
  • PF 3.5 meets the floor, yet WR 40% and MDD 33.5% are outside acceptable ranges;

  • Key takeaway: Of 7 runs, 6 returned NO_EDGE and 1 returned MIXED. Futures research was dominated by NO_EDGE verdicts; commodity futures likely require specific regime awareness (contango/backwardation) that simple trend-following proxies miss entirely.
Run | Verdict | Candidates | Backtested | Independent | Citations verified/total | Cost | Page
run_2026-05-30T07-06-16Z (May 30, 2026 3:06 AM EDT) | NO_EDGE | 10 | 10 | 10 | 8 / 20 | $0.00 | view →
run_2026-05-30T07-06-11Z (May 30, 2026 3:06 AM EDT) | NO_EDGE | 10 | 10 | 10 | 8 / 20 | $0.00 | view →
run_2026-05-16T06-52-12Z (May 16, 2026 2:52 AM EDT) | NO_EDGE | 10 | 10 | 10 | 8 / 20 | $0.00 | view →
run_2026-05-16T06-52-05Z (May 16, 2026 2:52 AM EDT) | NO_EDGE | 10 | 10 | 10 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-35-07Z (May 11, 2026 3:35 PM EDT) | NO_EDGE | 10 | 10 | 10 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-32-55Z (May 11, 2026 3:32 PM EDT) | NO_EDGE | 10 | 10 | 10 | 8 / 20 | $0.00 | view →
run_2026-05-11T19-17-56Z (May 11, 2026 3:17 PM EDT) | MIXED | 10 | 10 | 10 | 8 / 20 | $0.00 | view →

Disclaimer: This is NOT financial advice. All trading signals, picks, scores, and analysis are for educational and research purposes only. Past performance does not guarantee future results. Trading cryptocurrencies involves substantial risk of loss. Always do your own research (DYOR) before making any investment decisions.