Generated: May 24, 2026 · 3 debate rounds · 10+ AI models · Data: tournament_picks (3,149 picks, 34 models)
Round 1: Initial Pick Debate (7 Models)
Models: Risk Manager, Portfolio Manager, Cerebras GPT-OSS-120B, DeepSeek V4, KiloCode, Kimi, Cursor
Consensus Top 5
| Rank | Pick | Direction | Entry | Models Agreeing |
| 1 | TLT (Bond) | LONG | $87.66 | 7/7 |
| 2 | SPY (ETF) | SHORT | $726.80 | 7/7 |
| 3 | GLD (ETF) | LONG | $257.93 | 7/7 |
| 4 | CL=F (Commodity) | SHORT | $68.25 | 5/7 |
| 5 | XOM (Equity) | SHORT | $116.91 | 5/7 |
Key Findings
- VETOED: MVST (0 data), MSFT#2 (RR<1), SHY (coin flip with costs)
- SYSTEMIC: "80% confidence epidemic" (n=23 cannot support 80% confidence claims)
- SYSTEMIC: Duplicate MSFT entries (no position dedup in pipeline)
- SYSTEMIC: No n-threshold gate (0-data personas generating live signals)
- FIX NEEDED: Confidence calibration, minimum n=50 gate, RR>=0.8 filter, duplicate detection
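The gates listed under FIX NEEDED can be sketched as one pre-filter. This is a minimal illustration, not the pipeline's actual code: the `Pick` class, the `gate` function, and the choice to key duplicates on (symbol, direction) are all assumptions. Only the thresholds (n of at least 50, RR of at least 0.8) come from the findings above. Confidence calibration is left out because the report does not specify how it would work.

```python
from dataclasses import dataclass

MIN_N = 50     # minimum resolved samples, from the "minimum n=50 gate" finding
MIN_RR = 0.8   # reward/risk floor, from the "RR>=0.8 filter" finding


@dataclass(frozen=True)
class Pick:
    """Hypothetical pick record; field names are illustrative."""
    symbol: str
    direction: str  # "LONG" or "SHORT"
    n: int          # resolved forward-test samples behind the win rate
    rr: float       # reward/risk ratio


def gate(picks):
    """Split picks into (accepted, rejected); rejected holds (pick, reason)."""
    accepted, rejected, seen = [], [], set()
    for p in picks:
        key = (p.symbol, p.direction)  # assumed definition of a duplicate
        if key in seen:
            rejected.append((p, "duplicate"))
            continue
        seen.add(key)
        if p.n < MIN_N:
            rejected.append((p, f"n={p.n} < {MIN_N}"))
        elif p.rr < MIN_RR:
            rejected.append((p, f"RR={p.rr} < {MIN_RR}"))
        else:
            accepted.append(p)
    return accepted, rejected
```

Under these assumptions, a second entry for the same symbol and direction is rejected as a duplicate, and a pick with n=0 never reaches the RR check.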
Swarm Prompts Used
# Pick Debate Prompt (fed to 7 engines via swarm_run.py + subagents)
You are one of 7 independent AI models evaluating the same curated pick list.
You MUST respond in your own voice. Do not defer to other models.
## The Picks (Forward-Tested Only, tournament_picks DB)
EQUITY: MSFT LONG 447.89 (WR 58%, n=164), XOM SHORT 116.91 (WR 58%, n=164)
CRYPTO: AVAXUSDT SHORT 22.83 (WR 65%, n=23), SOLUSDT LONG 157.39 (WR 65%, n=23)
ETF: SPY SHORT 726.80 (WR 65%, n=23), GLD LONG 257.93 (WR 65%, n=23)
COMMODITY: CL=F SHORT 68.25 (WR 65%, n=23), NG=F SHORT 3.90 (WR 65%, n=23)
BOND: TLT LONG 87.66 (WR 65%, n=23), SHY SHORT 80.27 (WR 58%, n=164)
PENNY: MVST LONG 1.80 (WR 0%, n=0)
TASK: Rank YOUR top 5 picks. Veto 1-2 picks. Flag systemic issues.
Model-Specific Insights
Risk Manager: "XOM SHORT is the most honest pick: 30% confidence on 58% WR, good RR 2.1. The system tells you it's uncertain. That's a feature."
Portfolio Manager: "Top 5 trim: TLT, SPY, GLD, CL=F, AVAXUSDT โ dropping the conflicting SOL, redundant XOM, seasonal NG, low-conviction SHY/MSFT."
Cerebras (gpt-oss-120b): "Absence of minimum-sample-size gate. A simple rule: require 50 observations for WR>60% would reduce false positives."
DeepSeek (v4-flash): "The n-threshold is applied inconsistently. Equity n=164 vs crypto n=23: not statistically equivalent."
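The sample-size concern raised by Cerebras and DeepSeek can be made concrete with a confidence interval on the win rate. The sketch below uses the Wilson score interval, which is one common choice and not something the report says the pipeline uses. The function name `wilson_interval` and the 95% level (z = 1.96) are assumptions.

```python
import math


def wilson_interval(win_rate, n, z=1.96):
    """Wilson score interval for a binomial win rate observed over n picks."""
    if n <= 0:
        raise ValueError("n must be positive; a 0-sample win rate has no interval")
    z2 = z * z
    denom = 1 + z2 / n
    center = (win_rate + z2 / (2 * n)) / denom
    half = z * math.sqrt(win_rate * (1 - win_rate) / n + z2 / (4 * n * n)) / denom
    return center - half, center + half


# Win rates and sample sizes quoted in the Round 1 pick list.
crypto_low, crypto_high = wilson_interval(0.65, 23)
equity_low, equity_high = wilson_interval(0.58, 164)
```

With these inputs the n=23 interval spans roughly 45% to 81%, while the n=164 interval spans roughly 50% to 65%. The 65% figure therefore cannot be distinguished from a coin flip at this confidence level, which is consistent with the models' objection.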
Round 2: Full Cross-Asset + IPO/Mutual Fund Gap Analysis
Models: Multi-Asset Allocator, Financial Data Architect
Expanded Pick List (All 9 Asset Classes)
| Class | #1 | #2 | #3 | Status |
| EQUITY | PG SHORT $167.37 | GOOGL LONG $186.63 | META LONG $620.71 | ACTIVE |
| CRYPTO | SOLUSDT LONG $157.39 | ETHUSDT SHORT $2150 | BTCUSDT SHORT $78,626 | ACTIVE |
| ETF | SPY SHORT $726.80 | GLD LONG $257.93 | XLE SHORT $91.74 | ACTIVE |
| COMMODITY | SI=F SHORT $34.93 | CL=F SHORT $68.25 | - | ACTIVE |
| BOND | TLT LONG $87.66 | BND LONG $71.47 | SHY SHORT $82.36 | ACTIVE |
| PENNY | MVST $1.80 | KULR $2.20 | QBTS $1.50 | 0 RESOLVED |
| FUTURES | ES=F LONG 5600 | GC=F LONG 2500 | CL=F SHORT 73 | 0 RESOLVED |
| FOREX | 🔴 BLOCKED: kill gate (57.3% WR, -0.39% avg PnL) | | | BLOCKED |
| IPO/Mutual Funds | ⏳ ZERO DATA: infrastructure not built | | | GAP |
Multi-Asset Allocator: Top 10 Composite Scores
| Rank | Pick | Score | WR | n | RR |
| 1 | SPY SHORT | 4.59 | 62.5% | 124 | 1.9 |
| 2 | TLT LONG | 4.35 | 62.5% | 124 | 1.8 |
| 3 | SHY SHORT | 3.86 | 62.5% | 124 | 1.6 |
| 4 | GLD LONG | 2.90 | 62.5% | 124 | 1.2 |
| 5 | WMT SHORT | 2.67 | 64.0% | 11 | 2.1 |
Score = Confidence × WR × RR × ln(n+1), normalized per class.
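The composite formula can be sketched as follows. The report does not say how "normalized per class" is computed, so dividing by the class maximum here is purely an assumption, and the published scores evidently use a different scale. The table also omits the Confidence input, so this sketch cannot reproduce the listed values. The names `raw_score` and `normalize_per_class` and the dict keys are hypothetical.

```python
import math
from collections import defaultdict


def raw_score(confidence, wr, rr, n):
    """Confidence x WR x RR x ln(n+1), before any per-class normalization."""
    return confidence * wr * rr * math.log(n + 1)


def normalize_per_class(rows):
    """Scale each row's 'raw' score by its class maximum (assumed scheme).

    rows: list of dicts with at least the keys 'cls' and 'raw'.
    Returns new dicts with an added 'score' key in the range 0..1.
    """
    best = defaultdict(float)
    for r in rows:
        best[r["cls"]] = max(best[r["cls"]], r["raw"])
    return [
        {**r, "score": r["raw"] / best[r["cls"]] if best[r["cls"]] > 0 else 0.0}
        for r in rows
    ]
```

One property of the formula is worth noting: ln(n+1) grows slowly, so a pick with n=11 keeps roughly half the sample-size weight of a pick with n=124 (ln 12 is about 2.48, ln 125 about 4.83). That may explain how WMT SHORT ranks fifth despite only 11 samples.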
IPO/Mutual Fund Gap Analysis
Financial Data Architect recommendations:
- IPO Stocks: SEC EDGAR (RSS feed for S-1 filings), Nasdaq IPO Calendar (RSS), StockAnalysis.com IPO page (scrapable). Build tools/ipo_scraper.py (~200 lines). Persona: ipo_momentum for post-IPO 30-day window.
- Mutual Funds: Track Fidelity Zero funds (FZROX, FZILX, FZIPX), Schwab (SWTSX, SWISX), Vanguard (VTSAX, VTIAX). Score by expense ratio (zero = best), Sharpe ratio, factor tilt. Persona: fund_quality_gate.
- Est. build time: 4-6 hours for IPO scraper + persona, 2-3 hours for mutual fund tracker.
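As a starting point for the proposed tools/ipo_scraper.py, the filtering step might look like the sketch below. It does no network access and assumes a plain RSS 2.0 layout (`item` elements with `title` and `link` children). A real feed may use Atom with namespaces and would need different tag names. The function name `s1_filings` and the sample feed, including its example.com links, are synthetic placeholders.

```python
import xml.etree.ElementTree as ET


def s1_filings(rss_xml):
    """Return title/link dicts for feed items whose title starts with 'S-1'.

    Amendments such as 'S-1/A' also match this prefix test.
    """
    root = ET.fromstring(rss_xml)
    filings = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if title.upper().startswith("S-1"):
            filings.append({"title": title, "link": link})
    return filings


# Synthetic feed for illustration only; not real filing data.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>S-1 - Example Corp</title><link>https://example.com/a</link></item>
<item><title>10-K - Other Inc</title><link>https://example.com/b</link></item>
<item><title>S-1/A - Example Corp</title><link>https://example.com/c</link></item>
</channel></rss>"""

sample_result = s1_filings(SAMPLE_FEED)
```

Keeping the parser separate from the fetch step makes it testable offline. Each new filing it returns could then seed the ipo_momentum persona's 30-day window.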
Not financial advice. Educational/research purposes only. Data: tournament_picks table (ejaguiar1_stocks).
AI Tournament · Curated Picks