🔬 Pick Funnel · /audit

Every pick scanned → score floor → trust gate → high-conviction → opened → closed → win/loss. Discovery only: green holdout cells are candidates, not capital-ready. Size positions only from money_ready_verdict.json (policy-clean).

What do these terms mean?
WR (shrunk)
Beta-shrunk Bayesian win rate — instead of raw wins/n (which makes 5-of-7 look like a 71% edge), we compute (wins + 0.5*20) / (n + 20). This pulls small-n cells toward a 50% prior so noise cells don't look like real edges. Source: tools/audit_pick_funnel/top_edges.py::bayes_wr().
PF
Profit Factor = sum(positive pnls) / abs(sum(negative pnls)). 1.0 = break-even. ≥1.5 = solid edge. ≥3.0 = elite (always sanity-check n and leakage). Cell PF<1 means losers outweighed winners even if WR was high.
n
Closed-trade sample size in the cell (WIN+LOSS only — flats/expired excluded). PROVEN tier requires n≥20.
fam
Strategy family — bucketed from the strategy column by keyword: scalp, breakout, momentum, mean_reversion, trend, vol, consensus. cftc = strategies with "cftc" in the name (Commodity Futures Trading Commission positioning signals — commercial-hedger COT data). cot = sibling COT-reading strategies under different naming.
conf band
Model confidence at pick time, bucketed into fixed bands. C0.60-0.70 = model confidence between 0.60 and 0.70. Bands above: C0.70-0.75, C0.75-0.80, C0.80-0.85, C0.85-0.90, C>=0.90.
rr band
Reward-to-risk ratio = (tp-entry)/(entry-sl) for LONG, (entry-tp)/(sl-entry) for SHORT. RR1.0-1.5 = a trade whose target is 1.0-1.5× its risk. Bands: RR<1.0, RR1.0-1.5, RR1.5-2.0, RR>=2.0.
avg PnL
Arithmetic mean of per-trade pnl_pct across the cell. Small avg PnL + big PF is normal and not suspicious — see the COMMODITY cftc example: 70% WR with ~0.05% wins vs ~0.04% losses gives PF≈3.3 on a 0.026% mean — high-frequency scalp profile, not "fake edge".
trust
Strategy track-record tier: PROVEN (≥80), DEVELOPING (60-79), WATCH (40-59), SANDBOX (20-39), PROBATION (<20), UNK (no prior).
cell
A pick "edge cell" is a 3-tag intersection (e.g. conf × rr × fam) over 90d closed picks. PROVEN cells (WR_shrunk≥55% & PF≥1.5 & n≥20) keep getting capital; "good WR / bad PF" cells win often but lose big.
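The glossary formulas above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the contents of top_edges.py: only the name bayes_wr() comes from the page, while profit_factor(), reward_to_risk(), classify_cell(), the label strings, the PF < 1.0 cutoff for "good WR / bad PF", and the choice to return infinity when a cell has no losses are all hypothetical.

```python
from typing import Sequence

PRIOR_MEAN = 0.5  # prior win rate from the glossary
PRIOR_N = 20      # prior pseudo-count from the glossary


def bayes_wr(wins: int, n: int) -> float:
    """Shrunk win rate: (wins + 0.5*20) / (n + 20)."""
    return (wins + PRIOR_MEAN * PRIOR_N) / (n + PRIOR_N)


def profit_factor(pnls: Sequence[float]) -> float:
    """sum(positive pnls) / abs(sum(negative pnls)).

    Returning inf when there are no losing trades is an assumption of this sketch.
    """
    gains = sum(p for p in pnls if p > 0)
    losses = abs(sum(p for p in pnls if p < 0))
    return gains / losses if losses > 0 else float("inf")


def reward_to_risk(entry: float, tp: float, sl: float, direction: str) -> float:
    """(tp-entry)/(entry-sl) for LONG; both differences flipped for SHORT."""
    if direction == "LONG":
        return (tp - entry) / (entry - sl)
    return (entry - tp) / (sl - entry)


def classify_cell(wins: int, n: int, pf: float) -> str:
    """PROVEN = WR_shrunk >= 55% and PF >= 1.5 and n >= 20.

    The GOOD_WR_BAD_PF cutoff (PF < 1.0) and all label strings are assumptions.
    """
    wr = bayes_wr(wins, n)
    if wr >= 0.55 and pf >= 1.5 and n >= 20:
        return "PROVEN"
    if wr >= 0.55 and pf < 1.0:
        return "GOOD_WR_BAD_PF"
    return "OTHER"


print(round(bayes_wr(5, 7), 3))  # raw 5/7 is 0.714; the shrunk value is 15/27
print(classify_cell(5, 7, 3.0))  # n < 20, so the cell cannot be PROVEN
```

The 5-of-7 cell shows why the n≥20 floor matters: its shrunk WR still sits just above 55%, so without the sample-size check a high PF alone would promote it.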
⚠ DISPUTED — Smart Picks · CRYPTO row (n=337, WR=78.9%, PF=9.69)
Independent verification on 2026-05-25 against ejaguiar1_stocks.at_raw_picks (90d) shows the raw DB-side CRYPTO closed-decisive WR is 39.4% (n=2001, PF=0.371), which directly contradicts the dashboard cohort. The 337-row cohort behind the Smart Picks CRYPTO row is dominated by a single source/strategy (claude_gainer_st = 309/337 = 91.7%; st_fear_greed_contrarian = 236/337 = 70.0%) and counts 97 EXPIRED picks at 63.9% WR. EXPIRED picks should not show ~64% WR; this looks like intraday drift being labelled WON. Holdout PF=71.16 is implausible. The Verified Alpha (n=304, WR=83.2%), High-Conviction (n=231, WR=87.9%), and ELITE-display-tier (n=155, WR=87.7%) CRYPTO rows below share the same cohort and should be treated as DISPUTED until the EXPIRED→WON labelling logic is fixed and the single-source concentration is mitigated. See reports/2026-05-25_crypto_78pct_wr_verification.md.
Resolution status: Two original complaints are obsolete — claude_gainer_st concentration dropped from 91.7% → 0% (source no longer emits decisive CRYPTO picks), and EXPIRED→WON mislabeling dropped from 63.9% → 0.1% (fix landed). Raw CRYPTO 90d WR recovered to ~42% with n=7000+ (from n=2001 at time of original dispute). The historical 337-row Smart Picks cohort (WR=78.9%) remains untracked by raw DB. Use the Smart Picks DB Ground Truth table below for current honest metrics.

Smart Picks — DB Ground Truth (90d) DB direct

Direct query against ejaguiar1_stocks.picks with the Smart Picks gate (elite_score ≥ per-class floor + confidence ≥ 0.60). Decisive only (WON+LOST). Contrast with the Navigation-surface Edge Matrix below: if the matrix shows 78.9% WR for CRYPTO Smart Picks but this table shows 0 picks, the matrix figure is an artifact of EXPIRED picks mislabeled as WON and of single-source concentration.

Loading smart_picks_db_stats from pick_funnel_90d.json…

Last 48 Hours — Per Asset Class [48h]

What ACTUALLY happened on the live `at_raw_picks` table in the last 48 hours — per asset class rollup + the per-pick detail rows. This is the cohort to trust most: zero curation lag, zero noise from the historical pool. If a class shows INSUFFICIENT n here while the all-time row shows 90% WR, the all-time row is a relic.

Loading pick_summary_stats_48h.json…

Navigation-surface Edge Matrix

For each clickable filter / tab on /audit (Verified Alpha, Smart Picks, Money Ready, High-Conviction, the US-Equity sub-tabs, ELITE display tier) we compute — per asset class on the 90d closed pool — n, raw WR%, Bayesian-shrunk WR%, PF, the train/holdout PF on a chrono 60/40 split, whether the cell passes a Bonferroni-corrected significance test (α=0.05/8 surfaces), and an auto-generated why-no-edge string (small_n / overfit / concentration / source_quality / TIME_EXIT_inflation). Verdict at the top of each surface = edge if at least one asset-class row passes both holdout (PF≥1.2) and Bonferroni, otherwise no-edge.
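A rough sketch of the per-surface verdict described above, assuming per-trade pnl values in chronological order. The page does not say which significance test is used, so the one-sided exact binomial test against a 50% win rate is an assumption, as are all function names and the rule that a win is any pnl > 0.

```python
from math import comb
from typing import Dict, List, Sequence, Tuple

N_SURFACES = 8
ALPHA = 0.05 / N_SURFACES  # Bonferroni-corrected threshold
HOLDOUT_PF_MIN = 1.2       # holdout PF floor from the text


def chrono_split(pnls: Sequence[float], train_pct: int = 60) -> Tuple[list, list]:
    """Chronological split: first 60% train, last 40% holdout (no shuffling)."""
    cut = len(pnls) * train_pct // 100
    return list(pnls[:cut]), list(pnls[cut:])


def profit_factor(pnls: Sequence[float]) -> float:
    gains = sum(p for p in pnls if p > 0)
    losses = abs(sum(p for p in pnls if p < 0))
    return gains / losses if losses > 0 else float("inf")


def binom_p_value(wins: int, n: int, p0: float = 0.5) -> float:
    """One-sided P(X >= wins) for X ~ Binomial(n, p0). Assumed test, not confirmed."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(wins, n + 1))


def row_passes(pnls: Sequence[float]) -> bool:
    """A row passes only if holdout PF and the corrected test both clear."""
    _, holdout = chrono_split(pnls)
    wins = sum(1 for p in pnls if p > 0)
    n = sum(1 for p in pnls if p != 0)  # decisive trades only
    return profit_factor(holdout) >= HOLDOUT_PF_MIN and binom_p_value(wins, n) < ALPHA


def surface_verdict(rows: Dict[str, List[float]]) -> str:
    """edge if at least one asset-class row passes, otherwise no-edge."""
    return "edge" if any(row_passes(p) for p in rows.values()) else "no-edge"
```

Splitting by position rather than at random keeps the holdout strictly later in time than the training slice, which is what makes the holdout PF a check against overfitting.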

Today's Picks (last 24h)

Each row is a pick our scanners produced in the last 24h. AGV=alpha-gravity raw score; SCORE=elite_score (gate-deciding); TRUST=strategy track-record; CONF=model confidence. The Gate column shows the highest gate the pick cleared — HC = high-conviction (score≥80, conf≥0.75, trust≥60), VA = verified-alpha (score≥70), SMART = passed per-class smart floor, FAIL = none. If FAIL, hover the row for the reason.
Symbol | Dir | Class | System | Strategy | SCORE | TRUST | CONF | Entry | TP | SL | Status | Gate
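The Gate column logic can be sketched as a top-down check from the strictest gate to the loosest. The thresholds come from this page (HC: score≥80, conf≥0.75, trust≥60; VA: score≥70; SMART: per-class floor plus conf≥0.60). The Pick dataclass, the function name, and the floors dictionary are hypothetical, and only the equity floor of 60 is stated on the page, so a class with no known floor falls through to FAIL in this sketch.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Pick:
    asset_class: str
    score: float  # elite_score (gate-deciding)
    conf: float   # model confidence
    trust: float  # strategy track-record score


# Only the equity floor is given on the page; other classes are left out on purpose.
SMART_FLOORS: Dict[str, float] = {"EQUITY": 60}
SMART_MIN_CONF = 0.60


def highest_gate(p: Pick, floors: Dict[str, float] = SMART_FLOORS) -> str:
    """Return the highest gate the pick clears: HC, VA, SMART, or FAIL."""
    if p.score >= 80 and p.conf >= 0.75 and p.trust >= 60:
        return "HC"
    if p.score >= 70:
        return "VA"
    floor = floors.get(p.asset_class)
    if floor is not None and p.score >= floor and p.conf >= SMART_MIN_CONF:
        return "SMART"
    return "FAIL"


print(highest_gate(Pick("EQUITY", 85, 0.70, 70)))  # conf below 0.75, so not HC
```

Whether the real gates are nested (for example, VA also requiring the SMART confidence floor) is not stated on the page; the sketch treats each gate independently.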

90-Day Per-Class Funnel

For each asset class over 90 days: how many picks were SCANNED → how many passed the SMART gate (per-class score floor + conf≥0.60) → how many cleared HIGH-CONVICTION → how many actually OPENED → how many closed as WIN or LOSS. WR shown is decisive-only (WIN/(WIN+LOSS)), so flat/expired picks don't dilute it. CRYPTO and COMMODITY dominate volume; FOREX is the largest noise sink.
Class | Scanned | Smart | VA | HC | Proven | Opened | Closed | Win | Loss | Decisive WR%
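A small illustration of the decisive-only win rate, assuming each closed pick carries a status string. The status values WIN, LOSS, EXPIRED, and FLAT and the function name are assumptions for the example.

```python
from collections import Counter
from typing import Iterable, Optional


def decisive_wr(statuses: Iterable[str]) -> Optional[float]:
    """WIN / (WIN + LOSS); flat and expired picks are ignored entirely."""
    counts = Counter(statuses)
    decisive = counts["WIN"] + counts["LOSS"]
    if decisive == 0:
        return None  # no decisive trades, so no win rate to report
    return counts["WIN"] / decisive


closed = ["WIN", "WIN", "LOSS", "EXPIRED", "FLAT"]
print(decisive_wr(closed))  # 2 wins out of 3 decisive picks, not 2 out of 5
```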

Rejected from Universe (today)

These are symbols in our scanned universe that did NOT produce a SMART-grade pick today. Not scanned = no row appeared at all (price feed, blackout, or universe-filter killed it earlier). Gate rejected = scanner emitted a candidate but the per-class score/conf/trust floor killed it before /audit could surface it. Drilling in: most equity rejects are below the SMART_PICKS_MIN_SCORE_EQUITY=60 floor.
14-day rejection counts [14d]

14-day per-class touched / active / closed counts from at_raw_picks. A class with high "touched" but low "decisive" in 14d means the gate is now rejecting most candidates — useful sanity-check against the "today" rejected universe above.


Top Edges per Class (90d)

"Edge cells" are combinations of pick-tags (trust tier, confidence band, R:R band, strategy family, direction, source) that — over the last 90 days — historically delivered WR ≥ 55% (Bayesian-shrunk to discount small-n) + PF ≥ 1.5 + n ≥ 20 closed. That's the PROVEN definition. PROVEN cells should keep getting capital; "good WR, bad PF" cells win often but lose big when they're wrong (don't size up); "best PF overall" can flag suspiciously-good cells that deserve a manual sanity check for leakage.
14-day per-class WR / PF [14d]

Per-class rollup over the last 14 days (Bayes-shrunk, dedup-aware, single-source-concentration flagged). Compare with the 90d "PROVEN edges" above: a cell that is PROVEN over 90d but absent here means the edge stopped firing or stopped winning.


Swarm Verdict (deepseek + cerebras + gemini)

We piped the top-edges JSON to 3 outside models and asked: "is this edge real or noise; if we put $100k @ 1% risk last 90d, what's the P&L; what one gate change lifts it most?". Consensus below. Models disagree on CRYPTO (cerebras = noise, deepseek = noise+leakage suspected, gemini = mixed) and agree on COMMODITY = real edge.

Read full report: research index · reports/2026-05-25_pick_funnel_swarm_verdict.md

14-day reality check on the swarm verdict [14d]

The swarm verdict above used the 90d/all-time cohort. Below is the 14d ground truth from at_raw_picks. If the swarm called "COMMODITY = real edge" but 14d-COMMODITY now shows PF<1, the verdict is stale.


Your Session — Click Funnel (local)

This panel shows your own clicks on the High Conviction and Money Ready buttons on /audit, captured to localStorage in this browser only (no network, no PII). For each click we record the asset-class filter, how many picks were visible, and the top 3 symbols at click time — so you can see what you were looking at when you triggered each preset. Clear it any time with the button below.
No clicks yet — visit /audit, click HIGH CONVICTION or MONEY READY, then refresh this page.
Your 14d clicks vs 14d performance [14d]

Of your recent clicks (local), how many landed on asset classes whose 14d cohort actually has edge? Filtered to the last 14 days of clicks.

No clicks in the last 14 days.

Active + Closed Picks Summary (per asset class)

Three views per asset class, side by side, so divergence is visible at a glance. Why two sources? The dashboard pre-filters and dedups; the DB is raw. A big gap means the dashboard is showing a curated slice (which may carry bias). For the suspicious CRYPTO Smart cell (n=337, WR 78.9%) the DB equivalent (any source, conf≥0.60, 90d closed) is WR 33%: a 46-point gap and the reason for the DISPUTED banner above.
WRshrunk uses a Beta(α=10, β=10) prior (prior_n=20, prior_mean=0.5) — pulls small-n cells toward 50%. PF < 1 means losers outweighed winners. Mean R is mean per-trade pnl_pct (not R-multiples). Max DD is peak-to-trough on a chrono cumulative-pnl_pct equity curve over the cohort.
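Max DD as described above can be sketched as a single pass over the chronological pnl_pct series. This assumes the equity curve is an additive running sum that starts at zero (no compounding); the page does not say which convention is used, and the function name is hypothetical.

```python
from typing import Sequence


def max_drawdown(pnl_pcts: Sequence[float]) -> float:
    """Largest peak-to-trough drop on the cumulative pnl_pct curve."""
    equity = 0.0  # running sum of pnl_pct, in chronological order
    peak = 0.0    # highest equity seen so far (curve assumed to start at 0)
    max_dd = 0.0
    for pnl in pnl_pcts:
        equity += pnl
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
    return max_dd


# Equity path: 1.0, -1.0, -0.5, -1.5, 1.5 -> deepest drop is from 1.0 down to -1.5
print(max_drawdown([1.0, -2.0, 0.5, -1.0, 3.0]))
```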

Loading pick_summary_stats.json…

Same table, restricted to the last 14 days [14d]

Per-class summary from data/pick_summary_stats_2w.json — picks opened OR closed in the last 14 days only. Shrinkage prior matches the all-time table above.


Same table, restricted to the last 48 hours [48h]

Per-class summary from data/pick_summary_stats_48h.json — picks opened OR closed in the last 48 hours only. Cells with n<10 decisive trades show "INSUFFICIENT n" rather than fabricating a WR.
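The n<10 guard can be sketched as a formatting step that refuses to print a win rate for thin cells. The function name and the percentage format are assumptions; the threshold of 10 decisive trades and the "INSUFFICIENT n" label are from the text.

```python
MIN_DECISIVE_N = 10  # threshold stated for the 48h table


def wr_cell(wins: int, losses: int) -> str:
    """Render the WR cell, or INSUFFICIENT n when the sample is too small."""
    n = wins + losses  # decisive trades only
    if n < MIN_DECISIVE_N:
        return "INSUFFICIENT n"
    return f"{100 * wins / n:.1f}%"


print(wr_cell(3, 2))  # 5 decisive trades, below the threshold
print(wr_cell(7, 3))  # 10 decisive trades, so a WR is shown
```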


Strategy Funnel — Performance by View & Asset Class

This section shows the performance of every tracked strategy across all asset classes, with time-window metrics (all-time, 7d, 14d, 30d, 48h). It also compares pick funnel views (e.g. Smart Picks button vs Smart Picks tab) to surface any performance divergence. Sourced from data/strategy_funnel_data.json which is populated from the ejaguiar1_stocks database tables: strategy_summary, pick_funnel_views, edge_discovery, metric_dimensions.
