Play The Board

Scope: applies to top metrics + all breakdowns

📋 Today's Focus

P&L

—

Record

0-0

ROI

—

Wagered

—

Streak

—

Live Scores — Today's Bets

No active bets today.

Recent Plays

P&L Over Time

Performance by Bet Type

Performance by Tier × Odds Range

Props vs Parlays vs Straights

Model Divergence Performance does the model agreeing with the recommendation predict W/L?

Pattern Conviction Tiers A vs B vs C

Favorites vs Underdogs

Stake Ladder ROI by units bucket

Drawdown & Variance

Day of Week Heatmap

Monthly Breakdown

CLV Tracker closing line value

Model Calibration (OOS) predicted prob vs actual hit rate · ECE = mean miscalibration

Sized Calibration win rate + recommender-sized ROI by tier / edge band / top-N · juice-adjusted · auto-updates

Injury Report

Trending Headlines

Card Settings

Date

Title

Notes

Plays

Away · Home · Bet On · Type · Odds · Line · Units · Tier · Risk / Win

📊 Past Tracked Props

PROP MODEL CALIBRATION

Live weights + per-signal performance · sample-gated validation

Leg / Bet · Odds

Parlay Odds & Payout

Wager Amount

Add at least 2 legs to calculate

Quick Add from Today's Card

No plays on Today's Card yet

Tracked Parlays

Calibration research · no P&L impact · auto-grades when games go final

Current Confidence Ratings

Pattern Impact Dashboard real plays + research — per-pattern ROI and P&L

Pattern × Bet Type Matrix which patterns work best for each bet type

Calibration log + blend-weight backtests moved to Model · Calibration

Season Pattern Performance

📋 Research History — tracked outcomes

📊 Odds Range Research

Run Board Analyzer to start tracking.

🧬 Signal Composition

What is EV?

Expected value = how much you win on average per $100 bet. Positive EV means the bet profits long-term.

Edge %

Your true probability minus the book's implied probability. Anything above 3% is a strong edge.

Kelly Criterion

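Optimal fraction of bankroll = (b·p − q) / b, where p is your true win probability, q = 1 − p, and b is the net payout per unit staked (0.909 at -110). Most pros use ¼ Kelly to reduce variance. Never bet full Kelly.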
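
A minimal sketch of the Kelly sizing described in this card, assuming American odds as input. The function names (`netPayout`, `kellyFraction`, `kellyStake`) are illustrative and are not taken from the app's source.

```javascript
// Net payout b: profit per 1 unit staked, derived from American odds.
function netPayout(americanOdds) {
  return americanOdds > 0 ? americanOdds / 100 : 100 / Math.abs(americanOdds);
}

// Full Kelly fraction f = (b*p - q) / b, floored at 0 so a negative edge sizes to nothing.
function kellyFraction(trueProb, americanOdds) {
  const b = netPayout(americanOdds);
  const q = 1 - trueProb;
  return Math.max(0, (b * trueProb - q) / b);
}

// Fractional Kelly stake in dollars. The 0.25 default is the quarter Kelly mentioned above.
function kellyStake(bankroll, trueProb, americanOdds, multiplier = 0.25) {
  return bankroll * multiplier * kellyFraction(trueProb, americanOdds);
}

console.log(kellyFraction(0.55, -110).toFixed(4)); // "0.0550" (5.5% of bankroll at full Kelly)
console.log(kellyStake(1000, 0.55, -110).toFixed(2)); // "13.75" at quarter Kelly
```

At a 55% true probability and -110, full Kelly is 5.5% of bankroll, so quarter Kelly on a $1,000 bankroll comes to roughly $13.75.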

Break-even %

Win rate you need at given odds just to break even. -110 requires 52.4%. Your true prob must beat this.
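
The break-even, edge, and EV definitions in these cards reduce to a few lines of arithmetic. The sketch below assumes American odds and treats the book's implied probability as the break-even rate for that price; the function names are illustrative only.

```javascript
// Implied (break-even) probability for a price in American odds.
function impliedProb(americanOdds) {
  const a = Math.abs(americanOdds);
  return americanOdds > 0 ? 100 / (a + 100) : a / (a + 100);
}

// Edge: your true probability minus the book's implied probability.
function edge(trueProb, americanOdds) {
  return trueProb - impliedProb(americanOdds);
}

// Expected profit in dollars per $100 staked.
function evPer100(trueProb, americanOdds) {
  const profitOnWin = americanOdds > 0 ? americanOdds : 10000 / Math.abs(americanOdds);
  return trueProb * profitOnWin - (1 - trueProb) * 100;
}

console.log(impliedProb(-110).toFixed(3)); // "0.524", the 52.4% quoted above
console.log((edge(0.55, -110) * 100).toFixed(1)); // "2.6" (percentage points)
console.log(evPer100(0.55, -110).toFixed(2)); // "5.00"
```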

Bet Analysis

Kelly fraction:

Game / Bet · Book Odds · True Prob % · Break-even · Edge · EV per $100 · Kelly Units · Verdict

Reading your results

Strong play — bet it

Edge > 5% · EV > +$5/100 · Kelly > 2u. Book is giving you significantly better odds than your true probability justifies.

Marginal — consider it

Edge 2–5% · EV +$2–5/100. Worth betting at reduced size. Common on totals at -110 where your edge is real but narrow.

Negative EV — skip it

Edge < 0% · EV negative. Your true probability is lower than the book's implied probability. No matter how good the narrative feels, this loses long-term.
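
The three verdict bands above can be read as a simple classifier. This sketch keys on edge alone, which is an assumption: the cards also list EV and Kelly thresholds, and they do not say how the 0–2% band is labeled, so that band is returned as `"unlabeled"` here.

```javascript
// Classify a play by edge, expressed in percentage points (e.g. 5.3 for 5.3%).
function verdictForEdge(edgePct) {
  if (edgePct < 0) return "negative";  // skip it
  if (edgePct > 5) return "strong";    // bet it
  if (edgePct >= 2) return "marginal"; // consider it at reduced size
  return "unlabeled";                  // 0-2% band: not covered by the cards (assumption)
}

console.log(verdictForEdge(6.1));  // "strong"
console.log(verdictForEdge(2.6));  // "marginal"
console.log(verdictForEdge(-1.4)); // "negative"
```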

DATE

UNIT SIZE ($)

MAX EXPOSURE ($)

Ready to fetch today's games...

UNIT SIZE ($)

MAX EXPOSURE ($)

Edit to re-size the card live. The cap shapes displayed stake; tracking records full intended size.

⚡ Fetch games, enter odds, and click ANALYZE FULL SLATE

Load today's games first, then this tab will show line movement signals.

BOARD ANALYZER — v7.1 DAILY WORKFLOW
Baseball betting model · 41 live patterns · 158 weights · lineup cards · series context · travel fatigue · doubleheader-aware
DAILY STEPS
STEP 1 · FETCH TODAY'S GAMES

          Click "FETCH TODAY'S GAMES". Auto-loads the full MLB slate: confirmed starters, team records, streaks, rest days, bullpen data, stadium park factors, and live weather for every outdoor game. Also fetches: lineup cards with individual batter OPS/ISO, series W/L records, travel timezone data, and real umpire over% stats. Takes ~10–15 seconds. Games restore from cache if you've already fetched today.
STEP 2 · RATE PITCHER QUALITY

          For each game, set the starter tier using the ACE / SOLID / BACKEND / AUTO buttons:

          · ACE — Elite arm: ERA <3.20, K/9 9+, consistent track record

          · SOLID — Reliable mid-rotation: ERA 3.20–4.50

          · BACKEND — 5th starter, opener, or replacement level

          · AUTO — Let the model infer from fetched ERA/WHIP/K9 stats

          This is the only step requiring judgment. Takes ~2 min for a 15-game slate.
STEP 3 · ENTER ODDS

          Type in current book odds for each game. Only fill what's available — blank fields are skipped:

          · ML Away / ML Home (moneyline)

          · RL Away +1.5 / RL Home -1.5 (run line)

          · Total line + Over odds / Under odds

          · F5 Under odds (first 5 innings under)

          Odds auto-save as you type. Tip: enter ML first — the model uses ML odds as the primary edge anchor.
STEP 4 · ANALYZE FULL SLATE

          Click "ANALYZE FULL SLATE". The model runs all 4 bet types (ML, RL, Total, F5) across every game, applies 157 weighted factors, fires all 28 patterns, auto-detects Pinnacle sharp signals (RLM · Steam · S6 · Flip), enforces hard caps, and outputs a ranked slate in ~1 second.


          Results appear in two tabs:

          · 📋 CARD — Your top 7 plays by edge (D8 cap enforced). These are your actual bets.

          · 📊 ALL ANGLES — Every positive-edge play across all games, no cap. Use for research and pattern tracking.
STEP 5 · SEND TO CARD + LOG RESULTS

          Click + CARD on any play to add it to Today's Card with pattern tags automatically attached.

          After games finish, go to Log Results → set each play W / L / P.

          Pattern tallies update automatically when you grade — no manual scoring needed.

          Locked cards move to History. Pattern Tracker updates in real time.
READING THE ANALYSIS OUTPUT
TRUE PROB — Model's estimated win/cover probability after all factors applied.
IMPLIED — Book's implied probability from the odds you entered. Edge = True − Implied.
EDGE % — The gap between model and market. Plays below +3% edge are suppressed.
REC STAKE — Units recommended by edge tier. Capped by D4 (April/May) and D8 (7-play max).
½ KELLY — Half-Kelly dollar stake based on bankroll setting. Reference only — not required.
PREDICTED RUNS — Model's projected total. "Model agrees" = model and book align. "Model diverges" = model sees different value.
FACTOR TAGS — Green = adds to true prob, Red = subtracts. Each tag shows the exact % contribution.
BP SECTION — High-leverage arms for both bullpens with TAXED / WARM / FRESH status and pitch counts.
UNIT TIERS + HARD CAPS
Edge → Units
≥16% edge → 5-6u 🔥🔥 MAX (data-driven, scaling to 8-10u when 30+ plays validate)
12–16% → 3.5u 🔥 FIRE (49% WR — capped, below breakeven)
6–12% → 3.5u ✅ SOLID (45% WR — capped, below breakeven)
4–6% → 4-5u ✅ PLAY (72% WR, best ROI tier)
6–7% → 5u SOLID
4–5% → 4u PLAY
2.5–3.9% → 3u MARG
1–2.4% → 2u THIN
Hard Caps (always enforced)
D4 — Totals cap: Apr ≤20 → 6u max · Apr 21–May 1 → 8u · May 2+ → 10u
D8 — Max 7 plays per slate (CARD tab). ALL ANGLES lifts this cap for research.
P5 — ML -160 to -200 → 4u cap. Above -200 → skip (vig destroys ROI).
D3 — RL not recommended above -200 ML favorites.
ML hard cap — No play can exceed 78% true prob.
Totals hard cap — No total play can exceed 70% true prob.
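
A rough sketch of how the D4, D8, and true-probability caps listed above could be applied. Everything here is illustrative: the function names are hypothetical, the probability caps are assumed to clamp rather than reject a play, and dates before April are assumed to fall under the 6u cap (per the early-season note).

```javascript
// D4: max units on a totals play by calendar date (month 1-12).
function d4TotalsCap(month, day) {
  if (month < 4 || (month === 4 && day <= 20)) return 6;   // through Apr 20
  if (month === 4 || (month === 5 && day === 1)) return 8; // Apr 21 to May 1
  return 10;                                               // May 2 onward
}

// D8: keep only the top plays by edge for the CARD tab.
function buildCard(plays, maxPlays = 7) {
  return [...plays].sort((a, b) => b.edge - a.edge).slice(0, maxPlays);
}

// Hard caps on true probability: 78% for ML, 70% for totals (clamping is an assumption).
function capTrueProb(betType, trueProb) {
  const cap = betType === "TOTAL" ? 0.70 : betType === "ML" ? 0.78 : 1;
  return Math.min(trueProb, cap);
}

console.log(d4TotalsCap(4, 25));       // 8
console.log(capTrueProb("ML", 0.83));  // 0.78
```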
ACTIVE PATTERNS — v7.1 (auto-updates from live model)
PATTERN TRACKER — AUTO-GRADING
Pattern W/L tallies update automatically when you grade plays in Log Results. No manual scoring.
When you use + CARD to add plays from the Analyzer, pattern IDs are stored with the play. When you set a result W/L, those patterns are scored instantly.
For legacy plays (added manually without pattern tags), the system parses your card notes for pattern ID mentions (e.g. "P4 HOU @ COL") and infers the tags on first grade.
Click ⟳ SYNC ALL on the Pattern Tracker page to rescan all historical plays and backfill any ungraded results.
New patterns added to the model are auto-registered in the tracker the first time they fire and get graded — no code change needed in the tracker.
DATA SOURCES
Schedule + Pitchers — MLB Stats API (official, free)
Team stats + streaks — MLB Stats API yearByYear endpoint
Bullpen arms + pitch counts — MLB roster + gamelog API (cached 2h)
Pitcher splits (home/away ERA) — homeAndAway stats endpoint (cached 48h)
Weather — Open-Meteo API, fetched at game time
Park factors — Built-in DB, all 30 stadiums (elev, roof, HR adj)
Line moves — Manual dropdown (Neutral / Sharp / Trap etc.)
Odds — Manual input from your sportsbook
IMPORTANT NOTES
🔴 Breaking news — This model catches systematic edges. Always cross-reference for last-minute scratches, lineup injuries, or weather delays before betting.
🔴 Early season (Mar–Apr 20) — D4 cap limits totals to 6u. Small sample sizes mean pitcher stats are unreliable; model applies April timing lag (-2.5% hit trajectory).
🟡 Pattern tracker calibration — The model's historical confidence ratings (S/A/B) come from pre-season research. Your live W/L data in the tracker is what actually matters. Divergence between the two is a calibration signal.
🟡 MODEL_WEIGHTS — All 51 factor values are defined in one object at the top of the source file. Adjustments based on your live data are one-line changes. Don't adjust until you have 50+ graded plays per pattern.
🟢 ALL ANGLES tab — Use this for research. Log lower-confidence plays at 1-2u to build sample size faster. The D8 cap only applies to the CARD tab.
v7.1 MODEL — NEW SIGNALS
📋 Lineup Cards
Individual batter OPS/ISO/K% fetched each game. Lineup OPS delta vs team avg adjusts run projections and ML edge.
📊 Series W/L Context
Series record tracked from recent game log. Must-win (trailing in finale), complacency risk (up 2-0), rubber match signals.
✈️ Travel Fatigue
Timezone delta is computed per team. An away team crossing 2+ zones west gets a 2–2.5% ML penalty and a lower run projection.
⚖️ Real Umpire Stats
Actual historical over% and K-rate per HP umpire replaces wide/tight string classification. Fetched automatically.
🏃 Individual Batter Power
ISO, HR/game, BB% per team wired into run projections and totals. Power lineups at hitter parks amplified.
🌧️ Rain Probability
precipChance now in all four bet types. 40%+ rain suppresses scoring across ML, RL, totals, and F5.
📈 FIP / HR9 Pitching
Team FIP (luck-neutral ERA) and HR/9 added to run projections. FIP regression fires when 0.5+ gap from ERA.
🔄 Opening Lines
Historical snapshot fetched at open. True line movement (implied % delta) drives magnitude signals, not just move direction.
⚔️ Career BvP Matchups
Real career AB data for each batter vs today's starter. Hot (OPS≥.900 or 2+HR in 5+AB) and cold (OPS≤.500) matchups wire directly into ML, totals, and F5 edge — not just run projections.
🧠 Synthetic BvP Fallback
When career AB data is sparse (<5 AB), a profile-based matchup score is computed from batter K%/BB%/ISO vs pitcher K/9/BB/9/HR9 + platoon handedness. Shown as "Profile" in matchup cards, weighted at 0.6× confidence vs career data.

PATTERN TRIGGERS — all patterns including PASS bets

Run Board Analyzer first, then switch to this tab.

PROP MODEL CALIBRATION

Live weights + per-signal performance · sample-gated validation · formulas for each prop type

Major Themes · r195 → r276

Curated thematic summary of work between r195 and r276 — Marcel-blend infrastructure, floor gates, calibration analysis, doubleheader plumbing. Sections default to collapsed (click a heading or use the search box to expand). For the full ship history since v5.5 (367 entries, including everything r277 onward), see the Roadmap sub-tab above.

Marcel formulas (r244)

r244 · Game-level Marcel snapshot infrastructure — validation-first, NO live behavior change. Pure capture ship to enable backtest validation BEFORE flipping any live Marcel-related changes. Every game bet now carries gameStatsAtAnalysis stamped at recommendation time: season + L15 + 3-component-blended RPG/OPS + RA9 for both teams, sample sizes (gpSeason / gpL15), raw blend weights actually used (currWt/priorWt/lgWt), trajectory labels (consistent_strong, trending_better, etc.), and prior-year-used flag. ~30 fields, ~600-800 bytes per snapshot. Threaded through all three card-add paths (sendBetToCard, addAllToCard, addBestBetToCard). Why infrastructure first: live analyzeGame already uses rpgBlend (the 3-component blend) for game total/F5 prediction since long before this ship — the actual gap was that game bets didn't capture the inputs that fed those predictions, so backtesting was impossible. Pre-r244 entries lack the snapshot and are silently skipped by the diagnostic harness. Also ships window.runGameMarcelDiagnostic() — segments graded post-r244 game bets by Marcel-shift magnitude (max(|awayRpgBlend−awayRpgSeason|, |homeRpgBlend−homeRpgSeason|)) into four cohorts: minimal (<0.10), moderate (0.1-0.3), strong (0.3-0.5), extreme (≥0.5). Compares high-shift vs minimal-shift WR + ROI. Inline verdict: ✅ if high-shift outperforms by ≥3pp WR or ≥5pp ROI (Marcel adds signal — keep weights, fine-tune in r245); ❌ if high-shift underperforms (Marcel hurts — r245 reduces/reverts); ⚠️ if similar (inconclusive); ⏳ if <30 bets per cohort (wait). r245's decision tree fully driven by this output. Run after ~30 days of post-r244 graded game bets accumulate.
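
The cohort segmentation and verdict rules in r244 can be sketched as below. This is not the body of `runGameMarcelDiagnostic()`; it only illustrates the thresholds quoted above. The snapshot field names follow the shift formula in the entry, and the "underperforms" rule is assumed to mirror the outperform thresholds, which the entry does not state.

```javascript
// Marcel-shift magnitude: largest gap between blended and season RPG across both teams.
function shiftMagnitude(s) {
  return Math.max(
    Math.abs(s.awayRpgBlend - s.awayRpgSeason),
    Math.abs(s.homeRpgBlend - s.homeRpgSeason)
  );
}

// Four cohorts by shift size.
function shiftCohort(shift) {
  if (shift < 0.1) return "minimal";
  if (shift < 0.3) return "moderate";
  if (shift < 0.5) return "strong";
  return "extreme";
}

// Verdict from high-shift minus minimal-shift differences, in percentage points.
function marcelVerdict(wrDiffPp, roiDiffPp, smallestCohortSize) {
  if (smallestCohortSize < 30) return "wait";
  if (wrDiffPp >= 3 || roiDiffPp >= 5) return "keep";
  if (wrDiffPp <= -3 || roiDiffPp <= -5) return "revert"; // symmetric rule is an assumption
  return "inconclusive";
}

const snapshot = { awayRpgBlend: 4.9, awayRpgSeason: 4.5, homeRpgBlend: 4.2, homeRpgSeason: 4.25 };
console.log(shiftCohort(shiftMagnitude(snapshot))); // "strong"
```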

Fade Away Dog pattern (r276)

r276 · Fade Away Dog +100-149 — paired-direction signal, default ON. Companion ship to r275. Where r275 stops us from betting the losing side (away dog), r276 captures the empirical inverse: the same 23 games where the away dog lost 70% of the time mean the HOME favorite WON 70% of the time. Edge-boost pattern fires when iterating side === 'home' AND the opposite team's ML odds (oppOdds) are in the +100 to +149 band. Adds +0.06 to the home-side probability (6pp model edge boost) and pushes the "fade_away_dog" tag into firedPatterns for backtest segmentation. Calibration: observed 69.6% home WR is ~13pp above the typical -130 breakeven (130/230 = 56.5%), so the 6pp boost is a little under half the empirical edge; this matches the standard discount-for-sample-size convention applied to other patterns. The boost combines with the model's existing edge derivation for home ML; on this cohort it typically promotes borderline plays to FIRE/MAX tier with appropriate sizing. Empirical replay: 23 cohort games saw home win 16-7. At flat 1u sizing on home ML at average -130 odds: ~+5u net. At elevated tier sizing (3u-5u), expected gain ~+15-25u. Combined with r275's -50.23u suppression, the paired r275+r276 ships convert what was a -50.23u bleed into a positive-EV signal on the same 23 games. Console kill switch: window._disableFadeAwayDog = true. Why this pattern matters beyond bankroll: it's the first explicitly empirically-derived "fade the model" pattern. It surfaces a new investigative approach: when a cohort's WR is significantly below 50%, the inverse direction may be a positive-EV signal. r277+ candidates: scan all bet types for similar inverted cohorts (e.g., are there Total Over scenarios where the model's PASS-tier under is the actual play?), per-pattern home/away asymmetry regression once 50+ post-r276 plays accumulate.
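
A minimal sketch of the r276 trigger and the flat-stake replay arithmetic, under stated assumptions: the function names are hypothetical, and the kill switch is passed in as a plain `flags` object standing in for the `window` global so the snippet runs outside a browser.

```javascript
// Apply the +0.06 home-side boost when the opposing (away) ML price is +100 to +149.
function applyFadeAwayDog(side, oppOdds, prob, firedPatterns, flags = {}) {
  if (flags._disableFadeAwayDog) return prob; // kill switch
  if (side === "home" && oppOdds >= 100 && oppOdds <= 149) {
    firedPatterns.push("fade_away_dog");      // tag for backtest segmentation
    return prob + 0.06;
  }
  return prob;
}

// Net units from a W-L record at flat 1u stakes and a single average price.
function flatReplayUnits(wins, losses, americanOdds) {
  const winUnits = americanOdds > 0 ? americanOdds / 100 : 100 / Math.abs(americanOdds);
  return wins * winUnits - losses;
}

const fired = [];
console.log(applyFadeAwayDog("home", 125, 0.55, fired).toFixed(2)); // "0.61"
console.log(flatReplayUnits(16, 7, -130).toFixed(2));               // "5.31"
```

The 16-7 replay at -130 works out to about +5.3u, in line with the "~+5u net" figure quoted above.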

Away ML floor gate (r275)

r275 · Away ML +100 to +149 floor gate — backtest-validated, default ON. First game-bet floor gate (prior gates were prop-side: r265 Outs, r267 K, r269 HR/TB). Surfaced by r275 audit of 371 graded game plays in r257 backup: away ML at small-dog odds (+100 to +149) lost 30.4% WR / -44.8% ROI / -50.23u sized over 23 plays — EVERY tier was unprofitable (FIRE 1/5 at -15.4u, MAX 0/3 at -21.5u, PLAY 4/9, SOLID 2/6). Same odds at HOME side were profitable (50%+ WR / +10-74% ROI), confirming this is specifically an away-side miscalibration of the ML edge calculation, not a "small-dog ML is bad" general phenomenon. Heavy away dogs (≥+150) were already filtered to TRACK in pre-r275 logic; r275 completes the away-dog suppression with the sibling +100-149 band. Empirical lift on the 118-play ML cohort: -56.52u → -6.29u (+50.23u improvement, ~2.85× the next-largest single ship r265 Outs at +17.6u). Mechanism: matches existing heavy-dog filter pattern — pushes a TRACK rec (units:0, tier:'MARG', visible in analyzer output marked SUPPRESSED) so data continues accumulating for re-evaluation, but no bankroll impact. Filter triggers when side === 'away' && mlOdds >= 100 && mlOdds < 150. Console kill switch: window._awayMlAllowDog = true reverts immediately. What this implies about the model: the ML edge calc treats away vs home symmetrically but historical reality has an asymmetric "home-field" effect bigger than the model accounts for. The away dog band is where the asymmetry shows up most sharply (book is already pricing the away team to lose; if the model thinks it has edge there, the model is overconfident). Future ship: r276+ regression analysis on home/away asymmetry across all bet types could surface whether ML at neutral odds (~-110 to +100) also has a smaller version of the same effect, or if it's only the dog band.
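
The r275 gate condition is given verbatim above; the sketch below wraps it in a hypothetical `gateAwayMl` helper. The `status` and `suppressed` fields are illustrative stand-ins for however the app marks a TRACK rec, and `flags` stands in for the `window` global.

```javascript
// Demote away-side small-dog ML recs (+100 to +149) to a zero-unit TRACK entry.
function gateAwayMl(rec, flags = {}) {
  const inBand = rec.side === "away" && rec.mlOdds >= 100 && rec.mlOdds < 150;
  if (!inBand || flags._awayMlAllowDog) return rec; // kill switch reverts the gate
  // Keep the play visible for data accumulation, but with no bankroll impact.
  return { ...rec, units: 0, tier: "MARG", status: "TRACK", suppressed: true };
}

const rec = { side: "away", mlOdds: 120, units: 3, tier: "PLAY" };
console.log(gateAwayMl(rec).units);                             // 0
console.log(gateAwayMl(rec, { _awayMlAllowDog: true }).units);  // 3
```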

HR + TB floor gates (r269)

r269 · HR + TB floor gates — architectural consistency, default ON. Completes the prop-floor gate sequence started by r265 (Outs) and r267 (K). HR analyzer now mirrors the K pattern: tier=PASS plays are not auto-tracked or shown as recommended. Empirical sample for HR is currently too small for a backtest verdict (only 6 graded HR plays in r257 backup) so the gate ships as architectural insurance — when HR data accumulates over the next ~30 days, the gate prevents the same floor-bleed pattern that hit Outs (-12.59u over 37 plays) and K (-2.95u over 23). TB analyzer adds a floor gate layered on top of the existing r228 suppression: while suppression is active (default), tier is already forced to PASS in the suppression block so the new gate evaluates false and recommended is false anyway — no behavior change. When TB suppression is eventually lifted (post ~30 days of post-r228+r241 NegBin data), the floor gate stays in place so micro-edge plays do not flood back into recommendations. Mechanism mirrors r267: _hrRecAllowed = _hrRecAllowPass || tier !== 'PASS'; same shape for TB with extra suppression-AND clause. Console kill switches: window._hrRecAllowPass = true and window._tbRecAllowPass = true revert respective gates immediately. With r269 all four prop analyzers (Outs / K / HR / TB) now have consistent floor behavior — tier=PASS plays universally produce recommended:false. Future ships transition to data-accumulation phase: r270+ candidates are TB suppression review (post-r228+r241 cohort), HR floor empirical validation (post-r269 cohort, ~30 days), bullpen per-weight ROI harness (handoff r261, fully unblocked).
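
One possible reading of the r269 gates as standalone predicates. The HR expression follows the formula quoted above; the TB version is an assumption about what the "extra suppression-AND clause" looks like. `flags` stands in for the `window` global.

```javascript
// HR floor gate: PASS-tier plays are not recommended unless the kill switch is set.
function hrRecAllowed(tier, flags = {}) {
  return Boolean(flags._hrRecAllowPass) || tier !== "PASS";
}

// TB floor gate layered on the r228 suppression (suppression defaults to ON).
function tbRecAllowed(tier, flags = {}) {
  const suppressed = flags._tbModelSuppressed !== false;
  const floorOk = Boolean(flags._tbRecAllowPass) || tier !== "PASS";
  return !suppressed && floorOk;
}

console.log(hrRecAllowed("PASS"));                                  // false
console.log(tbRecAllowed("PLAY"));                                  // false (still suppressed)
console.log(tbRecAllowed("PLAY", { _tbModelSuppressed: false }));   // true
```

With suppression lifted, a PASS-tier TB play still evaluates to false, which is the "floor gate stays in place" behavior the entry describes.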

K floor gate (r267)

r267 · K props floor gate — backtest-validated, default ON. Same pattern as r265 (Outs floor) but for K props. Backtest of 23 graded PASS-tier K plays in r257 backup: 47.8% WR, -12.8% ROI, -2.95u total — bleeding the otherwise break-even K cohort. Direction split is striking but on small samples: PASS Over (n=13) at 15.4% WR / -8.76u, PASS Under (n=10) at 90% WR / +5.81u. The asymmetry suggests model Over-lean bias on micro-edges, but neither sample is large enough to ship a direction-aware fix (95% CIs span 2-45% and 56-99%). Conservative gate filters all PASS; counterfactual lift on the same 23-play cohort is +2.95u (forfeits the +5.81u PASS Under wins which are likely partly noise). Mechanism: hoists tier to a local _kTier for both bet objects, then derives _kRecAllowed = _kTier !== 'PASS'. recommended is gated on _kRecAllowed && (direction === 'Over'|'Under'). Because _propUnitsFor already returns 0u for prob-edge <2pp and count-edge <0.5, and tier is already PASS for those edges, the only behavior change is recommended:false (true triple-strip in spirit). Console kill switch: window._kRecAllowPass = true reverts to pre-r267 behavior immediately without reload (gate consults window. on each call). Future ships: r268+ candidates: TB suppression review (needs ~30 days post-r228+r241 data; r257 backup pre-dates that window so 0 post-r228 plays available); HR analyzer audit (only 6 graded HR plays in backup, sample too small for diagnostic); generic prop tier=PASS rule unification once all four prop analyzers have empirical floor data.
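
To see why the direction split is too small to act on, the 95% intervals can be reproduced with an exact binomial (Clopper-Pearson) interval. That method is an assumption, since the entry does not say how its CIs were computed, but it gives bounds close to the quoted 2-45% and 56-99% ranges. The sketch solves for the bounds by bisection and is only meant for small samples like these.

```javascript
// Binomial PMF computed in log space (adequate for small n).
function binomPmf(k, n, p) {
  let logC = 0;
  for (let i = 1; i <= k; i++) logC += Math.log(n - k + i) - Math.log(i);
  return Math.exp(logC + k * Math.log(p) + (n - k) * Math.log(1 - p));
}

// P(X <= k) for X ~ Binomial(n, p).
function binomCdf(k, n, p) {
  let sum = 0;
  for (let i = 0; i <= k; i++) sum += binomPmf(i, n, p);
  return Math.min(1, sum);
}

// Root of f on (0, 1), where f is positive left of the root and negative right of it.
function bisect(f) {
  let lo = 0, hi = 1;
  for (let i = 0; i < 100; i++) {
    const mid = (lo + hi) / 2;
    if (f(mid) > 0) lo = mid; else hi = mid;
  }
  return (lo + hi) / 2;
}

// Exact two-sided interval for k wins in n plays.
function clopperPearson(k, n, alpha = 0.05) {
  const lower = k === 0 ? 0 : bisect((p) => alpha / 2 - (1 - binomCdf(k - 1, n, p)));
  const upper = k === n ? 1 : bisect((p) => binomCdf(k, n, p) - alpha / 2);
  return [lower, upper];
}

console.log(clopperPearson(2, 13));  // PASS Over, 2 of 13: roughly 0.02 to 0.45
console.log(clopperPearson(9, 10));  // PASS Under, 9 of 10: roughly 0.56 to 1.00
```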

Outs threshold tightening (r265)

r265 · Outs threshold tightening — backtest-validated, default ON. Per r242 the live Outs model lost money on both sides (Overs 46.5% / Unders 43.9% WR, both below 52.4% break-even). r264 ran runOutsDiagnostic() which surfaced two distinct broken cohorts via segmentation rather than a single uniform inversion. Floor: 37 of 113 graded Outs plays sit at |edge| < 1.0 with 35.1% WR, -34.0% ROI, -12.59u total — by far the dominant cause of the bleeding. Bucket WR is 12pp below the 47.6% no-skill floor so noise alone cannot explain it. The model emits direction (and recommended:true) for any non-zero edge, so micro-edges below noise floated through to user recommendations and to auto-tracking. ACE cap: 5 of 5 plays where pq='ace' AND |edge| ≥ 2.0 lost (0% WR, -100% ROI). Smaller sample but extreme outcome with theoretical justification — the model overadjusts ace projections (recent IP boost + OP3 wide-ump boost + base PQ default 6.3 IP can compound), and modern pitcher management (pitch counts, third-time-through avoidance, bullpen-first decisions) overrides those depth predictions for the very pitchers the model is most confident about. Cross-check: SOLID × |edge|≥2.0 × Over in the same backtest runs +18.9% ROI, so the issue is specifically with ACE pq class, not all high-edge predictions. Combined backtest: 71 of 113 plays survive (63% retention), 54.9% WR, +1.0% ROI, +0.71u total — flips the cohort from -16.88u to +0.71u (+17.6u swing on a 113-play sample). Mechanism mirrors r228 TB suppression: failed gate → recommended:false + tier:PASS + units:0 (clean three-field strip). Bet object still emitted so analyzer + props page can list it; snapshot still captured so calibration data continues to accumulate. Tunables: window._outsMinEdgeFloor (default 1.0; set 0 to disable floor), window._outsAceEdgeCap (default 2.0; set Infinity to disable cap). Setting both to no-op values reverts to pre-r265 behavior immediately without reload (gate consults window. on each call).
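
A sketch of the r265 gate as described: an edge floor plus an ACE edge cap, with a three-field strip on failure. The `gateOutsBet` name and the bet-object shape are assumptions; `flags` stands in for the `window` tunables.

```javascript
// Strip recommendation, tier, and units when an Outs bet fails either gate.
function gateOutsBet(bet, flags = {}) {
  const floor = flags._outsMinEdgeFloor ?? 1.0; // set 0 to disable the floor
  const aceCap = flags._outsAceEdgeCap ?? 2.0;  // set Infinity to disable the cap
  const absEdge = Math.abs(bet.edge);
  const belowFloor = absEdge < floor;
  const aceOverCap = bet.pq === "ace" && absEdge >= aceCap;
  if (belowFloor || aceOverCap) {
    // The bet object is still returned so tracking and snapshots can continue.
    return { ...bet, recommended: false, tier: "PASS", units: 0 };
  }
  return bet;
}

const micro = { pq: "solid", edge: 0.4, recommended: true, tier: "MARG", units: 1 };
const aceBig = { pq: "ace", edge: -2.3, recommended: true, tier: "PLAY", units: 3 };
console.log(gateOutsBet(micro).tier);  // "PASS"
console.log(gateOutsBet(aceBig).tier); // "PASS"
```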

Distribution upgrade (r241 → r243)

r243 · NegBin-tail K model — backtest-validated, default ON. Replaced the count-difference K analyzer (which decided direction by expectedK − line) with a probability-tail framework using ptbNegBinSurvivalGE. Default VMR=2.5; tunable via window._kDispersionVMR; rollback via window._kUseNegBinTail = false. The empirical case: backtest on 107 graded K entries from r242-era data showed count-diff (current) +3.53% ROI vs NegBin VMR=2.5 +12.14% — a +9pp improvement. Critically, Poisson-tail alone produced −1.13%, so the win is specifically from NegBin's overdispersion correction (real K outcomes have variance > mean because pitchers have correlated good/bad days; Poisson can't capture this, NegBin can). Tier thresholds calibrated from per-bucket data: <2pp = junk (42.9% WR, −27.66% ROI) → PASS; 2-5pp = strong (68.4% WR, +30.55% ROI) → MARG; ≥5pp = strong (60.6% WR, +15.06% ROI) → PLAY. Unit sizing buckets in _propUnitsFor extended with parallel prob-edge scale. Snapshot fields added: kDistribution, kDispersionVmr, kNegBinN, kNegBinP, kModelOverProb, kEdgeProb, overFair. New edgeUnits field on bet objects disambiguates downstream (legacy count-edge consumers fall through correctly via p.edgeUnits === undefined check). No suppression layer — current K model is profitable, not inverted; r243 is a quality lift not a hedge against a known failure (unlike r228's TB suppression). Full live simulation against the backup data showed 90 bets surviving tier, 62.2% WR, +14.97% ROI per unit, +38.46u sized profit (vs +3.78u flat at current count-diff system).

r242 · runDistributionBacktest console utility — Poisson vs NegBin sweep across all four prop models. Companion to r241. Re-scores every graded propsTracking entry under candidate distributions {Poisson, NegBin VMR=1.3/1.5/1.8/2.0} and reports recommendations / wins / WR / ROI per (propType, distribution). Console-only utility (matches runBlendBacktest pattern). Methodology: derive expectedX from p.expectedValue (count mean), recompute modelOverProb under each candidate, derive actualOver from original W/L × original direction, then compute hypothetical newWin = (newDirection matches actualOver). ROI uses the actual stored book odds for whichever side the candidate distribution recommends. Outputs console.table with ★ on best-VMR row. Answers two empirical questions: (1) does NegBin recover TB agree-cohort to ≥47% — i.e., is r228 suppression now safe to lift; and (2) does NegBin help K/Outs/HR too — i.e., should r241 extend beyond TB? Usage: window.runDistributionBacktest() for the full sweep across all four props.

r241 · TB Poisson → Negative Binomial. The actual recalibration of the model that r228 suppressed. Replaced the inline poissonTailGE call inside evaluateBatterTB with ptbNegBinSurvivalGE (already used for live game totals since r104). Parameterization: VMR (variance-to-mean ratio) controls dispersion, default 1.5 from baseball-counts literature. Math: p = 1/VMR, n = expectedTB/(VMR-1) — preserves mean exactly. Pre-ship validation table comparing Poisson vs NegBin showed: at expectedTB=1.5 / line=1.5, Poisson tail = 0.4422 but NegBin(VMR=1.5) tail = 0.4074, NegBin(VMR=2.0) tail = 0.3813 — NegBin pulls 3-7pp of probability mass off the right tail at typical TB scenarios, exactly the inversion-correction mechanism the 35.7% WR cohort needed. Suppression remains default ON — empirical data drove the suppression, theoretical math drove the fix; we wait for ~30 days of post-r241 propsTracking before lifting. Tunable: window._tbDispersionVMR in console (range ~1.01-2.5). Override: window._tbModelSuppressed = false unlocks the model. Snapshot extended with tbDistribution / tbDispersionVmr / tbNegBinN / tbNegBinP for full reproducibility — pre-r241 entries treated as Poisson if backtest re-scores them.
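
The parameterization in r241 (p = 1/VMR, n = expectedTB/(VMR−1)) can be checked with a small standalone implementation. The two functions below are illustrative stand-ins, not the app's `poissonTailGE` or `ptbNegBinSurvivalGE`; the PMF is built by recurrence so a non-integer n needs no gamma function.

```javascript
// P(X >= k) for X ~ Poisson(mean).
function poissonTail(k, mean) {
  let pmf = Math.exp(-mean); // P(X = 0)
  let cdf = 0;
  for (let i = 0; i < k; i++) {
    cdf += pmf;
    pmf *= mean / (i + 1);   // P(X = i + 1) from P(X = i)
  }
  return 1 - cdf;
}

// P(X >= k) for a Negative Binomial with the given mean and variance-to-mean ratio (vmr > 1).
function negBinTail(k, mean, vmr) {
  const p = 1 / vmr;
  const n = mean / (vmr - 1); // mean = n(1 - p)/p is preserved exactly
  let pmf = Math.pow(p, n);   // P(X = 0)
  let cdf = 0;
  for (let i = 0; i < k; i++) {
    cdf += pmf;
    pmf *= ((n + i) / (i + 1)) * (1 - p); // P(X = i + 1) from P(X = i)
  }
  return 1 - cdf;
}

// Over 1.5 total bases means X >= 2.
console.log(poissonTail(2, 1.5).toFixed(4));     // "0.4422"
console.log(negBinTail(2, 1.5, 1.5).toFixed(4)); // "0.4074"
console.log(negBinTail(2, 1.5, 2.0).toFixed(4)); // "0.3813"
```

These match the pre-ship validation figures quoted above for expectedTB = 1.5 and line = 1.5.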

CLV tracking (r240)

r240 · Closing Line Value tracking for prop bets. Mirrors the existing game-bet CLV pattern: openingOdds captured at track time (both manual and auto-track paths), closingOdds + closing book/Pinnacle pair stamped at autoGrade time via _capturePropClosingOdds helper. Resolves odds per propType from g.propOdds or g.batterPropOdds (matches by line for multi-line TB). Display: inline ↑/↓ CLV pp badge after odds in tracked-prop rows + Avg CLV stat tile in calibration summary card + per-type CLV breakdown. Why CLV matters: it's the leading indicator of edge — a 53% W/L model that beats CLV consistently is finding real price inefficiencies; a 60% W/L model bleeding CLV is just lucky. CLV gives a meaningful signal in ~30 days vs ~300 days for raw W/L. Capture is lazy-at-grade-time (matches game-bet shape), so it works as long as odds were refreshed at some point before game start; misses if user only opens the app post-game once API has dropped the market. Pre-r240 entries gracefully show "—" until they get re-graded.
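
A small sketch of a CLV figure in percentage points. It assumes CLV is measured as the closing implied probability minus the opening implied probability for the side that was bet (a common convention; the entry does not spell out its formula). The function names are illustrative.

```javascript
// Implied probability from American odds.
function impliedProb(americanOdds) {
  const a = Math.abs(americanOdds);
  return americanOdds > 0 ? 100 / (a + 100) : a / (a + 100);
}

// CLV in pp: positive when the market closed shorter than the price you took.
function clvPp(openingOdds, closingOdds) {
  return (impliedProb(closingOdds) - impliedProb(openingOdds)) * 100;
}

// Average CLV over tracked props, skipping entries with no closing odds captured.
function avgClvPp(entries) {
  const graded = entries.filter((e) => e.openingOdds != null && e.closingOdds != null);
  if (graded.length === 0) return null;
  const total = graded.reduce((sum, e) => sum + clvPp(e.openingOdds, e.closingOdds), 0);
  return total / graded.length;
}

console.log(clvPp(-110, -125).toFixed(2)); // "3.17"
```

Returning `null` for an empty set mirrors the "—" shown for entries without captured closing odds.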

Marcel-blend infrastructure (r229 → r239)

r239 · Marcel consistency cleanup. Two surgical fixes: (1) KP1 pattern (Power-K + high-K lineup) now fires on k9Season ≥ 11.0 instead of blended k9 ≥ 11.0 — aligns with the HR/TB convention where patterns key off stable season values rather than transient blended form. Pre-r239 a pitcher whose season K/9 was below 11 but whose blend pushed it over would fire KP1, double-counting recent form (the blend is already in the expectedK). (2) runBlendBacktest() gate reads minSamples/minDenom dynamically from window.BLEND_CONFIG with the same fallback chain marcelBlend itself uses. Pre-r239 the harness hardcoded the gate values, so if you mutated window.BLEND_CONFIG.batter_hr.minSamples = 50 in console the live model would honor it but the harness wouldn't. Now they stay in lockstep.

r238 · Marcel-blend HRPG on the HR analyzer. analyzeBatterHRBets now calls marcelBlend() for HRPG. r236 fetch extended to compute recentHrpg from the existing lastGames[].hr data (sum_HR / recentGames). Uses BLEND_CONFIG.batter_hr (default 40/60 recent/season — HR is rarer than TB so the default weights season more, 30-PA gate). Snapshot adds hrpgSeason, hrpgRecent, hrpgBlended, hrpgSource, recentPA, recentGames, plus weatherMult/pitcherMult/disciplineMult/overFair on batterStatsAtAnalysis for backtest reconstruction. runBlendBacktest() extended with a batter_hr branch — recomputes expectedHR through the full multiplier chain, then probHR = 1−exp(−expectedHR) compared to captured overFair (HR-specific direction because line is always 0.5 but probHR > 0.5 only when expectedHR > ln 2). Live behavior changes — unlike TB which is suppressed, HR is active, so blended HRPG flows into recommendations and tier assignments going forward. barrelPctRecent deferred — needs a separate Statcast Savant fetch path.
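
The HR probability step in r238 is a one-liner under a Poisson assumption for the HR count. The sketch below shows it alongside a hypothetical direction rule that compares the model probability to the captured fair Over probability; the rule's exact form in the app is not given above.

```javascript
// P(at least one HR) when the HR count is Poisson with mean expectedHR.
function probAtLeastOneHR(expectedHR) {
  return 1 - Math.exp(-expectedHR);
}

// Hypothetical direction rule: lean Over only when the model beats the fair Over probability.
function hrDirection(expectedHR, overFair) {
  return probAtLeastOneHR(expectedHR) > overFair ? "Over" : "Under";
}

console.log(probAtLeastOneHR(Math.LN2).toFixed(2)); // "0.50", the ln 2 threshold noted above
console.log(probAtLeastOneHR(0.166).toFixed(3));    // "0.153"
console.log(hrDirection(0.166, 0.12));              // "Over"
```

Because typical expectedHR values sit far below ln 2, the model probability rarely approaches 50%, which is why the comparison is made against the fair price rather than the 0.5 line.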

r237 · Marcel-blend SLG on the TB analyzer. analyzeBatterTBBets now calls marcelBlend() for SLG, mirroring the K analyzer pattern from r229/r233. Uses recent batter form from r236 (last 15 games SLG) blended with season SLG via BLEND_CONFIG.batter_total_bases (default 50/50, 30-PA gate). Snapshot adds slgSeason, slgRecent, slgBlended, slgSource, recentPA, recentGames on batterStatsAtAnalysis. runBlendBacktest() extended with a batter_total_bases branch — reads the new fields, recomputes expectedTB by multiplying blended SLG through the cached multiplier chain (park × pitcher × weather × BvP × xwOBA × hard-hit × GB) with the live floor=0.3, sweeps the same [0/20/40/60/80/100% recent] ratios. Live tier still PASS — TB suppression (r228) intact. The 35.7%-WR inversion is a Poisson tail problem, not a SLG problem; r237 lays the Marcel groundwork so the queued Poisson→NegBin fix slots in cleanly without snapshot migration.

r236 · Recent batter form fetch (gameLog hydration). Mirrors the pitcher lastStarts pattern for batters — same MLB Stats API endpoint, group=hitting instead of pitching, batched 50 IDs at a time. Fetches each batter's last 15 game splits (~57 PA at 3.8 PA/game, comfortably above the 30-PA gate in BLEND_CONFIG). Computes recentSlg from totalBases/AB sums (weighted by AB so a hot 6-AB game outweighs a quiet 2-AB one), recentHrpg from sum_HR / recentGames (added in r238), plus recentPA, recentAB, recentGames, and a lastGames array. Attached to enriched g.{away,home}BatterStats[] entries. Cache key bumped v1→v2. Consumed live by r237 (TB analyzer) and r238 (HR analyzer).

r233 · BLEND_CONFIG + marcelBlend() utility + runBlendBacktest() harness. Centralized blending utility for season ↔ recent-form weighting across every model. Default config: pitcher_k 60/40, pitcher_outs 60/40, batter_total_bases 50/50, batter_hr 40/60, team_rpg 40/60. Outs snapshot extended with k9Season+k9Recent+ipSeason+ipRecent+ipBlended for backtest. window.runBlendBacktest() sweeps blend ratios [0/20/40/60/80/100% recent], reports leaderboard sorted by hypothetical -110 ROI. Going-forward only — needs ~30 days of post-r233 graded plays for reliable signal.
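
A simplified two-component sketch of a season/recent blend with a sample gate, in the spirit of `marcelBlend()`. The real utility's signature is not shown above, so `blendStat`, the config field names (`recentWt`, `minSamples`), and the returned shape are all assumptions; the weights and 30-PA gate follow the defaults listed for the batter models.

```javascript
// Illustrative config: first number in "50/50" or "40/60" is the recent-form weight.
const BLEND_SKETCH = {
  batter_total_bases: { recentWt: 0.5, minSamples: 30 },
  batter_hr: { recentWt: 0.4, minSamples: 30 },
};

// Blend recent form with the season value, falling back to season when the sample is thin.
function blendStat(season, recent, recentSamples, cfg) {
  if (recent == null || recentSamples < cfg.minSamples) {
    return { value: season, source: "season" };
  }
  const value = cfg.recentWt * recent + (1 - cfg.recentWt) * season;
  return { value, source: "blend" };
}

// A ratio sweep like the backtest harness uses: 0% to 100% recent in 20-point steps.
const sweep = [0, 0.2, 0.4, 0.6, 0.8, 1].map(
  (w) => blendStat(0.400, 0.500, 57, { recentWt: w, minSamples: 30 }).value
);

console.log(blendStat(0.400, 0.500, 57, BLEND_SKETCH.batter_total_bases)); // blended SLG near 0.45
console.log(blendStat(0.400, 0.500, 12, BLEND_SKETCH.batter_total_bases)); // falls back to season
```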

r229 · Marcel-blend K/9 (60% recent / 40% season). Pre-r229 K model used pure season K/9. Backup analysis showed K is the strongest prop model (57.9% WR on agree cohort, n=79) but season-only weighting lags pitcher form swings. Sample gate: ≥3 starts AND ≥5 IP across recent. Snapshot captures k9Season+k9Recent+k9Blended+k9Source for calibration regression. r231 surfaced PROP MODEL CALIBRATION card on Model page (was hidden on Props page).

Calibration analysis & TB suppression (r228)

r228 · TB model suppression based on empirical analysis. Backup analysis on 305 propsTracking entries showed: P K agrees 57.9% (n=79, working ✓), P Outs agrees 46.2% (n=81, marginal), B TB agrees 35.7% (n=31) — INVERTED vs neutral 47.4%. Pattern is consistent with cross-sport literature on Poisson-tail player props (under-predicts variance, fires "agree on Over" too liberally, reality regresses to under). r228 forces TB tier=PASS, recommended:false, units=0 when window._tbModelSuppressed !== false (default true). Tracking continues so propsTracking accumulates calibration data. UI banner with research-mode toggle. Real fix queued: Poisson → Negative Binomial (r235+).

Analyzer governance (r222 → r223)

r222 · Structural opposite-side conflict resolver + same-game concentration cap. Pre-r222 the analyzer could show OVER 57.6% AND UNDER 57.2% on the same line — mathematically impossible. doTotal/doF5Over/doF5Under each computed probabilities independently from book-implied; nothing forced p(over) + p(under) ≤ 1. Fix: after all bet types compute, scan for OVER+UNDER and F5 OVER+F5 UNDER on same line, drop lower-edge side, tag survivor with structural-conflict factor. Plus same-game concentration cap: when 3+ aligned bets remain, scale units by edge rank (1× / 0.75× / 0.5× / floor 0.5u).
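Both halves of r222 can be sketched in a few dozen lines. The code below is one hedged reading, not the shipped resolver: the bet shape, both function names, and the interpretation of the edge-rank ladder (rank 1 keeps 1×, rank 2 gets 0.75×, rank 3 and below get 0.5×, never scaled below 0.5u) are assumptions.

```javascript
// Hypothetical bet shape: { game, market, side, line, edge, units, factors }.
function resolveOppositeSides(bets) {
  const passthrough = [];
  const byLine = new Map();
  for (const bet of bets) {
    if (bet.side !== 'OVER' && bet.side !== 'UNDER') {
      passthrough.push(bet); // ML/RL bets are not part of this check
      continue;
    }
    const key = `${bet.game}|${bet.market}|${bet.line}`;
    const current = byLine.get(key);
    if (!current) {
      byLine.set(key, bet);
    } else if (current.side !== bet.side) {
      // Opposite sides on the same line: keep the higher edge and tag it.
      const winner = bet.edge > current.edge ? bet : current;
      byLine.set(key, {
        ...winner,
        factors: [...(winner.factors || []), 'structural-conflict'],
      });
    }
  }
  return [...passthrough, ...byLine.values()];
}

// Scales units by edge rank once 3+ bets remain on one game.
function applyConcentrationCap(gameBets) {
  if (gameBets.length < 3) return gameBets;
  const multipliers = [1, 0.75, 0.5];
  return [...gameBets]
    .sort((a, b) => b.edge - a.edge)
    .map((bet, rank) => {
      const scaled = bet.units * multipliers[Math.min(rank, 2)];
      // Floor at 0.5u, but never raise a bet above its original size.
      return { ...bet, units: Math.min(bet.units, Math.max(0.5, scaled)) };
    });
}
```

Run against the OVER 57.6% / UNDER 57.2% case from the note, only the higher-edge side survives, carrying the structural-conflict factor.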

r223 · Three-layer F5 sanity check. Triggered by a user finding the PIT@AZ F5 line stamped 7.5 (the full-game value), which produced a phantom +23% edge. Layer 1: tightened isPlausibleF5Line with a ratio guard (n / full > 0.75 rejected, so an F5 line must be at most 75% of the full-game line). Layer 2: edge sanity gate flags total/F5 bets with edge ≥ 15% and adds a warning factor + EDGE-SANITY-FLAG pattern tag. Layer 3: F5 warnings include ratio context (typical 0.50–0.60).
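The first two layers are simple threshold checks. The sketch below illustrates them under stated assumptions: `f5RatioCheck` and `flagSuspiciousEdge` are hypothetical names (the real isPlausibleF5Line signature is not shown in these notes), and the thresholds are the 0.75 ratio and 15% edge quoted above.

```javascript
// Hypothetical helpers; thresholds taken from the r223 note.
function f5RatioCheck(f5Line, fullGameLine, maxRatio = 0.75) {
  const ratio = f5Line / fullGameLine;
  return { ratio, plausible: ratio <= maxRatio };
}

function flagSuspiciousEdge(edgePct, threshold = 15) {
  // Edges this large on totals/F5 are more likely a data error than a real price gap.
  return edgePct >= threshold;
}
```

An F5 line of 7.5 against a full-game 7.5 gives a ratio of 1.0 and is rejected, while 4.5 against 8.5 (about 0.53) falls inside the typical 0.50–0.60 band.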

r225 · HR market parser fix. Pre-r225 only iterated softMap so when soft books hadn't posted HR but Pinnacle had, every batter dropped silently (diagnostic showed "0 with HR odds"). Fix iterates union of soft∪Pinnacle batter names, prefers soft when complete, falls back to Pinnacle. New primarySource field tags 'soft' or 'pinnacle' origin.
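The union-with-fallback logic can be sketched as below. The function name, the map shape (batter name to `{ odds }`), and the definition of "complete" as having a numeric `odds` value are assumptions; only the soft-first preference and the `primarySource` tag come from the note.

```javascript
// Hypothetical merge; "complete" is assumed to mean a numeric odds value is present.
function mergeHrOdds(softMap, pinnacleMap) {
  const names = new Set([...Object.keys(softMap), ...Object.keys(pinnacleMap)]);
  const merged = {};
  for (const name of names) {
    const soft = softMap[name];
    const pinnacle = pinnacleMap[name];
    if (soft && typeof soft.odds === 'number') {
      merged[name] = { ...soft, primarySource: 'soft' };
    } else if (pinnacle && typeof pinnacle.odds === 'number') {
      merged[name] = { ...pinnacle, primarySource: 'pinnacle' };
    }
    // Batters with no usable price from either source are still dropped.
  }
  return merged;
}
```

Iterating over the union rather than over softMap alone is what keeps Pinnacle-only batters from disappearing.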

History & visibility (r217 → r227)

r227 · CRITICAL bug — renderHistory filter parity. Discovered that r212/r218/r221 filter work all landed in renderResults() (orphaned page) instead of renderHistory() (the page the user actually navigates to). Two completely separate History UIs existed. r227 ports tier/odds/divergence dropdowns + matchesFilter logic + search box to renderHistory. The "Divergence" filter (✓ Model agrees / ~ Neutral / ⚠ Model diverges) is finally accessible.

r226 · Prop prediction detail row in History. Expanded panel for prop bets now shows model-vs-book row. K/Outs/TB use count comparison ("predicted 6.8 K vs line 5.5 (+1.3)"); HR uses probability comparison ("model 15.3% HR prob vs 50%"). Tighter thresholds than game-bet divergence because prop signals are noisier.

r217–r219 · Model-divergence instrumentation + dashboard scope toggle. Every game-level bet stamped with predicted-runs-vs-book at write time (status: agree/neutral/diverges, predicted/book/diff/strong). New dashboard card: W/L/ROI by cohort. History filter+chip for divergence. Dashboard scope toggle (today/7d/30d/season) + since-last-visit banner.

Recovery tools & cleanup

r224 · cleanupTodaysResearch / restoreTodaysResearchBackup. Removes auto-tracked entries from patternResearch + oddsResearch for target date (defaults today). Preserves manual + Track entries. Backs up to localStorage ptb_todayResearch_backup_* before mutating. Dry-run by default; { confirm: true } to commit. Companion restoreTodaysResearchBackup() for rollback.

r221 · Dashboard cleanup. Metric strip 8 cells → 5 (P&L+Units folded, Record+WR folded, Last 7 Days replaced by scope toggle). DoW row card removed (heatmap kept). CLV Summary removed (CLV Tracker kept).

r220 · Test harness flake fix. Split run_all.sh into READ_ONLY_HARNESSES + MUTATING_HARNESSES batches. Eliminates race where qa_test/odds_api_url_test read polluted index.html during smoke's inject windows. 35s → 48s wall time but stable.

How the Model Works

Core: For each game the model runs four bet types (ML, RL, Total, F5) through independent probability engines. Each engine starts at 50% and applies weighted adjustments for pitcher quality, bullpen strength, hitting, park, weather, platoon, momentum, travel, and pattern signals. True prob minus implied prob = edge. Edge drives unit sizing.
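The edge calculation rests on converting American odds to an implied probability. A minimal sketch, with hypothetical function names and no vig removal (whether the model strips the book's margin first is not stated here):

```javascript
// Converts American odds to the book's implied probability (vig included).
function impliedProbability(americanOdds) {
  if (americanOdds < 0) {
    return -americanOdds / (-americanOdds + 100);
  }
  return 100 / (americanOdds + 100);
}

// Edge = model's true probability minus the implied probability.
function edgeFor(trueProb, americanOdds) {
  return trueProb - impliedProbability(americanOdds);
}
```

At -110 the implied probability is 110/210, about 52.4%, so a 55% true probability is an edge of roughly 2.6 points; at +150 the implied probability is exactly 40%.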

MODEL_WEIGHTS: All 158 numeric factors live in a single MODEL_WEIGHTS object at the top of the source file. Changing one line calibrates the factor across every function. Adjust only after 50+ graded plays per factor — see Pattern Tracker for live hit rates.

Pattern Tracker: Live W-L records for all 41 patterns are tracked in Supabase and auto-update when you grade results in Log Results. Hit/miss counts shown in Model Notes below are model confidence ratings — for live season data go to Pattern Tracker.

ALL ANGLES: Run the Board Analyzer and check the ALL ANGLES tab — every positive-edge angle across all 15 games, no D8 cap. Use this to build sample size faster. Log angles at 1-2u for research even if not betting full size.

Pattern Confidence Tiers

S-TIER 80%+ · Full edge applied. Min 5u when firing. Historical data is deep and consistent. Bet aggressively.

A-TIER 65–79% · Standard edge. Normal unit scale. Solid signal. Fire when conditions match.

B-TIER 50–64% · Edge -25%. Max 4u sole anchor. Needs a confirmation factor — don't bet B-tier in isolation above 3u.

C-TIER <50% · Monitor only. 2u max. No standalone plays above 2u. Use for parlay legs or research tracking.

Live hit rates are in Pattern Tracker — that's the source of truth for how each pattern is actually performing this season. The confidence ratings here are pre-season research baselines.

All 41 Patterns — v7.1 Firing Conditions

Hit/miss counts → Pattern Tracker page. Descriptions here = firing conditions only.

S-Tier — 80%+ · full edge · min 5u

P4 B-62% · BACKEND vs BACKEND → OVER default. Both starters backend tier. Both typically exit inning 4-5, handing to bullpen early. Default over regardless of park or weather. Min edge 4%. Capped at +8% adjustment. Stacks at hitter parks. Fires in: detectPatterns → doTotal

P6 S-70% · GB/Sinker arm → Under modifier. 55%+ career GB rate → -5% over / +5% under. Applies to both starters. Not overridden in domes or at Coors: the modifier still applies there. Current confirmed arms: Abbott, Kirby, Pallante, Soriano, Holmes. Fires in: detectPatterns → doTotal + doF5Under

P2 B-52% · Under setup checklist — 2+ of 4 criteria. (1) ACE or K/9 10+ arm starting. (2) Dome or pitcher park. (3) Total in bottom 25% of slate. (4) Sub-50°F outdoor OR both teams cold offense L7. Score: 2=MARG 3u · 3=SOLID 5u · 4=FIRE 6u. Fires in: underChecklist → doTotal
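The P2 checklist is a count of satisfied criteria mapped to a label and unit size. A sketch under assumptions: the function name `scoreUnderChecklist` and the boolean field names are hypothetical, the unit sizes are the ones quoted for this firing-condition entry, and scores below 2 are treated as a pass.

```javascript
// Hypothetical scorer; criteria are passed in as pre-computed booleans.
function scoreUnderChecklist(criteria) {
  const score = [
    criteria.aceOrHighK,          // (1) ACE or K/9 10+ arm starting
    criteria.domeOrPitcherPark,   // (2) dome or pitcher park
    criteria.bottomQuartileTotal, // (3) total in bottom 25% of slate
    criteria.coldOrColdOffense,   // (4) sub-50°F outdoor or both offenses cold L7
  ].filter(Boolean).length;
  const ladder = {
    2: { label: 'MARG', units: 3 },
    3: { label: 'SOLID', units: 5 },
    4: { label: 'FIRE', units: 6 },
  };
  return { score, ...(ladder[score] || { label: 'PASS', units: 0 }) };
}
```

Keeping the ladder in one lookup table means a change to the unit sizes touches a single place.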

A-Tier — 65–79% · standard edge · normal units

P1 A-78% · BACKEND + hitter park + wind OUT → OVER. All THREE required: (1) BACKEND starter, (2) hitter-friendly park, (3) wind 10+ mph OUT or temp 80°F+. ACE or GB arm on either side negates. +5% over adjustment. Fires in: doTotal

P3 S-72% · ACE vs ACE → UNDER default. Both starters ACE tier. Full 100% park+weather cap applied. 15% minimum edge required to bet OVER when P3 fires — never violate. +6% under adjustment. Fires in: detectPatterns → doTotal + doF5Under

S3 A-74% · Wind 10+ mph IN → Under. 960-781 (55.1%) since 2005, +6.4% ROI. Wind suppresses HR, turns fly balls into warning-track outs. Does NOT fire in domes. Stacks with P2. +6% under adjustment. Fires in: detectPatterns → doTotal

S2 A-68% · Divisional home dog +100 to +160. +9% ROI since 2005 (Sports Insights). Division familiarity levels field. Public overvalues visitor. Only fires up to +160. +4% true prob to home dog. Fires in: doML

D7 A-68% · Series finale + rest advantage. Fires when: it is the series finale AND my team is rested (2+ days) AND not B2B. Effect: +2% edge for rested team. Does NOT fire if ACE starts for the favorite. Stacks with D6. Fires in: doML

D6 A-70% · Sweep prevention dog. Fires when: team on L2+ losing streak enters the final game of a 3- or 4-game series AND is a dog at +100 to +180 AND does NOT have an ACE starting. +4% true prob to sweep-facing dog. Fires in: doML

B-Tier — 50–64% · edge -25% · max 4u sole anchor

P7 B-68% · Hot team ML dog. Team 7-2+ in first 15 games, priced as dog +100 to +160. Books slow to update from preseason odds. +4% true prob. Requires OPS L15 confirmation — pure record without offensive support gets half credit (+2%). Fires in: doML

P5 B-62% · Clean ML. Tier gap + lineup edge + ML -180 or better. No debut flags, no age caps firing. -160 preferred; -160 to -200 = 4u cap; above -200 = skip. Fires via factor stack in doML

P8 B-60% · Cold + series finale → Under stack. Fires when: temp <50°F AND it is a series finale AND NOT both starters ACE AND NOT dome AND NOT Coors. Stacks D7 (-3%) + B10 cold (-10%). +5% under adjustment. Fires in: detectPatterns → doTotal

D9 C-35% · DANG Under (day game after night game). Both teams played the night before, today is before 5pm ET. Hitter fatigue + pitching prep down. -3 to -4% total lean. Stacks with D7 for -5% max. Does NOT fire: ACE starting, Coors, dome. Fires in: detectPatterns → doTotal

D11 B-62% · Letdown Spot. Series opener (game 1 of new series) + team on 3+ game win streak + opponent below .480 WR. After beating tough opponents, motivation drops against weaker teams. -2.5% on letdown team. Does NOT fire: ACE starter. Source: Action Network, Bet Labs situational research. Fires in: doML

D12 B-58% · Hot Streak Regression. Team on 7+ game win streak as favorite ≤-130. Public piles on, books shade the line. -2% regression. Does NOT fire on dogs (contrarian value remains). Source: Bet Labs 2005-2023 — favs on 7+ W streak: -3.2% ROI. Fires in: doML

S16 B-60% · Bounce-back Dog After Blowout. Team on 2+ game losing streak as dog +100 to +160 vs non-elite (<.580 WR) opponent. Public overreacts to blowout losses, creating contrarian value. +2.5% on bounce-back dog. Source: Bet Labs 2005-2023 — bounce-back dogs: 53% WR. Fires in: doML

S4 B-60% · Wind 8+ mph OUT → Over lean. 1,174-1,045 (52.9%), +3.6% ROI since 2005. Weaker signal than S3. +4% over lean. Stacks with P1 for max over boost. Fires in: detectPatterns → doTotal

S5 B-62% · Low total + quality starters → Under. Total ≤7.5 AND both starters ace/solid. Market pricing a pitcher's duel — under has positive expectation. +4% under lean. Stacks with P3, P2. Fires in: detectPatterns → doTotal

S6 C-47% MANUAL · Fade trendy public over. When public betting over 65%+ and line hasn't moved — sharp money countering. Set line movement to "Public only" to activate. Does NOT fire automatically — requires manual note entry. Action Network data: 15-20 games under .500 when public loves an over at 66%+.

C-Tier · Monitor

S9 B-60% · Divisional road dog + total ≥8.5 → ML lean. Action Network / Sports Insights: div road dogs +71.2u since 2005. Public overvalues home field, undervalues divisional familiarity for road teams. High total ≥8.5 = high variance = levels playing field. +3% ML boost. Range: dog +100 to +200. Fires in: doML

S10 B-58% · Bad team (≤40% W%) after a WIN → ML dog value. Action Network Bet Labs: +12% ROI, 1300+ games since 2005. Public overcorrects after bad team wins — assumes they can't win back-to-back. Creates artificially inflated lines on opponent. Fires when team ≤40% W%, on win streak, and is a dog. +3% boost. Fires in: doML

S11 B-60% · Both winning teams (≥51% W%) + total ≥8 → Contrarian Under. Action Network: under hits 55.1% in these spots, +190.18u since 2005. Public loves to bet over on marquee matchups between two good teams. Books shade total up, creating under value. +3% under boost when both teams ≥51% W% and total ≥8.0. Fires in: detectPatterns → doTotal

Dome Under B-52.7% · Dome/closed roof → Under lean (+2%). Action Network: dome games under 52.7% since 2005, +68.56u. Perfect conditions = ball doesn't carry + no weather variance. Small direct boost applied to all dome under calculations. Fires in: weatherBlock → doTotal

MONITOR · S7 — Dog ML + High Total (10+) → Over lean. Sports Insights Bet Labs 8yr data: ROI increases as total rises when betting dogs. High-total games = high variance = upsets. Dog ML + Over naturally correlated. +3% over lean when total ≥10 and clear dog exists. Currently tagging only — no adj until 50+ plays. Fires in: doTotal

MONITOR · S8 — RL -1.5 gate: predicted total must be ≥8.0. Research: 28% of MLB games decided by exactly 1 run — RL loses even when ML wins. In low-scoring environments (pred total <8), one-run game probability too high. SportsBettingDime: avoid RL in tight/low-scoring matchups. Gate will suppress RL -1.5 recs when model predicts <8 total. Currently tagging only — no suppression until 50+ plays. Fires in: doRL

MONITOR · D10 — Same-game conflict gate: FAV ML/RL + Under suppression. Sportsbooks explicitly ban same-game ML+RL parlays (illegal correlated parlay rule). Our 3-day data: 3/3 FAV ML + Under combos ended in conflict (ML won, under blown up). Exception: ACE vs ACE with line ≥9.5. Gate will suppress Under when FAV ML or RL -1.5 also recommended on same game. Currently tagging only — no suppression until 50+ plays. Fires in: analyzeGame post-processing

P9 C-55% · Pitch-count managed starter vs elite offense → Over lean. Fires when: starter has low confidence rating AND is solid/backend tier AND opposing offense has OPS ≥.760 or RPG ≥5.2. +3% over lean. Blocked at Coors. Fires in: detectPatterns → doTotal

P10 MONITOR · ACE/GB arm at Coors → Under viable. Small sample — insufficient data for B-tier. Tracks only. Promote after 10+ data points.

Hard Cap Rules (always enforced, no exceptions)

D3 — RL gate: ML -200+ → skip RL entirely. -160 to -200 → 4u cap, 10%+ edge required. Research: favorites -150+ = -310u since 2005 despite 63% win rate — vig destroys ROI.

D4 — Tiered totals cap: Pre-Apr 20 = 6u max · Apr 21–May 1 = 8u max · May 2+ = 10u max. Cold weather + unproven rotations persist through late April in Northeast/Midwest.

D5 — Rain/postponement: 40%+ rain = conditional 3u max. 80%+ = skip. Don't lock until 2 hours before first pitch.

D8 — Quality filter: Max 7 plays per slate on the CARD tab. Never 3+ totals same direction. B/C tier cannot sole-anchor plays above 3u. Use ALL ANGLES tab to see remaining plays without the cap.

Age caps: Age 38-40 = -2u · Age 41+ = 3u HARD CAP non-negotiable.

ML hard cap: True prob capped at 78% ML · 70% totals. No play can exceed these regardless of model output.
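The date, age, and probability caps above reduce to a few comparisons. The sketch below is one hedged reading: all three function names are hypothetical, April 20 itself (which the D4 text leaves unassigned) is treated as part of the early window, and bet types other than ML and totals are left uncapped because the text names no cap for them.

```javascript
// D4: tiered totals cap by calendar date (month 1-12, day 1-31).
function totalsUnitCap(month, day) {
  const monthDay = month * 100 + day;
  if (monthDay <= 420) return 6; // Apr 20 handling is an assumption
  if (monthDay <= 501) return 8; // Apr 21 through May 1
  return 10;                     // May 2 onward
}

// Age caps: 38-40 subtracts 2u, 41+ is a hard 3u ceiling.
function applyAgeRule(units, pitcherAge) {
  if (pitcherAge >= 41) return Math.min(units, 3);
  if (pitcherAge >= 38) return Math.max(0, units - 2);
  return units;
}

// True-probability ceiling: 78% on ML, 70% on totals.
function capTrueProb(trueProb, betType) {
  const caps = { ML: 0.78, TOTAL: 0.7 };
  const cap = caps[betType];
  return cap === undefined ? trueProb : Math.min(trueProb, cap);
}
```

Encoding the date as month × 100 + day avoids constructing Date objects, so the check is not affected by time zones.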

Primary Model Factors (B-tier modifiers)

B2 — Pitcher tier: ACE / SOLID / BACKEND rated manually or via AUTO (uses ERA/WHIP/K9 from MLB API). AUTO sets SOLID as default when data is ambiguous.

B3 — Pitcher stats: Marcel blend = 60% recent / 40% prior year when ≥5 starts. FIP-ERA gap ≥0.7 = regression signal (±4-6%). K/9 ≥10 = K/9 boost. BB/9 ≥4.5 = command drag.

B4 — Rest days: B2B = -2% ML / -0.20 runs. Rested 2+ days = series finale boost eligible (D7). Travel: 2+ timezone crossing east = -2.5% for early games.

B7 — Bullpen: Elite BP ≤3.20 ERA/≤1.10 WHIP = +8% · Strong = +4% · Weak = -5% · Poor = -10%. Opp BP scaled at 60%. TAXED/WARM/FRESH arm status shown in analysis output.

B9 — Park factors: Hitter parks: GABP, CBP, Wrigley, Busch, Yankee Stadium, Coors (+large altitude bonus). Pitcher parks: Petco, Oracle, T-Mobile, Citi, Dodger Stadium. Full park DB with HR/over adjustments for all 30 stadiums.

B10 — Weather: Sub-45°F = -10% over · 45-49°F = -10% · 50-59°F = -5% · 80°F+ = +5% · 90°F+ = +10%. Wind 10-14mph OUT = +7% · 15+mph OUT = +12% · 10-14mph IN = -7% · 15+mph IN = -12%. Cold + wind IN stack fully. Oracle Park wind = ignore (architecture neutralizes). Dome = no factor.
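The B10 bands can be expressed as a small lookup. This is a sketch under assumptions: the function and parameter names are hypothetical, 60–79°F is treated as neutral, "stack fully" is read as simple addition of the temperature and wind terms, and the result is the adjustment to the over in percentage points.

```javascript
// Hypothetical weather adjustment on the over, in percentage points.
function weatherOverAdjustment({ tempF, windMph = 0, windDir = 'NONE', dome = false, ignoreWind = false }) {
  if (dome) return 0; // dome or closed roof: no weather factor
  let adjustment = 0;
  if (tempF < 50) adjustment -= 10;       // sub-45 and 45-49 both map to -10
  else if (tempF < 60) adjustment -= 5;
  else if (tempF >= 90) adjustment += 10;
  else if (tempF >= 80) adjustment += 5;
  // ignoreWind covers parks such as Oracle where wind readings are disregarded.
  if (!ignoreWind && windMph >= 10) {
    const size = windMph >= 15 ? 12 : 7;
    if (windDir === 'OUT') adjustment += size;
    else if (windDir === 'IN') adjustment -= size;
  }
  return adjustment;
}
```

A 44°F game with a 16 mph wind blowing in comes out at -10 + -12 = -22, which shows the cold and wind terms stacking in full.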

B11 — Umpire zone: Wide = -4% totals/favors pitcher · Tight = +4% favors offense · Neutral = 0%.

B12 — Line movement: Now fully automatic via Pinnacle data. Signals and weights:
• RLM (Pinnacle 8¢+ tighter than DK/FD): +5% · Steam (Pinnacle moved 10+ pts): +7%
• Move 5–9 pts: +2% · Move 10–19 pts: +4% · Move 20–29 pts: +7% (Sports Insights 57.3%) · Move 30+ pts: +11% (61% win rate)
• Fav → dog flip: +13% on new dog (Bet Labs 63%) · Dog → fav flip: +10% if Pinnacle confirms, −5% if public steam
• Total crosses key number (9/8.5/8/7.5/7/6.5): +5% toward benefiting side
• S6 proxy: line held + Pinnacle under better than DK: −4% over. All signals auto-detected — no manual input needed.

Precision Factors (C-tier)

C2 — Run differential: Run diff +20 in L15 = +3% (genuine quality signal). Run diff -20 = -3%. Early W-L records dominated by 1-run game luck — unreliable before 30 games.

C5 — xERA/FIP regression: ERA-FIP gap ≥1.0 = ±4-6% correction. Never bet against strong positive regression. Example: starter ERA 6.75 vs xERA 2.27 = market pricing luck, bet the regression.

C6 — Early-season lag (Mar-Apr 20): Hitter trajectory discounted -2.5%. Small samples on offense — model applies April timing adjustment automatically.

Platoon advantage: Lineup handedness vs pitcher hand. Strong platoon (≥40% OPS delta) = +1.5% ML · Mild (≥25%) = +0.8%.

Streak context: Cold streak 6+ games = -1.5% (not applied below 6 games — small sample noise). Hot home streak = no bump (public trap).

Edge → Units Scale

Adjustments: B-tier sole anchor = -1u, cap 4u. C-tier = 2u max. Age 41+ = 3u hard cap. Debut + negative record = 3u hard cap. D4 totals cap applies on top of edge sizing. D7 series finale = -1u from final size.

Calibration Log

Record weight adjustments, confirmed pattern hits, new GB arms, rule updates. These notes persist in Supabase.

All Patterns — Current Confidence Ratings

S-Tier — 80%+ · full edge · min 5u

B-62% · Pattern 4 — BACKEND vs BACKEND → OVER default. 6/6 hits — 100%. Zero misses. Both xERA 4.5+. Both exit by inning 4-5. Bullpen chaos. Default over regardless of park or weather. Min edge 4%. Min units 5u. Stack +3% at hitter park (GABP, CBP, Coors). Confirmed: CWS/MIA (13R), PIT/CIN (11R), MIA/CWS (11R), MIL/KC ×3 (10R, 13R, 13R). ↳ recalibrated B-62% from observed S-92% — small early-season sample.

S-70% · Pattern 6 — Sinker/GB suppression → under modifier. 5/5 — 100%. 55%+ career GB rate → -5% over / +5% under. Stacks on tier. Applies even at Coors (Sanchez confirmed — Coors went under both games). Confirmed arms: Holmes (60%+), Abbott (58%), Kirby (55%+), Pallante (67%), Soriano (MLB GB leader), Sanchez (PHI). ↳ recalibrated S-70% from observed S-88% — sample stabilization.

B-52% · Pattern 2 — Under setup checklist. 5/6 — 83%. Confirm 2+ of 4: (1) ACE or K/9 10+ arm, (2) dome or pitcher park, (3) total in bottom 25% of slate, (4) sub-50°F outdoor OR both teams cold offense L7. Score: 1=PASS · 2=MARG 2.5u · 3=SOLID 3.5u · 4=PLAY+ 4.5u. ↳ recalibrated B-52% from observed S-82% — actual WR ~50%.

A-Tier — 65–79% · standard edge · normal units

A-78% · Pattern 1 — Over setup (BACKEND + hitter park + wind). 3/4 — 75%. All THREE required: (1) BACKEND arm, (2) hitter-friendly park, (3) wind 10+ mph OUT or 80°F+. ACE/GB arm can negate. Confirmed: WSH/PHI (15R), CWS/MIA (13R), LAD/WSH ×2 (15R, 14R).

S-90% · Pattern 3 — ACE vs ACE → UNDER default. Both K/9 10+. Full 100% park+weather cap. 15% min edge for over. NEVER play over below 15% edge. Biggest loss ($400) came from violating this rule. ↳ promoted to S-90% from A-72% — high-conviction signal.

A-70% · D6 — Sweep prevention. 2/3 — 67%. Team facing sweep (0-2 in series), priced as dog +100+. +3% true prob to sweep-facing team. Best use: fade sweeping team at -200+ when true odds ~-110. Confirmed: ATH beat HOU +240, MIA beat NYY.

A-68% · D7 — Series finale + rest (the Sunday rule). All 3 must fire: (1) series finale or Sunday/Monday fav playing 3+ straight, (2) fav -150+ with resting regular or no off-day, (3) national TV or marquee team. Effect: -3% fav edge, -1u, +2% dog. Does NOT fire if ACE starts for fav. Stacks with D6 — max +5% combined dog bump.

B-Tier — 50–64% · edge -25% · unit cap -1u · needs confirmation

B-58% · Pattern 7 — Hot team ML dog. 3/3 (small sample). Team 7-2+ in games 1-15, priced as dog +100 to +160. Books still on preseason pricing. +4% to true prob. Requires clean matchup. Confirmed: STL swept DET, MIL beat BOS, PIT beat BAL.

B-75% · Pattern 5 — Clean ML. 4/7 — 57%. Tier gap + lineup edge + -180 or better. No debut flags, no age caps. Losses from ignoring D-tier, not the pattern itself. ↳ recalibrated B-75% from B-62% — strongest B-tier signal.

B-75% · Pattern 8 — Cold + series finale → under stack. 1/1. D7 fires AND sub-50°F outdoor: primary play is UNDER. Stack D7 (-3%) + B10 cold (-10% to -14%). Confirmed: CWS/TOR cold series finale = 3 total runs. ↳ recalibrated B-75% from B-60% — stack effect verified.

Sharp Patterns — statistically backed, season-long ROI data

A-74% · S3 — Wind 10+ mph IN → Under. 960-781 record (55.1%) since 2005, +6.4% ROI (Action Network Bet Labs). Wind turning HR into warning-track outs. Does NOT fire in domes. Stacks with P2 under checklist. Best at: Wrigley, Fenway, Cleveland.

A-68% · S2 — Divisional home dog (+100 to +160). +9% ROI in home RLM system since 2005 (Sports Insights). Div familiarity levels field — public overvalues visitor. Only fires for dogs up to +160. Best value early season when books are slow to adjust. Stacks with D7 series finale.

C-35% · D9 — DANG Under (day game after night game). Both teams played night game prior day, today's game before 5pm ET. Hitter fatigue, pitching prep down. -3 to -4% total lean. Stacks with D7 series finale for max -5%. Does NOT fire: ACE starting, Coors, dome. ↳ recalibrated C-35% from B-65% — signal underperformed once n grew.

B-62% · S5 — Low total (≤7.5) + quality starters → Under. When books set a low total AND both starters are solid/ace quality, the market is pricing in a pitcher's duel. Under has positive expectation. Stacks with P3, P2.

B-60% · S4 — Wind 8+ mph OUT → Over lean. 1,174-1,045 (52.9%), +3.6% ROI since 2005. HR booster angle. Weaker signal than S3. Stacks with P1 (BACKEND + hitter park + wind OUT) for max over boost.

C-47% · S6 — Fade trendy public over. When public heavily betting over (65%+) and total hasn't moved up — books holding line = sharp under money countering public. Select line movement dropdown to "Public only" to activate. Action Network: 15-20 games under .500 when public loves an over at 66%+. ↳ recalibrated C-47% — manual-fire only (status:manual in PATTERN_BASE).

C-Tier — monitor only · 1u parlay legs · no standalone

C-55% · Pattern 9 — Post-surgery ACE + elite offense → over lean. Starts 2-5 post-TJ/surgery, facing wRC+ 108+. 5 IP max effective. +3% over. Model IP at 5 max. Active arms: McClanahan, Woodruff.

B-55% ⏳ · Pattern 10 — ACE/GB at Coors → under viable. Strong groundball arm starting at Coors flips the default Coors-Over assumption. +2% to Under. Best stacked with P3 (ACE vs ACE) or P6 (Sinker/GB modifier). ↳ promoted from MONITOR to B-55% lowData — recalibrate after n≥30.

A-Tier — Sharp / Steam (added)

A-70% · S15 — RLM + Steam: both Pinnacle signals same ML side. When Pinnacle's reverse-line-movement detector AND its steam detector both fire on the same ML side, that's a stack of two independent sharp signals. +3-4% to that side. Strongest A-tier confirmation pair we have. Skip if total moved 0.5+ in opposite direction (mixed signals).

A-68% · S14 — Correlated steam: ML + total same direction. When ML steam and total steam point the same way (sharp money on fav AND sharp money on Over, or sharp money on dog AND sharp money on Under), the correlation matters more than either alone. +2.5% confirmed when both fire.

B-Tier — Modifiers & sharp signals (added)

B-63% · S13 — Pinnacle total steam (0.5+ pts from open). Pinnacle's total moved 0.5+ from opening despite no major news (no scratched starter, no weather) → sharp money has rebalanced the line. Side that Pinnacle moved TOWARD gets +2% confidence. Reset on lineup announcements.

B-62% · D11 — Letdown spot (series opener after streak). Hot team (3+ W streak) opening a new series vs a weak opponent. Public overvalues the streaked side. -2.5% to streaked fav. Stacks with D12 (hot streak regression).

B-61% · B8 — Both bullpens ERA 5.0+ → Over lean. When both teams' bullpen ERAs (last 30d) are 5.0+, late-inning collapse risk is doubled on each side. +2% to total. Stacks with P4 backend-vs-backend.

B-60% · S9 — Divisional road dog + total ≥8.5. +71.2u since 2005 (Action Network). Div road dogs win more often than market prices when totals are ≥8.5 — high-scoring environment levels variance. +3% to dog ML.

B-60% · S11 — Both winning teams + total ≥8 → Contrarian Under. +190u since 2005 (Action Network). When both teams are winning programs and the total is ≥8, the public stacks the over and books shade it; under has +EV. -2.5% to total.

B-60% · S16 — Bounce-back dog after blowout loss. Team coming off a 7+ run loss, priced as +100/+160 dog vs non-elite. 53% WR (Bet Labs). +2.5% to dog ML.

B-58% · S10 — Bad team (≤40% W%) after a win. +12% ROI across 1300+ games (Action Network). Sub-.400 team that just won, priced as a dog. Books slow to adjust on bad teams. +3% to ML.

B-58% · D12 — Hot streak regression (7+ W streak fav ≤-130). -3.2% ROI on heavy favs riding 7+ game streaks (Bet Labs). Mean reversion priced wrong. -2% to fav.

Low-Data Research Patterns — calibration values are placeholders (recalibrate after n≥30 each)

A-66% ⏳ · S18 — Extreme line movement (≥7%). Line moved 7%+ in implied prob since open. Documented WR ~66% in published research, but our own n is small. Direction depends on whether the move is sharp-driven or steam-chasing. LOW-DATA.

B-60% ⏳ · S19 — Cross-market total divergence (Pinnacle vs DK ≥3¢). Pinnacle's total juice differs from DK's by 3+ cents with no temporal steam — implies the books disagree on a lean. Side Pinnacle juices toward gets +2%. LOW-DATA.

B-60% ⏳ · S8 — RL -1.5 gate: only when pred total ≥8. Heavy fav -1.5 RL is only profitable in higher-total environments where the run cushion is reachable. Below predicted total of 8, RL -1.5 underperforms vs ML. Suppresses RL -1.5 plays when total <8. LOW-DATA.

B-60% ⏳ · D13 — F5 early-season edge (Apr–May 1). When both starters are quality and we're pre-May 1, F5 Under has +2% lean. Cold weather + unproven bullpens haven't entered yet. LOW-DATA.

B-60% ⏳ · P11 — Bullpen meltdown: opp taxed HL + capable offense → Over. Opponent's high-leverage relievers are taxed (multiple appearances last 3 days) AND our team's offense ranks top-half wRC+. +2% to Over. LOW-DATA.

B-60% ⏳ · P12 — Both teams B2B → bullpens taxed → Over lean. Both teams played yesterday (no off-day for either) → expected bullpen usage ≥3 IP each, increasing late-inning runs. +2% to Over (r300: doc synced with code — r290 cut to ±1%, r294 reverted to ±2%, doc was never updated). LOW-DATA.

B-58% ⏳ · S17 — Combined lineup K% vs MLB avg. Both lineups in top quartile of K% → -1.5% to total (Under lean). Both in bottom quartile → +1.5% to total (Over lean). Pitcher-aware multiplier. LOW-DATA.

B-58% ⏳ · S7 — Dog ML + High Total (10+) → Over lean. Underdog ML + total ≥10 → high-variance environment favors Over (dog more likely to score in shootout). +1.5% to Over. LOW-DATA.

B-56% ⏳ · D10 — Same-game conflict: FAV ML/RL + suppress Under. When FAV ML or RL fires AND a separate Under fires on the same game, the Under is structurally opposed (fav covering implies higher total, generally). Suppresses Under -1.5%. Tag-only at low data — will harden once 50+ conflicts confirm. LOW-DATA.

B-55% ⏳ · D14 — April early-season — short starter outings → Over lean. Pre-May 1, starters average 5.0 IP instead of 5.7 (cold weather + unproven), pushing more innings to bullpens. +1.5% to Over. LOW-DATA.

r117 Doubleheader Patterns — research mode (Game 1 loser bouncing back is intentionally NOT included — coin flip)

C-55% ⏳ · DH1 — Game 2 of doubleheader → Over lean. G1 bullpens taxed → expect more runs in G2 once starters exit. Placeholder +0.5pp on Over and F5 Over. PLACEHOLDER — recalibrate after n≥30.

C-54% ⏳ · DH2 — Game 2 of doubleheader → home ML fade. Home edge documented to diminish in G2 (both teams played the same earlier game; routine effects shrink). Placeholder -0.5pp on home ML. PLACEHOLDER — recalibrate after n≥30.

C-55% ⏳ · DH3 — Game 2 of doubleheader → +1.5 dog value. RL -1.5 less reliable in G2 (similar logic — the home edge that powers a 2-run win is weaker). Placeholder -0.5pp on RL -1.5. PLACEHOLDER — recalibrate after n≥30.

Tier A — Pattern Recognition

A1 · Pattern-first: Match to highest-confidence pattern first. Never play against your own pattern signal.

A2 · Dominant starter override: Single K/9 10+ → cap park+weather 50%. Both K/9 10+ → 100% cap. ACE vs ACE = 15% min edge for over.

A3 · Sinker/GB modifier [S-88%]: 55%+ GB → -5% over / +5% under. Stacks always. Even at Coors. Confirmed: Holmes, Abbott, Kirby, Pallante, Soriano, Sanchez PHI.

A4 · Start-2 regression: Power-K (K/9 11+) = HALF penalty — bounces back faster (Mize confirmed). Command/GB = full ACE→SOLID through start 3.

A5 · Pitch count managed: Post-surgery starts 2-5 = PITCH COUNT MANAGED. Model as 5 IP max effective. Bullpen carries 6-9. Current: McClanahan (post-TJ), Woodruff (building up).

Tier B — Primary Modifiers

B1: Implied odds → true prob. Edge = true − implied. Min 4% to consider.

B2: Pitcher tier: ACE / SOLID / BACKEND + Power-K / Sinker-GB / Command / Cutter.

B3: Pitcher form: L5 ERA, xERA, FIP, SIERA, WHIP. FIP-ERA gap 0.7+ = regression. BABIP outlier = luck flag.

B4: Rest: ≤3d = -4% · 4-5d = 0% · 6-7d = +2% · 8+ = -3% rust.

B5: Lineup wRC+: 70% weight games 1-15. Steamer projections. Top-3 order most important.

B6: Platoon splits: Elite wOBA vs hand = +5% · Good = +2% · Weak = -4%.

B7: Bullpen: ERA ≤3.50 = +5% · ERA 4.20+ = -7% · TAXED (15+ IP last 3d) = -5% · Closer on IL = -3%.

B8: Team form: Runs scored L7 AND runs allowed L7 split separately.

B9: Ballpark: Hitter = GABP, CBP, Wrigley, Busch, Yankee. Pitcher = Petco, Oracle, T-Mobile, Citi, Dodger. Coors = +20% over.

B10: Weather: 90°F+ = +10% · 80-89°F = +5% · 60-79°F = 0% · 50-59°F = -5% · Sub-50°F = -10% · Sub-45°F = -10% (capped, was -14% — research shows -14% overstated). Wind 10-14mph OUT = +7% · 15+mph OUT = +12% · 10-14mph IN = -7% · 15+mph IN = -12%. Cold + wind IN = stack fully. Oracle Park: architecture neutralizes wind — ignore Oracle readings. Dome/closed = no factor.

B11: Umpire zone: Wide = -4% · Tight = +4% · Neutral = 0%.

B12: Line movement: Sharp = +5% · Public only = -4% · None = 0%.

Tier C — Precision Factors

C1: BvP: 20+ PA for projection. Any negative career record vs opponent on debut = unit cap (D1).

C2: Replace record-based bumps with run differential. Run diff of +20 or better in the first 15 games = +3% (genuine quality signal). Run diff of -20 or worse = -3%. Early W-L records are dominated by 1-run game luck and are unreliable before 30 games. Separately: 5+ hitters hot (.300+/OPS .900+) = +4-6%. 5+ cold = -4-6%.

C3: Pitcher type vs lineup: Power-K vs high-K = extra suppression. Soft-toss vs pull hitters = damage risk.

C4: F5 evaluation: Run on every strong starter mismatch. Manual grade required.

C5: xERA/xFIP regression: Gap 1.0+ = ±4-6%. Never bet against strong positive regression. Gilbert xERA 2.27 vs ERA 6.75 = market pricing ERA → SEA ML value.

C6: Early-season discount (games 1-15): Form 70% weight. 0-3/0-4 teams = +5% regression bump.

Tier D — Hard Override Rules

D1: Debut + pitch count: New team / return 6+ months / Japan-KBO = +15% variance. Negative career record vs opponent = 3u HARD CAP. Post-surgery starts 2-5 = PITCH COUNT MANAGED (5 IP max). Current: McClanahan, Woodruff.

D2: Age: 38-40 = -2u · 41+ = 3u HARD CAP non-negotiable. Active: Scherzer (TOR) 41+ · Sale (ATL) 38.

D3: RL threshold updated — ML -200+ skip entirely (aligned with P5 gate). ML -160 to -200 = 4u cap requires 10%+ edge. Research: favorites at -150+ = -310u since 2005 despite 63% win rate.

D4: Tiered totals cap — Pre-Apr 20: max 6u. Apr 21–May 1: max 8u. May 2+: full 10u. Extended from Apr 20 because early-season variance (cold weather, unproven rotations) persists through May in Northeast/Midwest parks.

D5: Postponement: 40%+ rain = conditional 3u max. 80%+ = skip. Don't lock until 2h before first pitch.

D6 [A-70%]: Sweep prevention: Game 3 of 3-game series. One team won all prior. Sweep-facing dog gets +3%, sweeping team -3% confidence. Best use: sweeping team at -200+ when true odds ~-110.

D7 [A-68%]: Series finale + rest: All 3 must fire: (1) series finale or Sunday/Monday fav playing 3+ straight, (2) fav -150+ with resting regular or no off-day, (3) national TV or marquee team. Effect: -3% fav, -1u, +2% dog. ACE starting for fav = rule does not fire.

D8: Quality filter: Max 7 plays/slate. Never 3+ totals same direction. B/C tier cannot sole-anchor plays above 3u.

Cohort Pattern Tags post-r276 · forward-tracking, not yet matured

These pattern tags are added to plays automatically by analyzers (r493 HR cohorts, r500 K cohorts). They differ from the v7.1 patterns above in two ways: (1) most were discovered by backward-fit cohort analysis on existing data, so their in-sample ROI overstates true forward performance — the CONFIRMED tag indicates the Wilson 95% CI lower bound cleared the break-even threshold at discovery, the TENTATIVE tag indicates direction is right but CI didn't clear, and WATCH tags are observation-only. (2) live W/L records accumulate via the cohort verdict dashboards on the Calibration sub-tab (HR Cohort Verdicts, TRACK Ships) — those panels are the source of truth, this card is the reference catalog. Each tag is visible in firedPatterns on every play and renders inline on the History table.
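
The CONFIRMED / TENTATIVE distinction rests on the Wilson 95% interval. A minimal sketch of that check follows. wilsonInterval and cohortTag are hypothetical names for illustration, the break-even rate is passed in by the caller, and the 'NONE' return value is an assumption for cohorts whose direction is wrong.

```javascript
// Wilson score interval for a binomial proportion (z = 1.96 for ~95%).
function wilsonInterval(wins, n, z = 1.96) {
  if (n <= 0) throw new RangeError('n must be positive');
  const p = wins / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return { lower: center - half, upper: center + half };
}

// Hypothetical tagging rule mirroring the description above.
// breakEven is a win rate in [0, 1], supplied by the caller.
function cohortTag(wins, n, breakEven) {
  const { lower } = wilsonInterval(wins, n);
  if (lower > breakEven) return 'CONFIRMED';   // CI lower bound clears break-even
  if (wins / n > breakEven) return 'TENTATIVE'; // direction right, CI does not clear
  return 'NONE';                                // assumption: no tag otherwise
}

// 14 wins in 18 plays gives roughly 55% to 91%.
console.log(wilsonInterval(14, 18));
```

Against a 50% threshold, 14 of 18 tags as CONFIRMED while 10 of 13 tags as TENTATIVE, because its lower bound lands just under 50%.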

HR Cohorts (r493 — observation only, no behavior change)

Status: All 9 BHR cohorts are forward-tracking with verdict gates. None have reached n≥30 graded plays in their post-r493 window yet — verdicts show INSUFFICIENT (n<30) on the Calibration · HR Cohort Verdicts dashboard. Behavior change deferred until verdicts mature.
BHR1 Barrel · BHR2 xwOBA · BHR3 Hard-hit · BHR4 BvP-Mash · BHR5 Pull+Drag · BHR6 Strong-Drag · BHR7 Low-Pull · BHR8 Hot-Team · BHR9 Wind-Out
↳ See the live dashboard at Calibration · HR Cohort Verdicts for the latest n and W/L counts.

K Cohorts (r500 — live gates on bleeders)

CONFIRMED · KP-W2 LOW-K-LINEUP (K Under). Opposing lineup K-rate 0.20-0.23, non-ace pitcher. In-sample: n=18, 77.8% WR (CI 55-91%, clears 50%), +49.7% ROI. Action: bet normally. ↳ Strongest single finding in the r500 audit.

TENTATIVE · KP-W1 ACE-CLEAN (K Over). Ace pitcher + model agrees with Pinnacle (disagree <10pp). In-sample: n=13, 76.9% WR, +46.3% ROI, but CI lower bound is 50%, so it does not formally clear. Action: bet normally; revisit at n=30.

TENTATIVE · KP-W3 ACE-EXTREME (K Over). Ace pitcher + model 20pp+ above Pinnacle. n=20, +38.0% ROI. Direction looks right, but the adjacent bucket (KP-B2) is opposite-signed, so this may be non-monotonic noise. Action: bet normally; flagged for re-evaluation.

CONFIRMED BLEED · KP-B4 UNDER-HOT. K Under, model 15pp+ above Pinnacle. n=11, 18.2% WR (CI upper bound 48%, clears 50% on bleeder side), -68.2% ROI. Tag only: 9 of 11 overlap KP-B3, so the SKIP gate fires via that route; adding KP-B4 as a standalone gate adds <0.5u counterfactual lift.

SKIP GATE · KP-B1 OVER-HOT-NONACE. K Over, non-ace + model 10pp+ above Pinnacle. n=50, 46% WR, -15.8% ROI. Tier forced to PASS, units=0. Largest single bleed cohort. ↳ Counterfactual: the gate would have avoided a 7.89u loss over n=50.

SKIP GATE · KP-B2 OVER-ACE-MIDGAP. K Over, ace pitcher + disagree 10-20pp. n=20, 40% WR, -27.8% ROI. Sub-cohort hidden inside the otherwise-winning ACE cohort. Adjacent buckets (disagree <10pp and 20pp+) are profitable; the pattern is non-monotonic, so this is the most likely candidate for forward retirement if the band doesn't hold up. ↳ Most aggressive of the three r500 gates.

SKIP GATE · KP-B3 UNDER-MID-OPPK. K Under, opp lineup K-rate 0.23-0.25 (mid-band trap). n=30, 33.3% WR, -38.2% ROI. Mechanically the complement to KP-W2 (low-K → Under good; mid-K → Under bad).

WATCH · KP-O1 UNDER-LOWK-ACE. K Under, low-K lineup, but ace pitcher (n=3 inside the KP-W2 winner). Suspected bleed: aces strike out low-K lineups too, so the K Under thesis weakens. Tag only, no gate.

Forward-track via Calibration · TRACK Ships Verdict Report. r500 SKIP gates can be reverted in console with: window._kRecAllowBleeders = true.
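
To make the three r500 SKIP gates concrete, here is a hedged sketch of how a play might be matched against them. kSkipGate and the play shape ({ side, isAce, gapPp, oppKRate }) are hypothetical, not the analyzer's real internals. It assumes gapPp is model probability minus Pinnacle probability in percentage points, and that a K-rate of exactly 0.23 belongs to the low-K band (KP-W2) rather than KP-B3.

```javascript
// Hypothetical gate matcher (illustration only, not the app's analyzer code).
// play: { side: 'over' | 'under', isAce: boolean, gapPp: number, oppKRate: number }
// allowBleeders mirrors the effect of the console override described above.
function kSkipGate(play, allowBleeders = false) {
  if (allowBleeders) return null; // gates reverted: nothing is skipped
  const { side, isAce, gapPp, oppKRate } = play;
  // KP-B1: K Over, non-ace, model 10pp+ above Pinnacle
  if (side === 'over' && !isAce && gapPp >= 10) return 'KP-B1';
  // KP-B2: K Over, ace, disagreement in the 10-20pp band
  if (side === 'over' && isAce && gapPp >= 10 && gapPp < 20) return 'KP-B2';
  // KP-B3: K Under, opposing lineup K-rate in the mid band
  // Assumption: exactly 0.23 is treated as low-K, so the band is (0.23, 0.25].
  if (side === 'under' && oppKRate > 0.23 && oppKRate <= 0.25) return 'KP-B3';
  return null; // no gate: tier and units are left alone
}

console.log(kSkipGate({ side: 'over', isAce: true, gapPp: 15, oppKRate: 0.22 })); // 'KP-B2'
```

A non-null return corresponds to the tier being forced to PASS with units=0. The ace case with a 20pp+ gap falls through on purpose, since that bucket is KP-W3.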

Edge → Units Scale

Confidence adjustments: S/A-tier = full scale. B-tier sole anchor = -1u, max 4u. C-tier = max 2u. Age 41+ = 3u max. Debut + neg record = 3u max. Totals pre-Apr 20 = 6u max. D7 = -1u from sizing.

Calibration Log

Bet Type Breakdown

Pattern Performance

All time


Tier | ID | Pattern | Record | Hit % | Model Conf | Status

Rolling Bankroll Trajectory last 30 days · actual P&L from graded plays

Calibration History log of confidence/weight changes over time

🎯 r496 TB Routing — Live Cohort Performance forward verification, read-only

📊 Prop Model Calibration r495 — model probability vs actual hit rate, read-only

⬆⬇ HR Cohort Verdicts r493 — forward-validated W/L per cohort, no behavior change

⬆⬇ K Cohort Verdicts r511 — forward-validated W/L per K cohort

○ Outs Cohort Verdicts r511 — forward-validation for OP1-OP5 signals

○ Board Bet Cohort Verdicts r513 — forward-validation for BB-* game-bet signals

⚙ Bullpen Marcel A/B r246 vs old method — accumulates as graded plays land

🎯 Bullpen Counterfactual clean test — does the bullpen signal help predictions?

The A/B card above compares games where the bullpen label changed vs not — but that is confounded (closer games flip more often) and polluted (F5 bets flip a label the bullpen never touched). This scores the same bets with vs without the bullpen signal against actual outcomes, only on the bets where it moved the prediction. Use this, not the card above, to decide a revert. Console: runBullpenCounterfactual()

🚫 Skipped-Bet Forward-Test are the bets we decline actually -EV?

The analyzer declines bets it judges sub-threshold (PASS) or blocks by rule (ML<-200 vig-trap, RL gates), but those left no record, so we could not check whether skipping was right. Each declined bet is now logged and graded against the result, broken out by reason. Console: skippedBetsReport()

⚙ Game Blend Backtest L15 vs season weight sweep — RPG / RA9 team blend

⚙ 3-Comp Blend (HR · TB) recent × career sweep — validates r254/r255 weights

📊 Probability Calibration predicted prob vs actual WR — game bets only

🔍 Fade Opportunity Scan surfaces inverted cohorts (model WR < breakeven) for fade or floor-gate ships

Generalizes the r275/r276 discovery process. Walks all graded game bets and props, buckets across 7 dimensions (type×side×odds, type×tier, type×line-band, prop direction, bvp source × signal, etc.), and flags any cohort where: (1) sample ≥ minN, (2) actual WR is at least gapThreshold pp below market breakeven, and (3) total P&L is negative.

Cohorts already addressed by existing r-ship filters (r228 TB, r265 Outs, r267 K, r269 HR/TB, r275 Away ML) are auto-flagged so only NEW opportunities surface. Tier-monotonicity check at the end calls out bet types where higher tiers underperform lower ones — a strong signal of edge-calc miscalibration.

Output goes to the developer console. Default thresholds: minN=15, gapThreshold=5.

Open DevTools → Console (F12 / Cmd+Opt+I) to see the scan results. For non-default thresholds: runFadeOpportunityScan({ minN: 20, gapThreshold: 8 })
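
The three flag conditions can be sketched as a single predicate. This is an illustration rather than the scan's real code: isFadeCandidate, breakEvenPct, and the cohort shape ({ n, wins, pnlUnits, avgOdds }) are hypothetical, and deriving break-even from a single average American price is a simplifying assumption.

```javascript
// Break-even win rate (in %) implied by an American price, e.g. -110 -> ~52.4.
function breakEvenPct(americanOdds) {
  const p = americanOdds < 0
    ? -americanOdds / (-americanOdds + 100)
    : 100 / (americanOdds + 100);
  return p * 100;
}

// Hypothetical cohort shape: { n, wins, pnlUnits, avgOdds }.
// Defaults mirror the documented scan defaults (minN=15, gapThreshold=5).
function isFadeCandidate(cohort, { minN = 15, gapThreshold = 5 } = {}) {
  if (cohort.n < minN) return false;                        // (1) sample size
  const actualWr = (cohort.wins / cohort.n) * 100;
  const gapPp = breakEvenPct(cohort.avgOdds) - actualWr;    // (2) pp below break-even
  return gapPp >= gapThreshold && cohort.pnlUnits < 0;      // (3) negative P&L
}

console.log(isFadeCandidate({ n: 50, wins: 23, pnlUnits: -7.89, avgOdds: -110 })); // true
```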

📊 TRACK Ships Verdict Report forward-test verdict for r275 / r279 / r281 / r282

Companion to the Fade Opportunity Scan. Reads oddsResearch (the auto-track sink that captures every analyzer recommendation, including TRACK plays with units=0) and reports pre-ship vs post-ship WR + sized P&L for each TRACK-shipping gate.

For each ship, splits the cohort at the ship date: pre-ship numbers should match the ship's empirical justification, post-ship numbers are the forward-test signal. Verdicts: ✅ CONFIRMED (still bleeding ≥5pp below BE), ⚠️ MIXED (near BE), ❌ REVERTED (at/above BE — was variance), ⏳ INSUFFICIENT (need more data).

Why this matters: the Fade Scan reads days[].plays (manually-added bets), but TRACK plays never get added to the user's card (units=0). Without this tool, the forward-test data the gates generate would be invisible.

Open DevTools → Console (F12 / Cmd+Opt+I) to see verdicts. Custom threshold: runTrackVerdicts({ minPostN: 20 })
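
One way to read the four verdicts is as a classification of the post-ship win rate against break-even. The sketch below is an assumption-heavy illustration: trackVerdict is a hypothetical name, "near BE" is interpreted as anything between the 5pp line and break-even, and the minPostN default shown here is a placeholder, since the tool's real default is not stated.

```javascript
// Hypothetical verdict classifier (not the app's runTrackVerdicts implementation).
// post: { n, wins } for the post-ship window; breakEvenPct is a win rate in %.
// Assumption: minPostN = 20 is a placeholder default, not the tool's documented one.
function trackVerdict(post, breakEvenPct, { minPostN = 20, confirmGapPp = 5 } = {}) {
  if (post.n < minPostN) return 'INSUFFICIENT';              // need more data
  const wr = (post.wins / post.n) * 100;
  if (wr <= breakEvenPct - confirmGapPp) return 'CONFIRMED'; // still bleeding >=5pp below BE
  if (wr >= breakEvenPct) return 'REVERTED';                 // at/above BE: was variance
  return 'MIXED';                                            // assumption: between the two lines
}

console.log(trackVerdict({ n: 40, wins: 16 }, 52.4)); // 'CONFIRMED'
```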

📊 Data Health slate-wide cache state · what's fresh vs cached vs fallback

Walks the current slate, aggregates per-cache freshness for each game (lineup, BvP, umpire, standings, bullpen, hitter roster), and surfaces any cache-write failures from the r300 counter. Use this when something feels off and you want to know what's stale before staking.

Same data as window.dataHealth() in console — renders to this card so it's reachable on mobile too.


📉 Drift Detection N-day rolling window vs cumulative · catches degrading patterns + tiers

Compares pattern WR + tier WR/P&L in a rolling window against cumulative history. Flags significant drops (≥10pp WR or P&L flip positive → negative). Pairs with the r293 rolling-window methodology — together they make degrading signals visible before they cost real money.

Default window is 14 days. window.detectDrift({ windowDays: 30 }) aligns with the Audit-page 30d default.
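
The drift flag described above reduces to two comparisons between the rolling window and cumulative history. The following is a minimal sketch under stated assumptions: isDrifting and the stats shape ({ wins, n, pnlUnits }) are hypothetical and do not reflect detectDrift's actual return format.

```javascript
// Hypothetical drift check (illustration only).
// rolling / cumulative: { wins, n, pnlUnits } for one pattern or tier.
function isDrifting(rolling, cumulative, wrDropPp = 10) {
  if (rolling.n === 0 || cumulative.n === 0) return false; // nothing to compare
  const wr = (s) => (s.wins / s.n) * 100;
  const dropPp = wr(cumulative) - wr(rolling);             // WR drop in percentage points
  const pnlFlipped = cumulative.pnlUnits > 0 && rolling.pnlUnits < 0;
  return dropPp >= wrDropPp || pnlFlipped;
}

const season = { wins: 60, n: 100, pnlUnits: 10 };
console.log(isDrifting({ wins: 4, n: 10, pnlUnits: -3 }, season)); // true
```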

API Configuration

API keys are stored locally on this device only.

The Odds API Key get free key ↗

Used to auto-fill odds on the board analyzer. Free tier: 500 requests/month.

Auto-fill F5 (first-5-innings) odds

Costs ~2 credits per game (30 credits for a 15-game slate). F5 period markets aren't available via the main odds endpoint — each game requires its own request. Off by default.

Fetch batter HR props (r134)

Adds the batter_home_runs market to prop fetches. Costs ~1 credit per game. Tracks all recommended HR predictions for calibration; UI displays top 15 highest-edge per slate. Defaults ON.

Fetch batter Total Bases lines (r158, lines-only)

Adds the batter_total_bases market to prop fetches. Costs ~1 credit per game (~15 credits for a 15-game slate). Lines-only mode — fetches and displays the lines but does not generate recommendations yet (the model is planned for a future build). Defaults ON. Disable to save credits if you don't reference TB lines.

🔬 TB research mode (r309)

Turns on TB recommendations so you can see the model's picks and their W/L. Units are force-zeroed — TB plays appear with their natural PLAY/MARG/PASS tier and direction, get auto-tracked to propsTracking with a _researchMode:true tag, but never enter staked P/L. Use this to evaluate whether the post-r241 NegBin model has recovered enough to lift the r228 production suppression. Defaults OFF (suppression on, current production behavior). Persists across sessions.

🎯 Sharp-EV gate for props (r411 · §6.4)

Only stakes a prop when the price you pay beats Pinnacle's no-vig fair line — the same vig-trap principle used for game ML. Markup-eaten recommendations get their stake zeroed (units 0) but stay recommended + tracked, so their W/L still flows in to validate the gate. Applies to K + Total Bases (true sharp price) and Outs (sharp line); HR is pass-through until it carries a Pinnacle reference. Defaults OFF. Persists across sessions.

Margin demanded vs sharp fair:
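
For readers unfamiliar with no-vig pricing, here is a rough sketch of the comparison the gate describes. The function names are hypothetical, the multiplicative (proportional) de-vig method is an assumption rather than the app's confirmed method, and the margin is assumed to be expressed in percentage points of probability.

```javascript
// Implied probability of an American price (includes the book's vig).
function impliedProb(americanOdds) {
  return americanOdds < 0
    ? -americanOdds / (-americanOdds + 100)
    : 100 / (americanOdds + 100);
}

// No-vig fair probability for one side of a two-way market.
// Assumption: proportional normalization; other de-vig methods exist.
function noVigProb(sideOdds, otherSideOdds) {
  const a = impliedProb(sideOdds);
  const b = impliedProb(otherSideOdds);
  return a / (a + b);
}

// Hypothetical gate: stake only if the price paid implies a lower probability
// than the sharp fair line, by at least marginPp percentage points.
function passesSharpGate(priceOdds, sharpSideOdds, sharpOtherOdds, marginPp = 0) {
  const fair = noVigProb(sharpSideOdds, sharpOtherOdds);
  return impliedProb(priceOdds) + marginPp / 100 < fair;
}

console.log(passesSharpGate(105, -110, -110)); // true: +105 beats a 50% fair line
```

A recommendation that fails this check would keep its tier but have its units set to 0, as described above.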

BVP filter for batter HR props (r137)

Only show batter HR props for players with ≥5 PA career history vs the starting pitcher. Reduces noise from low-sample matchups. BVP line displayed under each row (career H-AB, HR, OPS). Defaults OFF (r188) — the filter was suppressing most matchups under current data, blocking calibration. Re-enable later once we have evidence that filtered ROI is meaningfully better than open-pool ROI.

Diagnostics

Live view of API fetch history, team-abbr consistency, storage usage, and recent errors. If something looks off, hit "Copy JSON" at the bottom and share that snapshot.


Data Integrity

Tools to verify and repair pattern tracking data. Rebuild recalculates all pattern stats from scratch using deduped play data (one count per pattern per game).
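
The "one count per pattern per game" rule can be illustrated with a Set keyed on game and pattern. This is a sketch, not the app's Rebuild code: rebuildPatternStats and the play shape ({ gamePk, firedPatterns, result }) are assumptions, and keeping the first play seen for a given game/pattern pair is an arbitrary choice.

```javascript
// Hypothetical rebuild (illustration only).
// plays: [{ gamePk, firedPatterns: string[], result: 'W' | 'L' | 'P' }]
function rebuildPatternStats(plays) {
  const seen = new Set();  // 'gamePk|patternId' pairs already counted
  const stats = {};        // patternId -> { wins, losses, pushes }
  for (const play of plays) {
    for (const id of play.firedPatterns || []) {
      const key = `${play.gamePk}|${id}`;
      if (seen.has(key)) continue; // dedupe: one count per pattern per game
      seen.add(key);
      if (!stats[id]) stats[id] = { wins: 0, losses: 0, pushes: 0 };
      if (play.result === 'W') stats[id].wins += 1;
      else if (play.result === 'L') stats[id].losses += 1;
      else if (play.result === 'P') stats[id].pushes += 1;
    }
  }
  return stats;
}
```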

Maintenance & Diagnostics

Surfaces helpers that previously needed DevTools console access. Output from any diagnostics run appears at the bottom of this card.

Storage


Calibration reviews


Backfill date

Notes (optional)

Anchors the recalibration alerts window. Plays graded on or before this date are excluded from alert aggregates going forward. Use to bootstrap (first ever review) or to mark a major model-ship boundary retroactively.

Umpire stats DB


Run monthly. New umps auto-populate via the analyzer; this keeps existing rows current. One-time setup: run ump-stats-migration.sql in Supabase SQL editor.

Diagnostics

Each runs the named helper and prints its console output below. All are read-only except fitParlayHaircut, which writes the fit ratio to localStorage.

Output

Backup & Restore

Download your season data as JSON, or restore from an earlier snapshot. Snapshots are stored on this device only (localStorage). The 10 most recent are kept; older ones are deleted automatically.

Add New User

Username

Password

Role

Current Users

Console Tools

Recovery, diagnostics, and calibration utilities exposed on window.*. Open browser DevTools (Ctrl+Shift+I or Cmd+Option+I) and use the Console tab. All tools are dry-run by default where applicable — pass { confirm: true } to commit destructive operations.

Recovery & cleanup

cleanupTodaysResearch(opts) — Removes auto-tracked entries from patternResearch and oddsResearch for target date. Preserves manual + Track entries. Backs up to localStorage before mutating.
window.cleanupTodaysResearch() // dry run, today
window.cleanupTodaysResearch({ confirm: true }) // commit, today
window.cleanupTodaysResearch({ date: '2026-05-06', confirm: true })

restoreTodaysResearchBackup(key) — Restores a research backup created by cleanupTodaysResearch. Call without arg to list available backups.
window.restoreTodaysResearchBackup() // list backups
window.restoreTodaysResearchBackup('ptb_todayResearch_backup_…')

cleanupCorruptF5Lines(opts) / cleanupCorruptF5Research(opts) — Marks plays with implausibly high F5 lines (full-game value typed into F5 field) as result=P. Same backup-first/dry-run pattern. Companion restoreCorruptF5LinesBackup() / restoreCorruptF5ResearchBackup().

Calibration & backtesting (r233)

runBlendBacktest(opts) — Re-scores graded prop plays under multiple Marcel blend ratios. Default sweeps [0/20/40/60/80/100% recent]. Reports leaderboard sorted by hypothetical -110 ROI. Skips entries missing season+recent fields (pre-r229 data unusable). Wait ~30 days for ≥30 post-r233 graded plays for reliable signal.
window.runBlendBacktest() // both K and Outs
window.runBlendBacktest({ modelKey: 'pitcher_k' }) // K only
window.runBlendBacktest({ ratios: [0.5, 0.6, 0.7] }) // custom sweep

BLEND_CONFIG — Per-model blend ratios. Read or override at runtime; persists for the session only.
window.BLEND_CONFIG // inspect all 6 entries
window.BLEND_CONFIG.pitcher_k.recent = 0.7 // tune K to 70/30

marcelBlend(input) — Standalone blend utility used by every model. Input: { seasonValue, recentValue, modelKey, sampleSize, denomSize }. Returns { blended, source, weights, gateReason? }.
window.marcelBlend({ seasonValue: 9, recentValue: 12, modelKey: 'pitcher_k', sampleSize: 3, denomSize: 18 })

Model overrides (r228)

_tbModelSuppressed — TB recommendations are PASS-tiered by default due to inverted calibration data (35.7% WR on agree cohort). Toggle to false to re-enable for research/comparison.
window._tbModelSuppressed = false // re-enable TB
analyzeAll() // re-run analyzer
window._tbModelSuppressed = true // re-suppress

_tbResearchMode — On the Props page, TB section shows a calibration banner. Toggle this flag to expand the underlying TB grid in research view (units forced 0, no recommendations).
window._tbResearchMode = true // show TB grid

Dev Tooling

Off-app test harnesses and static audits that gate every ship. They live in /tmp/ptb/ alongside index.html. Run them from a dev shell; none execute in the browser.

Test baseline as of r235: 5646 / 5646 passing across 9 harnesses + 12 audits. r236 through r244 (nine ships) added no test-harness assertions; all are app-side changes, so they should still measure 5646/5646 when the harness is run against this build. One possible exception: any test asserting K's edge field shape as count-units could fail under r243+, since K's edge is now in prob-units when the NegBin path is active (the default). The new edgeUnits field disambiguates; tests reading edge alone would need updating.

A future infrastructure ship should add fresh phases for CLV capture, Marcel snapshot field shape (prop + game), TB+K distribution params, and runDistributionBacktest / runGameMarcelDiagnostic output structure. window.runDistributionBacktest(), window.runBlendBacktest(), and window.runGameMarcelDiagnostic() are in-browser, console-only diagnostic utilities and are not part of the gated harness.

Top-level runners

./run_all.sh — Runs every harness and reports a unified pass/fail tally. Split into READ_ONLY_HARNESSES (parallel-safe) and MUTATING_HARNESSES (smoke alone, runs after) since r220 to eliminate a race where multiple harnesses read polluted index.html during smoke's inject windows. ~38 seconds wall time. Exits non-zero on any failure.
./run_all.sh # standard run
./run_all.sh --verbose # show each harness output

./preship.sh — Hard gate before packaging. Runs every check that could fail silently at deploy time: syntax, all tests, version sync (APP_BUILD vs CACHE_NAME), structural integrity, critical runtime sanity, bundle size drift. 31 of 31 gates green = safe to package. Failure = DO NOT SHIP.
./preship.sh # full run (~50s)
./preship.sh --skip-tests # syntax + structural only (fast)
./preship.sh --verbose # show output of each step

Test harnesses (9 total, 5646 assertions as of r235)

smoke_test.js — Full-pipeline integrity. Drives analyzer → sendBetToCard → saveData → autoGrade → reconcile end-to-end with mocked MLB schedule responses. 131 phases organized by ship; latest is Phase 131 (r235 dev tooling docs). Catches regressions in canonicalization, gamePk plumbing, DH detection, reconciliation drift, pattern-ID validity. The MUTATING harness — runs alone in batch 2.

qa_test.js — Validates canonical TEAMS module + API URL shapes + alias coverage. Loads app into VM context, exercises TEAMS helpers on every known abbr form, grep-checks source for regressions (inline abbr maps sneaking back in), validates every URL we build against documented shapes. ~2300 assertions.

f5_test.js — F5 (first-five-innings) auto-fill flow. Mocks the per-game odds endpoint, verifies F5 totals/ML/RL parse correctly, validates isPlausibleF5Line bounds.

grader_test.js — Exercises every bet-type branch of the grader (ML, RL, totals, F5, props, parlays) and every backfill path. Catches misclassification bugs like the r113 RL fav/dog detection break.

impact_test.js — Pattern impact + calibration history. Validates computePatternImpact aggregates plays + research records correctly, addCalibrationEntry appends with stats snapshot, autoDetectCalibrationChanges logs drift only when prior entries exist.

integration_test.js — Simulates the full Fetch Games → Fetch Odds chain with mocked API responses. Verifies opening lines populate correctly across the analyzer pipeline.

migration_test.js — Validates F5 line migration + other data-shape migrations against seeded appData + localStorage + _scoreCache. Complements the regex-based QA assertions with actual round-trip behavior.

odds_api_url_test.js — Validates the Odds API URLs embedded in the app against the v4 docs. Prevents 422 regression bugs (invalid bookmakers, wrong endpoint path, non-featured markets on /odds).

persist_test.js — Save/load/reset round-trip with a real DOM-ish mock. Catches the "user changed UI, refreshed, change gone" class of bug that r195/r200 surfaced.

Static audits (12 total, run as part of preship)

audit_collection_schemas.py — Catches the bug class that hit r185: a field is READ from a persistent collection's members but not WRITTEN by every push site. Result: silent undefined at read-time.

audit_css_vars.py — Catches silent-color-fallthrough: stylesheet uses an undefined custom property like --undefined-token when the actual variable was renamed. Browser silently falls through to inherited color (often invisible black-on-black).

audit_fetch_safety.py — Forward-looking gate. Verifies all fetch sites are defended via try/catch, .catch() chain, or Promise.allSettled.

audit_helper_callers.py — Catches the bug class that hit r180: a widely-used helper takes an optional last param, some callers pass it, others don't. The non-passing callers silently get wrong behavior.

audit_inline_handlers.py — Catches silent-button-breakage. An onclick attribute referencing a renamed or deleted function still parses fine in HTML, but clicks silently do nothing — no console error to most users. The audit cross-checks every inline handler attribute against the JS surface.

audit_migrations.py — Validates migrations follow the strict shape: flag-key idempotency, structured return, error handling. Catches the "this migration ran twice and corrupted data" bug class.

audit_multi_writer.py — Detects the bug class that hit F5 line writes 4 times (r73, r84, r88, r95): same logical field gets written from 3+ code paths with subtly different schemas, one gets it wrong.

audit_patterns_firing.py — Static analysis of analyzer fire sites. Five classes: A=mutual exclusion, B=shared-state hierarchy, C=untrackable fires (no pattern ID), D=adjustment-without-firing, E=against-pattern ternaries (r109).

audit_persistence.py — Hunts the bug class that produced r195/r200: mutations to appData.pendingCard.plays inside a function that doesn't call saveData() or _debouncedSaveData(). Silent symptom — change disappears on refresh.

audit_score_cache.py — Hunts the bug class that produced migratePrematureGrades (r93): score-cache lookups without a date in the key leading to cross-day collisions.

audit_settings.py — Catches the bug class that hit r188 (BvP filter): a boolean-like localStorage setting read at multiple sites with INCONSISTENT default handling.

Notes

All harnesses load /tmp/ptb/app.js, which is regenerated by run_all.sh from the inline <script> blocks in index.html. Audits read index.html directly. Test baseline grew from 2949 (r232) → 5646 (r235) — most recent additions cover Marcel infrastructure (Phase 129) + site documentation parity (Phase 130 r234, Phase 131 r235).

Install as App

Add Play The Board to your home screen for a full-screen app experience — no App Store required, completely free.

Betting Unit Settings

Unit Size ($)

Amount per unit (default $50)

Unit Mode

Units = Amount at Risk

6u at -110 → risk $300, win $272.73

Units = To Win Amount

6u at -110 → win $300, risk $330
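
The two unit modes differ only in which side of the bet the unit count fixes. A small sketch of the arithmetic follows; stakeFor and its mode strings ('risk', 'toWin') are hypothetical names, not the app's settings keys.

```javascript
// Hypothetical helper: dollar risk and win for a play under either unit mode.
// mode 'risk'  -> units x unitSize is the amount at risk
// mode 'toWin' -> units x unitSize is the amount to win
function stakeFor(units, unitSize, americanOdds, mode) {
  // Profit per $1 risked at this price.
  const payoutRatio = americanOdds < 0 ? 100 / -americanOdds : americanOdds / 100;
  const base = units * unitSize;
  const round2 = (x) => Math.round(x * 100) / 100;
  return mode === 'risk'
    ? { risk: round2(base), win: round2(base * payoutRatio) }
    : { risk: round2(base / payoutRatio), win: round2(base) };
}

console.log(stakeFor(6, 50, -110, 'risk'));  // { risk: 300, win: 272.73 }
console.log(stakeFor(6, 50, -110, 'toWin')); // { risk: 330, win: 300 }
```

Both outputs reproduce the 6u at -110 examples shown for each mode at the default $50 unit.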

Same-Game Correlation Discount

Auto-discount correlated bets on same game

When the analyzer fires F5 + Full totals same direction (or ML+RL on same team), reduce sizing per industry SGP correlation factors. Prevents over-leveraging a single thesis. Recommended on.

F5+Full totals: 0.7× | ML+RL same team: 0.65×
F5 ML + Full ML same team: 0.6× | F5 ML + RL: 0.7×
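
The factors above amount to a lookup keyed on the pair of bet types. The sketch below is illustrative only: SGP_DISCOUNT, discountedUnits, and the bet-type labels are hypothetical, and it assumes the factor is applied to each correlated leg's units, which the setting text does not specify.

```javascript
// Hypothetical lookup built from the factors listed above.
// 'ML' and 'RL' here mean full-game bets on the same team.
const SGP_DISCOUNT = {
  'F5_TOTAL+FULL_TOTAL': 0.7,
  'ML+RL': 0.65,
  'F5_ML+ML': 0.6,
  'F5_ML+RL': 0.7,
};

// Returns the discounted size for one leg of a correlated pair.
// Assumption: uncorrelated or unknown pairs keep full size (factor 1).
function discountedUnits(units, betTypeA, betTypeB) {
  const factor =
    SGP_DISCOUNT[`${betTypeA}+${betTypeB}`] ??
    SGP_DISCOUNT[`${betTypeB}+${betTypeA}`] ??
    1;
  return units * factor;
}

console.log(discountedUnits(5, 'F5_ML', 'ML')); // 3
```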

P&L at a different unit size

See what your season P&L would be at a different unit size. Everything is tracked in units, so this just recomputes the dollar total — it does not change your saved unit.

Try a unit size ($)

Change Password

Current Password

New Password

Confirm

Weight Calibration

Adjust key model weights here — changes save to browser localStorage and apply immediately on next analysis. Useful once you have 20+ plays of data. Default values are from Sports Insights research and MLB historical data.

ADMIN ONLY All 157 MODEL_WEIGHTS — calibration vs research benchmarks
