Prediction-market activity alarm

Start here — the mental model

One question: should a human investigate this market now?

Answer: treat it like a smoke alarm, not an oracle.

Recommended default


What the score means

Notice the alarm

Any one of these means “stop and inspect”: a score ≥40, a strict Level‑2 signal, or several sibling markets moving together.

Look for public explanation

Check news, official statements, and whether the market is moving only because the deadline is near.

Group related markets

Ten Iran rows are one Iran alarm family, not ten independent proofs.

Escalate only if unexplained

If high score + no obvious public reason + multiple related markets: send a heads-up for human review.

Shortest version: “Market probability jumped + real money/trades arrived + buyers were mostly on Yes” is interesting, but only after the red-team checks: false positives at looser thresholds, correlated sibling markets, near-certainty settlement effects, and public-news explanations.
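The four steps above can be sketched as one small triage function. This is a minimal illustration, not the dashboard's code: the `Alert` fields, the `triage` name, and the reading of "several siblings" as 3 or more are assumptions made for the example. Only the score level of 40 comes from the text.

```python
from dataclasses import dataclass

SCORE_ALARM = 40  # "stop and inspect" score level named in the text


@dataclass
class Alert:
    score: float              # 0-100 alarm score for the loudest row in the family
    level2: bool              # strict Level-2 rule fired
    moving_siblings: int      # related markets moving at the same time
    public_explanation: bool  # news, official statement, or deadline effect found


def triage(alert: Alert, sibling_min: int = 3) -> str:
    """Return 'ignore', 'inspect', or 'escalate' for one alarm family.

    sibling_min is a hypothetical reading of "several sibling markets".
    """
    alarm = (
        alert.score >= SCORE_ALARM
        or alert.level2
        or alert.moving_siblings >= sibling_min
    )
    if not alarm:
        return "ignore"
    if alert.public_explanation:
        # Explained moves are still worth a look, but not a heads-up.
        return "inspect"
    if alert.score >= SCORE_ALARM and alert.moving_siblings >= sibling_min:
        # High score + no public reason + multiple related markets.
        return "escalate"
    return "inspect"
```

Note that the function takes one family, not one market row: ten sibling rows should reach it as a single `Alert` with `moving_siblings=10`.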

Start with current markets

Active alerts right now

Filter


What the current analysis says

Summary metrics

Proposed signal levels

Level 1 / monitor: unusual volume plus price lift. Good for “something is moving; check news/source quality now.”
Level 2 / informed-flow: Level 1 plus positive Yes-side pressure and not-yet-certain price. In the current sample this fired rarely but cleanly.
Context filter: group sibling markets, down-rank scheduled deadline/settlement arbitrage, and snapshot public news at alert time.
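A rough sketch of how Level 1 and Level 2 nest. The page does not state the cut-offs, so every default below (`min_volume_ratio`, `min_lift`, `max_price`) is a placeholder chosen for illustration, and the function name is hypothetical.

```python
def signal_level(
    volume_ratio: float,      # recent volume divided by its usual baseline
    price_lift: float,        # change in Yes price over the window, 0-1 scale
    net_yes_pressure: float,  # signed Yes-side flow, positive = buyers on Yes
    yes_price: float,         # current Yes price, 0-1 scale
    min_volume_ratio: float = 3.0,  # placeholder threshold
    min_lift: float = 0.05,         # placeholder threshold
    max_price: float = 0.90,        # placeholder "not yet certain" ceiling
) -> int:
    """Return 0 (no signal), 1 (monitor), or 2 (informed-flow)."""
    level1 = volume_ratio >= min_volume_ratio and price_lift >= min_lift
    if not level1:
        return 0
    # Level 2 adds two conditions on top of Level 1.
    level2 = net_yes_pressure > 0 and yes_price < max_price
    return 2 if level2 else 1
```

The context filter is deliberately left out: grouping siblings and checking deadlines or news happens after a level fires, not inside the rule.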

Honest interpretation of “9/10”

The backtest produced — for the strict rule and — for a softer rule. That is stronger than 9/10 in this dataset, but it is not yet an out-of-sample reliability claim because many rows are sibling markets around the same events.

Reality check / red-team view

Where the signal is fragile or misleading

This is the anti-fairy-tale section: red rows, missed rows, clusters, and uncertainty.

Looser thresholds create many false positives

Stacked bars: green = true event, red = false alarm. This is why “watchlist” is not “prediction”.

Default alerts are clustered, not independent

Rows selected by the current score threshold, grouped by coarse topic family.

Plain-language caveat


How to avoid fooling ourselves

Look at false positives just below the default threshold, not only the clean high-score rows.
Count topic families before counting market rows; Iran sibling markets are highly correlated.
Treat near-certainty/deadline markets as possible settlement mechanics, not early warning.
Require an external-news snapshot when a live alert is escalated.
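To make "count topic families before counting market rows" concrete, here is a small sketch. The `(family, market_id)` row shape and the `family_summary` name are assumptions for the example; the dashboard's own grouping may be coarser or finer.

```python
from collections import Counter


def family_summary(rows):
    """Summarise how clustered a set of alert rows is.

    rows: list of (family, market_id) tuples, e.g. ("iran", "m1").
    """
    counts = Counter(family for family, _ in rows)
    largest = max(counts.values()) if counts else 0
    return {
        "rows": len(rows),
        "families": len(counts),
        "largest_family_share": largest / len(rows) if rows else 0.0,
    }
```

With ten Iran rows and two Venezuela rows, the summary reports 12 rows but only 2 families, with one family holding about 83% of the rows. The second number is the one to quote.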

Examples the simple story hides

Metric explainer

What exactly is the alarm score measuring?

It is a weighted anomaly score, not a calibrated probability.

Score formula used in the dashboard

score = price level + price lift + relative lift + flow surge + flow size + net Yes pressure

The market percentage itself is p(t): Yes price ≈ market-implied probability. The “derivative” part is Δp over a fixed window, mostly p(−6h) − p(−24h). Because that window is 18 hours, its slope proxy is Δp / 18h. The score also uses cash-flow features, because price-only jumps can be stale or deadline mechanics.
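The slope proxy is simple enough to write out. A minimal sketch, assuming prices are given on a 0–1 scale and the result is wanted in percentage points per hour; the function name is made up for the example.

```python
def slope_proxy(p_minus_24h: float, p_minus_6h: float) -> float:
    """Percentage points per hour between the -24h and -6h snapshots."""
    window_hours = 24 - 6                            # the fixed 18-hour window
    delta_pp = (p_minus_6h - p_minus_24h) * 100      # Δp in percentage points
    return delta_pp / window_hours
```

A move from 10% to 35% over that window is 25 points in 18 hours, or about 1.39 points per hour. The same 25 points spread over a week would be a very different signal, which is why the window is fixed.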

Score components on real rows

Stacked bars show which inputs created the 0–100 score. Red rows are false alarms / noisy rows.
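One way to read the stacked bars: each input contributes a capped number of points, and the caps add up to 100. The sketch below assumes exactly that structure. The cap values are invented for illustration and are not the dashboard's weights; only the six component names come from the formula above.

```python
# Hypothetical per-component caps; they sum to 100 so the score stays in 0-100.
CAPS = {
    "price_level": 10,
    "price_lift": 25,
    "relative_lift": 15,
    "flow_surge": 20,
    "flow_size": 15,
    "net_yes_pressure": 15,
}


def alarm_score(components):
    """Clamp each component to [0, cap] and return (total, per-component parts).

    components: dict of component name -> points before capping.
    Missing components count as zero.
    """
    parts = {
        name: min(max(components.get(name, 0.0), 0.0), cap)
        for name, cap in CAPS.items()
    }
    return sum(parts.values()), parts
```

Returning the parts alongside the total is the point: a score of 45 built from lift, surge, and Yes pressure reads differently from a 45 built mostly from price level.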

Used vs not used

Important interpretation

Score 40 does not mean 40% probability. It means enough unusual things happened together to justify human review. A market can be 60% because everyone already knows the news; that is different from a 10%→35% move with new Yes-flow and volume surge.

Probability over time

Market percent, derivative proxy, and early-warning horizon

Resolved rows: x-axis is hours before market close/resolution, which is not necessarily the true event time.

Probability paths before resolution

Faint lines = all resolved rows with data; thick lines = illustrative high-score / false-positive examples.

Derivative proxy: percentage-points per hour

Each dot is Δp from −24h to −6h divided by 18 hours. Larger bubbles have more 6h flow.

If I want warning hours/days before, how noisy is raw probability?

Drag the lead-time slider. Raw probability here means only the market Yes price at that snapshot; it does not use volume, account history, time-decay, or a model formula. Bars and KPIs show TP/FP/FN trade-offs for that simple price threshold.

Desired warning lead: 2 days before

Discrete raw snapshots available: 1h, 2h, 3h, 6h, 12h, 1d, 2d, 3d before close/resolution.
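The slider logic can be sketched in two steps: snap the desired lead to an available snapshot, then count TP/FP/FN for a plain price threshold at that snapshot. The function names, the row shape, and the choice to round up to the next longer snapshot are assumptions for the example.

```python
# Snapshot offsets listed above, in hours before close/resolution.
SNAPSHOT_HOURS = [1, 2, 3, 6, 12, 24, 48, 72]


def nearest_snapshot(desired_hours: float):
    """Shortest available snapshot that still gives the desired lead, or None."""
    candidates = [h for h in SNAPSHOT_HOURS if h >= desired_hours]
    return min(candidates) if candidates else None


def confusion(rows, threshold: float):
    """Count outcomes for a raw price threshold.

    rows: list of (yes_price_at_snapshot, resolved_yes) pairs.
    """
    tp = sum(1 for p, yes in rows if p >= threshold and yes)
    fp = sum(1 for p, yes in rows if p >= threshold and not yes)
    fn = sum(1 for p, yes in rows if p < threshold and yes)
    return {"tp": tp, "fp": fp, "fn": fn}
```

Nothing here uses volume, flow, or time decay, which matches the "raw probability" definition above: it is the baseline the alarm score has to beat.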

Answer for that warning lead


If I only need ~2 hours of heads-up: false positives and misses


Concrete error examples: false positives that resolved No, and misses that resolved Yes

Same event, different deadlines

Term structure: by-date markets are not interchangeable

Shows current active sibling markets; full pair scanner is linked separately.

Why this matters: “Event by May 24” and “event by June 15” can both be about the same story but carry different cumulative probabilities. Convert them to comparable per-day hazard to see whether the market thinks risk is front-loaded or just spread over more time.
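One common way to do the conversion is to assume a constant hazard up to the deadline, so that cumulative probability P over T days satisfies P = 1 − exp(−hT). The sketch below uses that assumption; the dashboard may use a different convention, and the function name is made up.

```python
import math


def per_day_hazard(cum_prob: float, days_to_deadline: float) -> float:
    """Constant per-day hazard implied by a cumulative event probability.

    Solves cum_prob = 1 - exp(-h * days) for h.
    """
    if not 0 <= cum_prob < 1:
        raise ValueError("cum_prob must be in [0, 1)")
    if days_to_deadline <= 0:
        raise ValueError("days_to_deadline must be positive")
    return -math.log1p(-cum_prob) / days_to_deadline
```

As an illustration with made-up numbers: 20% by a deadline 10 days out implies roughly 2.2% per day, while 30% by a deadline 32 days out implies roughly 1.1% per day. The later market has the higher headline number but the lower daily risk, so the pair reads as front-loaded.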


Cumulative event probability by deadline

For survival-style markets, this chart uses event risk = 1 − Yes price.

Per-day hazard and theta-adjusted move

Hazard normalizes different deadlines; theta-adjusted move asks whether probability rose beyond mechanical time decay.
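A sketch of what "beyond mechanical time decay" can mean for an "event by date" market. Under a constant-hazard assumption with no news, the probability should shrink as days run out; the theta-adjusted move is the observed price minus that expected price. Both the assumption and the function name are illustrative, not the dashboard's definition.

```python
def theta_adjusted_move(
    p_before: float,
    p_after: float,
    days_left_before: float,
    days_left_after: float,
) -> float:
    """Observed Yes price minus the price expected from time decay alone.

    With constant hazard, survival over the remaining window scales as
    (1 - p_before) ** (days_left_after / days_left_before).
    """
    if days_left_before <= 0 or days_left_after < 0:
        raise ValueError("days must be non-negative and the starting window positive")
    expected = 1 - (1 - p_before) ** (days_left_after / days_left_before)
    return p_after - expected
```

Example: a market at 36% with 20 days left would be expected to sit near 20% with 10 days left if nothing changed. If it is still at 36%, the raw move is zero but the theta-adjusted move is about +16 points.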

Rows behind the selected term curve

Alarm loudness

Choose how much activity is enough to alert

Score is descriptive/in-sample: price + recent flow + surge + Yes pressure.

If I set the alarm here, what happened historically?

Green = hit rate; pale band = uncertainty; blue bars = number of alerts.
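For a sense of how wide the uncertainty band should be at small sample sizes, a Wilson score interval is one standard choice. This is a sketch of that method, not necessarily the one behind the chart, and it treats alerts as independent, which sibling markets are not.

```python
import math


def wilson_interval(hits: int, alerts: int, z: float = 1.96):
    """Wilson score interval for a hit rate; z=1.96 gives roughly 95% coverage."""
    if alerts == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p = hits / alerts
    denom = 1 + z * z / alerts
    centre = (p + z * z / (2 * alerts)) / denom
    half = z * math.sqrt(p * (1 - p) / alerts + z * z / (4 * alerts * alerts)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))
```

Nine hits out of ten alerts gives an interval of roughly 0.60 to 0.98. If those ten rows are really two or three families, the honest interval is wider still.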

Which past markets would have set it off?

Green ended Yes. Red is a false alarm. Orange line is your current threshold.

Confidence levels in this dataset

Why the sliders look “too good”


Rows selected by the current score

Advanced / optional

Exact rule knobs behind the alarm

Recomputes on — resolved markets with features.

Skip this section on first read. It is for stress-testing the alarm. If a slider combination looks perfect but only selects a few related rows, it is not a reliable law of nature.

Resolved-market scatter

Orange points are rows selected by the current line.

Rule diagnostics: sample size and clustering


Sensitivity around the current slider settings

Markets selected by the current rule

User-requested case

Ukraine invasion warning: how much heads up?

Event date: 2022-02-24

Forecast timeline before invasion

Community aggregate points vs high-intensity forecaster points. This is a reconstruction from public sources, not an authenticated raw API dump.

Practical alert thresholds

Interpretation: an aggregate ≥50% was a “weeks ahead, take contingency planning seriously” signal; ≥60% on Feb 13 was a “leave Kyiv / prepare now” signal; expert/intensive monitoring reached 90%+ roughly 12 days out.

Ukraine evidence points

Pre-public timing check

Did markets move before public timestamps?

Positive minutes = before public timestamp; negative = after.

First crossing ≥80% relative to public source time

Bars are clipped to keep the chart readable; full values are in the table.
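The timing measure can be sketched as: find the first price point at or above 80%, then subtract it from the public timestamp. The series shape and function name are assumptions for the example; the sign convention follows the text (positive = market moved first).

```python
from datetime import datetime


def lead_minutes(price_series, public_time: datetime, threshold: float = 0.80):
    """Minutes between the first threshold crossing and the public timestamp.

    price_series: iterable of (timestamp, yes_price) pairs.
    Returns None if the price never reaches the threshold.
    """
    for ts, price in sorted(price_series):
        if price >= threshold:
            return (public_time - ts).total_seconds() / 60
    return None
```

A positive result only shows the market crossed before the chosen timestamp. It says nothing about whether an earlier public source existed, so the choice of `public_time` needs the same scrutiny as the price data.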

Timing rows

Examples and controls

Case studies

Auditability

Raw data and caveats

Local files used

Critical caveats

Backtest windows are often anchored to Polymarket close/resolution; only selected rows have independent public-news anchors.
Markets are not independent. Sibling markets around Iran/Venezuela can inflate confidence and p-values.
Old Ukraine 2022 Polymarket CLOB data was not available through current endpoints; Ukraine section uses sourced public forecasting points.
This is a monitoring/triage dashboard, not proof of insider trading and not trading advice.