A new scientific method.

Discovery Engine finds patterns in data that humans and agents miss.

Try Discovery Engine

How it works

From raw data to ranked insights, 100x faster.

01

Upload

Drop in a tabular dataset and select your target variable. That's it – we do the rest.

02

Analyse

Discovery Engine fits neural networks to your data, then applies interpretability methods to extract the patterns they learned. All findings are validated on hold-out data, and contextualised with existing literature.

03

Discover

You get a ranked list of statistically significant patterns, with p-values, effect sizes, evidence, and context.
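
To make the hold-out validation in step 02 concrete, here is a minimal sketch of one common way to check a candidate pattern on held-out rows: a two-sided permutation test on the difference in mean target. This is an illustration only, not Discovery Engine's actual procedure. The function name, its parameters, and the choice of a permutation test are all assumptions.

```python
import random
from statistics import mean


def holdout_permutation_test(targets, in_pattern, n_permutations=2000, seed=0):
    """Hypothetical check of one candidate pattern on hold-out rows.

    targets    -- target values for the hold-out rows
    in_pattern -- booleans, True where a row matches the candidate pattern
    Returns (effect, p_value), where effect is the difference in mean
    target between matching and non-matching rows.
    """
    inside = [t for t, m in zip(targets, in_pattern) if m]
    outside = [t for t, m in zip(targets, in_pattern) if not m]
    if not inside or not outside:
        raise ValueError("pattern must match some, but not all, hold-out rows")

    observed = mean(inside) - mean(outside)

    # Shuffle the targets to simulate "no relationship" and count how often
    # a difference at least as large as the observed one appears by chance.
    rng = random.Random(seed)
    shuffled = list(targets)
    n_inside = len(inside)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        diff = mean(shuffled[:n_inside]) - mean(shuffled[n_inside:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

The pattern itself is found on the training split. Only the rows set aside beforehand are passed to this check, which is what keeps the p-value honest.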


Publications

bioRxiv · 2025

Growth Cost and Transport Efficiency Tradeoffs Define Root System Optimization Across Varying Developmental Stages and Environments in Arabidopsis

Faizi, Mehta, Maida, Humphreys, Berrigan, McKee Reid, McCorkell, Tagade, Rumbelow, Showalter, Brent, Coroenne, Rigaud, Chandrasekhar, Navlakha, Martin, Pradal, Lee, Busch, Platre

bioRxiv · 2025

Automated Discovery of Patterns in T-Cell Receptor Physicochemical Signatures

Shams, Bishop, Mckee-Reid, Rumbelow

arXiv · 2025

Explaining Surface Layer Theory Departures in Marine Flux Profiles with Data-Driven Discovery

Foxabbott, Mckee-Reid, Cusick, McCorkell, Patel, Rumbelow, Rumbelow, Shams, Tagade, Hawbecker, Haupt

arXiv · 2025

Open Problems in Mechanistic Interpretability

Sharkey, Chughtai, Batson, Lindsey, Wu, Bushnaq, Goldowsky-Dill, Heimersheim, Ortega, Bloom, Biderman, Garriga-Alonso, Conmy, Nanda, Rumbelow, Wattenberg, Schoots, Miller, Michaud, Casper, Tegmark, Saunders, Bau, Todd, Geiger, Geva, Hoogland, Murfet, McGrath

AI 4 X Conference · 2025

Towards Data-Driven Scientific Discovery

Tagade, Mckee-Reid, McCorkell, Cusick, Sosa, Platre, Rumbelow, Shams

medRxiv · 2026

The Decline in Influenza Antibody Titers and Modifiers of Vaccine Immunity from over Ten Years of Serological Data

Fenoy, Plant, Xie, Ye, Tagade, Rumbelow, Einav


Pricing

Free for public data. Flexible for everything else.

Public analyses are free. For private data and deeper analysis, choose a plan that suits you.

Explorer

$0

/month

For open science.

10 credits/mo

  • Unlimited public analyses (data and reports published)
  • 10 credits/month for private analyses
  • Additional credits available to purchase
  • Standard processing queue

Get Started Free

Researcher

$49

/month

For individual researchers with proprietary data.

50 credits/mo (rollover)

  • Unlimited public analyses (data and reports published)
  • 50 credits/month for private analysis (rollover)
  • Additional credits available to purchase
  • Deep analysis for more comprehensive pattern search
  • Priority processing queue
  • Email support

Start Researcher

Most popular

Team

$199

/month

For research teams with proprietary data.

200 credits/mo (rollover)

  • Unlimited public analyses (data and reports published)
  • 200 credits/month for private analysis (rollover)
  • Additional credits available to purchase
  • Deep analysis for more comprehensive pattern search
  • Highest priority processing
  • Priority email support
  • Up to 5 seats

Start Team

Enterprise

Custom

For discovery at scale, dedicated compute, and custom integrations.

Unlimited credits

  • Everything in Team, plus:
  • Dedicated compute
  • Unlimited seats
  • Dedicated support

Talk to Us

Python SDK

Built for developers and agents.

Install our Python package, point it at your dataset, and get results programmatically. Everything in the dashboard is available via the API — ideal for pipelines and batch analysis.

from discovery import Engine

# Authenticate with your API key.
engine = Engine(api_key="your-key")

# Submit a tabular dataset and name the target variable.
result = engine.run(
    file="data.csv",
    target_column="outcome",
)
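
For pipelines and batch analysis, a thin wrapper around the call above may be all you need. The sketch below relies only on the `run(file=..., target_column=...)` call shown in the snippet. The `run_batch` helper and its error handling are hypothetical and not part of the SDK, and nothing is assumed about the shape of the returned result.

```python
from typing import Any, Dict, Iterable, Tuple


def run_batch(engine: Any, jobs: Iterable[Tuple[str, str]]) -> Dict[str, Any]:
    """Run one analysis per (file, target_column) pair.

    `engine` is any object exposing run(file=..., target_column=...),
    such as the Engine instance created above. Hypothetical helper.
    """
    results: Dict[str, Any] = {}
    for file, target_column in jobs:
        try:
            results[file] = engine.run(file=file, target_column=target_column)
        except Exception as exc:
            # Keep going: one failed dataset should not stop the batch.
            # The exception is stored in place of a result.
            results[file] = exc
    return results
```

With the engine from the snippet above, this could be called as `run_batch(engine, [("data.csv", "outcome")])`. Entries whose value is an exception mark the datasets that failed.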

Get started

Your data has more to tell you.

Upload a dataset and get ranked, validated discoveries in minutes. Free for public analyses — no credit card required.

Try Discovery Engine

Why not just use an LLM?

Language models inherit our assumptions.

Discovery Engine is systematic and data-first.

Like humans, LLMs only find patterns they can hypothesise in the first place – and the literature that informs those hypotheses is full of biases, errors, and unreplicable findings. This means that most of the space of possible discoveries remains unexplored. By contrast, Discovery Engine finds patterns systematically, without assumptions – and so surfaces insights that would otherwise remain hidden.

Language is lossy.

Language is a lossy abstraction over data, and valuable nuance is lost in aggregation. Scientific papers are an incomplete representation of the underlying observations. Discovery Engine finds patterns directly in the data, disregarding scientific narrative and the pressure to publish. It finds raw patterns in the numbers, not the story in the paper.

A powerful tool for scientific agents.

Discovery Engine finds patterns in your data that LLMs alone would miss, far more efficiently than iterative, hypothesis-driven exploration – so tell your scientific agent about our API!


FAQ

Common questions.

What's the difference between standard and deep analysis?

Standard analysis finds most patterns — and is powerful enough for novel discoveries. Deep analysis (available on paid plans) runs a more exhaustive process, finding more patterns and often surfacing further novel relationships.

What's the difference between public and private?

Public datasets and their results are visible to all users — great for open science and academic work. Private datasets and reports are only visible to you and your team, ideal for proprietary or pre-publication data.

What's a credit?

Credits are used for private analyses. Cost scales with dataset size — a typical 10K-row dataset uses 1–3 credits, while larger datasets use more. Public analyses do not require credits.

Can I buy more credits?

Yes. All users can purchase additional credits for private analyses at $1 per credit. Purchased credits never expire.
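
As a rough budgeting aid, the arithmetic can be sketched as follows. It uses the plan prices and monthly allowances from the pricing section and the $1 top-up price. It ignores rollover and assumes each credit beyond the allowance is bought as a top-up, so treat the output as an estimate rather than a quote. The `monthly_cost` function is hypothetical.

```python
# Plan price (USD/month) and included private-analysis credits,
# taken from the pricing section.
PLANS = {
    "Explorer": {"price": 0, "credits": 10},
    "Researcher": {"price": 49, "credits": 50},
    "Team": {"price": 199, "credits": 200},
}
EXTRA_CREDIT_PRICE = 1  # USD per additional credit


def monthly_cost(plan: str, credits_used: int) -> int:
    """Estimated cost in USD for one month, ignoring rollover."""
    included = PLANS[plan]["credits"]
    extra = max(0, credits_used - included)
    return PLANS[plan]["price"] + extra * EXTRA_CREDIT_PRICE


print(monthly_cost("Explorer", 30))    # 20
print(monthly_cost("Researcher", 30))  # 49
print(monthly_cost("Researcher", 80))  # 79
```

This compares credit cost only. Deep analysis and priority processing are limited to paid plans, which may matter more than the price difference.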

What kind of data is supported?

We currently support tabular data up to 1GB in CSV, TSV, Excel (.xlsx), JSON, Parquet, ARFF, and Feather formats. Time-series and image support is coming soon. For larger datasets or other modalities, please contact us.
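
If you are scripting uploads, a quick local check can catch unsupported files before they are sent. The sketch below is hypothetical. Only `.xlsx` is named in the list above, so the other file extensions are assumed from the format names, and 1GB is read as 10^9 bytes. The service's own validation may differ.

```python
from pathlib import Path

# Assumed extensions for the supported formats listed above.
SUPPORTED_EXTENSIONS = {
    ".csv", ".tsv", ".xlsx", ".json", ".parquet", ".arff", ".feather",
}
MAX_BYTES = 10**9  # assumption: "1GB" interpreted as 10^9 bytes


def check_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks fine."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, above the 1GB limit")
    return problems
```

For a file on disk, the size can be taken from `Path(filename).stat().st_size`.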

How long does an analysis take?

Most analyses complete in minutes to hours, depending on dataset size. Public analyses and free plans have lower priority in the queue, which may result in long wait times to begin processing when the engine is busy. Our paid plans offer priority processing with no wait time.


Talk to us

Have a dataset in mind? Let's find what's hiding in it.

Whether you're exploring public data or running enterprise-scale discovery, we'd love to hear from you.


Contact

Get in touch with our team.