When Less Is More: The Benefits of Training Exclusively on Your Own Data

trustworthy AI
data provenance
AI security
dataset poisoning
Author

Mike McCourt

Published

January 14, 2026

Anthropic’s recent results remind us that the quality and traceability of data matter more than quantity.

In early October 2025, Anthropic released a study with the UK AI Security Institute and the Alan Turing Institute showing that a few hundred poisoned samples can reliably implant a backdoor in large language models (LLMs), even when those models are trained on trillions of tokens. The paper’s unassuming title, “A small number of samples can poison LLMs of any size,” barely hints at the depth of the implications.

Because dataset poisoning works through tainted training examples, most observers assumed its effects would dilute away as training sets grew. Intuitively, it seems impossible to meaningfully corrupt a dataset of trillions of tokens with a tiny number of examples.

But Anthropic’s results overturn this intuition: the researchers found that model size does not protect against poisoning. Whether a model had 600M parameters or 13B, inserting roughly 250 poisoned documents was enough to implant a persistent backdoor. The absolute number of poisoned samples mattered more than their fraction of the training set: in one case, tainting just 0.00016% of the data was enough to cause a behavior change in the model.
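To get a feel for how small that fraction is, here is a back-of-the-envelope calculation; the per-document length is an illustrative assumption, not a figure from the paper:

```latex
% Assume roughly 10^3 tokens per poisoned document (illustrative only).
\text{poisoned tokens} \approx 250 \times 10^{3} = 2.5\times 10^{5},
\qquad
\text{implied corpus size} \approx \frac{2.5\times 10^{5}}{1.6\times 10^{-6}} \approx 1.6\times 10^{11}\ \text{tokens}
```

Here 1.6e-6 is simply 0.00016% expressed as a proportion: a few hundred short documents hiding in a corpus of well over a hundred billion tokens.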

Though it’s a startling finding, in hindsight it seems almost obvious: models trained under “Chinchilla-optimal” scaling use extra data to extract specificity, not redundancy. In this paradigm, more data doesn’t imply more robustness. And scale does not buy safety.
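For reference, the Chinchilla heuristic pairs roughly twenty training tokens with each model parameter, and each token is typically seen only about once during training:

```latex
D_{\text{opt}} \approx 20\,N
\qquad\Longrightarrow\qquad
N = 13\times 10^{9}\ \text{parameters}
\;\Rightarrow\;
D_{\text{opt}} \approx 2.6\times 10^{11}\ \text{tokens}
```

A bigger model therefore brings proportionally more data, but the roughly 250 poisoned documents an attacker needs do not grow with it.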

Anthropic was careful not to claim that existing foundation models are compromised. Moreover, their backdoor was intentionally narrow and experimental. Yet the work raises a critical question:

How well do we actually know the data our AI models learn from?

When training sets are measured in terabytes, with raw data pulled from the open internet, we have no meaningful way to audit them. Anthropic’s research should give us pause about depending on such models.

Two Philosophies of Learning

The current generation of AI largely depends on scale: harvest the entire internet, build a massive model, and distill general linguistic ability from sheer exposure. (Or, perhaps, memorize the dataset.)

At Sturdy Statistics, we’ve taken a different approach. We train models exclusively on a client’s own data (such as emails, transcripts, reviews, or other documents), and we never use external corpora or pretraining. Instead of learning language by imitation, our models learn structure through mathematically defined priors derived from linguistic theory.

These priors encode formal knowledge about syntax, semantics, and statistical relationships. They come from human scholarship, not internet text, and they give our models a principled, interpretable inductive bias that allows generalization from small datasets. Where LLMs pursue breadth, we pursue fidelity.

Because our systems are Bayesian, every parameter has a meaning and every inference can be traced back to specific source data. If the model misclassifies something, we can identify why: we know which prior contributed, which words updated the posterior, and how the model’s beliefs shifted in response to the data.
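As a minimal sketch of what that kind of traceability looks like, here is a generic conjugate Dirichlet-multinomial update. It illustrates the general principle, not Sturdy Statistics’ actual model; the vocabulary and prior weights are invented for the example.

```python
import numpy as np

# Toy vocabulary and a Dirichlet prior over word probabilities for one topic.
# The prior weights stand in for formal knowledge supplied before seeing data.
vocab = ["refund", "delay", "thanks", "broken", "love"]
prior_alpha = np.array([1.0, 1.0, 5.0, 1.0, 5.0])   # prior belief: mostly positive words

# Observed word counts from a client's documents.
counts = np.array([7, 3, 1, 6, 0])

# Conjugacy: the posterior is Dirichlet(prior_alpha + counts), so every
# parameter update is attributable to specific observed words.
posterior_alpha = prior_alpha + counts
prior_mean = prior_alpha / prior_alpha.sum()
posterior_mean = posterior_alpha / posterior_alpha.sum()

# Trace which words shifted the model's beliefs, and by how much.
for word, p0, p1, c in zip(vocab, prior_mean, posterior_mean, counts):
    print(f"{word:>7}: prior {p0:.2f} -> posterior {p1:.2f}  (driven by {c} observed uses)")
```

Every shift in the posterior is attributable to named words in named documents, which is what makes a misclassification diagnosable rather than mysterious.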

This transparency extends to the data itself. When a client trains a Sturdy Statistics model, the full training set is known, finite, and auditable. There is no hidden pretraining corpus, so there are no surprise behaviors inherited from internet data.
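A hypothetical sketch of what “known, finite, and auditable” can mean operationally: fingerprint every training document up front, then verify the manifest before each retraining run. The paths and file layout here are made up for illustration.

```python
import hashlib
import json
from pathlib import Path

def build_manifest(data_dir: str) -> dict[str, str]:
    """Map every training file to its SHA-256 digest."""
    manifest = {}
    for path in sorted(Path(data_dir).rglob("*.txt")):
        manifest[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest

def verify(data_dir: str, manifest_path: str) -> list[str]:
    """Return files added, removed, or modified since the manifest was written."""
    recorded = json.loads(Path(manifest_path).read_text())
    current = build_manifest(data_dir)
    changed = [p for p in recorded if current.get(p) != recorded[p]]
    added = [p for p in current if p not in recorded]
    return changed + added

if __name__ == "__main__":
    # Freeze the corpus before training; re-check it before every retrain.
    Path("train_manifest.json").write_text(json.dumps(build_manifest("client_docs"), indent=2))
    print(verify("client_docs", "train_manifest.json"))  # [] if nothing was tampered with
```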

Anthropic’s findings highlight why this matters. If you know every line of data your model sees, dataset poisoning becomes practically impossible. And when parameters are interpretable, even small behavioral shifts can be diagnosed and corrected rather than guessed at.

The Virtue of Small, Known Data

Anthropic’s paper shows how easily large systems can be nudged by small adversarial perturbations. The same logic applies in reverse: small, well-characterized datasets can produce remarkably stable models.

Our systems can infer meaningful structure from only a few hundred words (a few thousand is better), because the inductive bias comes from the priors rather than from massive amounts of raw text. Our AI doesn’t need to learn “how language works” from scratch; it begins with a mathematically grounded model of language and updates that model using only the client’s data.
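A standard conjugate example makes the point concrete (a generic Dirichlet-multinomial estimate, not our exact model): the posterior mean for a word’s probability blends the prior with the observed counts,

```latex
\mathbb{E}\left[\theta_w \mid \text{data}\right]
  \;=\; \frac{\alpha_w + n_w}{\sum_{v} \alpha_v + N}
```

where the alpha terms encode the prior, n_w is the count of word w, and N is the total number of observed words. With only a few hundred words, the prior keeps estimates stable; as more client data arrive, the counts take over.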

A Complementary Vision of AI

Anthropic’s results show that even the largest AI systems can be swayed by tiny, unseen influences in their training data. If you don’t control your corpus, you don’t control your model. Unfortunately, no amount of scale can compensate for that.

That’s why our approach is different. By training solely on your own data, with interpretable Bayesian models, we give you:

  • control over your data pipeline
  • transparency into every parameter
  • protection from poisoning and drift
  • models whose behavior you can explain to auditors, regulators, and customers

Trustworthy AI begins with knowing your data. Sturdy Statistics makes that possible.