At Sturdy Statistics, our models are probabilistic, structured, and deeply recursive — Bayesian DAGs that encode how information flows through the data. Our models are not arbitrary graphs; they have constraints that make them regular and, with the right representation, very fast.
When we began building our engine, we wanted complete control over data layout in memory, so that we could express our structure directly. The best fit wasn’t something new, but something old and disciplined: C. (Technically, C17; it’s not that old. But the language evolves slowly, and the small, deliberate additions like _Static_assert and _Generic helped us out.)
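A flavor of what those small additions buy us (an illustrative sketch, not our production code): _Static_assert turns layout assumptions into compile-time checks, and _Generic lets a single macro dispatch on the type of its argument.

```c
#include <stddef.h>

/* Catch layout assumptions at compile time rather than at runtime. */
struct node_params {
    double alpha;
    double beta;
};
_Static_assert(sizeof(struct node_params) == 2 * sizeof(double),
               "node_params must remain two packed doubles");

/* Pick a printf format string based on the argument's type. */
#define fmt_of(x) _Generic((x), double: "%f\n", int: "%d\n", size_t: "%zu\n")
```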
Struct Meets Structure
C’s design mirrors the structure of our models almost perfectly. We represent each node in a Bayesian DAG as a struct, and each dependency as a pointer — so the graph in memory looks exactly like the one on paper. There’s no abstraction layer to cross, and no framework needed to translate our intent into a data model.
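As a rough sketch (hypothetical names and fields, not our actual model code), a node and its dependencies can be exactly what the sentence above says: a struct, plus pointers to the structs it depends on.

```c
#include <stddef.h>

/* Hypothetical sketch of one node in a Bayesian DAG. The graph in
 * memory is just structs and pointers, with no framework in between. */
typedef struct node {
    double       *values;     /* current state of this node              */
    size_t        n_values;
    struct node **parents;    /* edges of the DAG: direct pointers       */
    size_t        n_parents;
} node;
```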
That simplicity has real advantages. Because our models are acyclic and fixed in shape, our data structures don’t mutate; only their values do. As a result, we can allocate nearly everything at startup and keep it until shutdown. We worked hard to make sure almost nothing ever needs to be reallocated. The result is a codebase that nearly saturates the CPU, is easy to reason about, and performs consistently.
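A minimal sketch of that allocation pattern, assuming the model's shape is known at startup (the names here are hypothetical):

```c
#include <stdlib.h>

/* Everything one pass of inference needs, sized once from the model. */
typedef struct workspace {
    double *counts;      /* fixed-shape buffers, reused every iteration */
    double *weights;
    size_t  n_counts;
    size_t  n_weights;
} workspace;

/* Allocate once at startup; the buffers live until shutdown,
 * so the hot path never calls malloc or free. */
static int workspace_init(workspace *w, size_t n_counts, size_t n_weights)
{
    w->n_counts  = n_counts;
    w->n_weights = n_weights;
    w->counts    = calloc(n_counts,  sizeof *w->counts);
    w->weights   = calloc(n_weights, sizeof *w->weights);
    return (w->counts && w->weights) ? 0 : -1;
}
```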
From Graphs to Loops
By packing nodes into contiguous arrays, we replaced pointer-chasing with stride-based memory access, which modern CPUs handle extremely well. Then we realized we could go even further.
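Roughly, the idea looks like this (a hypothetical layout, not our actual one): node values live in one flat array and edges become integer indices, so a traversal is a scan over contiguous memory rather than a chain of pointer dereferences.

```c
/* Hypothetical flat layout: all node values packed into one array,
 * with each edge stored as an index instead of a pointer. */
typedef struct graph {
    double *value;       /* value[i] for node i, contiguous                    */
    int    *parent;      /* parent[e] = node index at the source of edge e     */
    int    *edge_start;  /* edges of node i are edge_start[i]..edge_start[i+1]-1 */
    int     n_nodes;
} graph;

/* Summing a node's parents is a linear scan over a small, contiguous
 * range of integers: no pointer chasing, and a predictable access pattern. */
static double parent_sum(const graph *g, int i)
{
    double s = 0.0;
    for (int e = g->edge_start[i]; e < g->edge_start[i + 1]; e++)
        s += g->value[g->parent[e]];
    return s;
}
```

Indices are also half the width of pointers on typical 64-bit targets, which helps more of the graph stay resident in cache.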
No matter how complex the original graph, our proprietary algorithm νmix reduces each inference iteration to a two-dimensional loop: one loop over a compact dimension, and one over the number of topics. Careful layout ensures each iteration reads a compact, aligned block that fits entirely in the CPU cache. Those loops are automatically vectorized by the compiler, and their predictability helps out the CPU’s prefetcher.
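We can't show νmix itself, but the shape of the resulting iteration looks something like this sketch, with n_items standing in for the compact dimension and a placeholder body standing in for the real update:

```c
#include <stddef.h>

/* Hypothetical shape of one inference iteration after the reduction:
 * an outer loop over a compact dimension, an inner loop over topics.
 * Both arrays are contiguous, so the inner loop is a unit-stride scan. */
static void iterate(double *restrict out, const double *restrict in,
                    int n_items, int n_topics)
{
    for (int i = 0; i < n_items; i++) {
        const double *row = in  + (size_t)i * n_topics;
        double       *acc = out + (size_t)i * n_topics;
        for (int k = 0; k < n_topics; k++)
            acc[k] += row[k];   /* stand-in for the real update */
    }
}
```

A loop of this shape is exactly what Clang's auto-vectorizer handles well at -O2, with no intrinsics required.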
We also spent time with pen and paper, rewriting our core equations until the inner step reduced to a combination of integer operations and fused multiply-add (FMA) updates.
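As an illustration of the target form (not the actual equations), an update of the form acc + w * x maps directly onto C's standard fma():

```c
#include <math.h>

/* The rewritten inner step: one fused multiply-add per element.
 * fma(w, x, acc) computes acc + w * x with a single rounding. */
static inline double update(double acc, double w, double x)
{
    return fma(w, x, acc);
}
```

With FMA hardware enabled (for example, -march=native on recent x86-64), this typically compiles to a single instruction per update.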
It’s a good reminder that performance often comes not from clever tricks, but from disciplined data layout and from predictable control flow.
Predictability Over Dynamism
C gives us that predictability. We avoid dynamic memory wherever possible, and we don’t rely on garbage collection, reference counting, or hidden allocators. Every object’s lifetime is explicit: if it’s allocated, we know when, why, and how much.
This constraint forces discipline. By designing our system around stable structures, we eliminated entire classes of bugs. It’s a relief to know exactly what the program is doing at every moment. This engine runs every training job and handles nearly every API query, across multiple servers and clouds, wherever data needs to be processed. Despite the code’s complexity and its central role in everything we do, in practice we don’t worry about it at all; it’s by far the most solid part of our entire stack.
Simple to Teach, Simple to Learn
The language has a reputation for being difficult, but in our experience the opposite is true: C is very easy to teach. Its syntax is small, its standard library is compact, and its semantics are straightforward once you understand memory. Moreover, everything in the language corresponds to something concrete: values, pointers, blocks of memory, and function calls. We’ve found that someone with a background in mathematics can start writing great C code in less than a week.
There’s very little to hide or to abstract away. The same directness that makes C efficient for machines also makes it clear for novice programmers. At the same time, LLVM and Clang make advanced work very pleasant. Their static analyzers and sanitizers give us the same visibility we’d expect in a higher-level language, but without sacrificing control.
The Performance of Clarity
We didn’t choose C out of nostalgia. We chose it because, for our problem, clarity and performance turned out to be the same thing.
C makes our models transparent. It forces us to express every dependency, every allocation, every update explicitly — and it rewards that discipline with speed. C’s direct memory model helped us write good code by keeping cause and effect visible: when something is slow, you can see exactly why, and fix it.
Abstraction often promises efficiency, but in our experience it can deliver complexity instead. We’ve found that the simplest tools — when used judiciously — can be the most efficient. C may be old, but it remains a language of great clarity and power.