Your Cpk Is a Point Estimate. That Should Terrify You.
There is a particular species of confidence that pervades the quality engineering profession, and it tends to show up most reliably in the moment a Cpk value is reported to a customer, stamped onto a PPAP, or projected in front of a room full of people who have never once asked **how much they should trust that number.**
The number is 1.47. Or 1.62. Or 0.98. It is delivered with a precision that suggests certainty, interpreted with a finality that forecloses further inquiry, and used to make decisions whose financial and safety consequences have nothing whatsoever to do with how that number was derived.
This is the state of process capability analysis in most organizations today, and it is built on a fiction.
---
## The Fiction
A Cpk of 1.47 calculated from 30 subgroups of size 5 is not a fact about a process. It is *an estimate, derived from a sample, subject to uncertainty that is almost never quantified and almost never communicated.*
The frequentist confidence interval for that 1.47 at 95% confidence is approximately **1.20 to 1.74**. That is not a narrow range. A Cpk of 1.20 and a Cpk of 1.74 imply fundamentally different defect rates, fundamentally different risk profiles, and fundamentally different answers to the question *"is this process capable?"*
At 1.74, you are comfortably above the 1.33 threshold and probably sleeping well. At 1.20, you are barely capable and the expected defect rate has increased by an order of magnitude. The point estimate told you neither of these things. It told you 1.47, and you moved on.
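For reference, the width of that interval comes from a standard large-sample approximation (often attributed to Bissell), where $n$ is the total number of observations:

$$
\hat{C}_{pk} \;\pm\; z_{1-\alpha/2}\,\sqrt{\frac{1}{9n} + \frac{\hat{C}_{pk}^{2}}{2(n-1)}}
$$

Different packages use slightly different degrees of freedom for the within-subgroup standard deviation, which is why quoted intervals for the same data can disagree at the second decimal.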
> A point estimate with no interval is an opinion dressed as a measurement.
---
## Why This Happens
The answer is mundane and therefore embarrassing: the tools don't make it easy.
Minitab will compute a confidence interval for Cpk if you click Options and select "Include confidence intervals" — a setting that is off by default and that most users have never found. JMP buries the calculation behind a similar workflow. Excel has no native capability analysis at all, and the spreadsheet heroes who have built their own calculators have almost universally omitted interval estimation because the formulas are not straightforward.
The result is an entire profession making decisions on point estimates because the software they pay $1,600/year for has defaulted away from showing them how uncertain those estimates actually are.
And the confidence interval, when it *is* computed, has its own problems.
---
## The Problem with Frequentist Confidence Intervals for Cpk
The standard confidence interval for Cpk relies on an approximation that assumes:
- The process is in a state of statistical control
- The underlying data are normally distributed
- The sample size is large enough for the chi-squared approximation to hold
These are reasonable assumptions in a textbook. They are **rarely verified in practice** and frequently violated in the messy reality of production data, where shifts drift, operators rotate, raw material lots change, and the distribution has tails that no one has looked at since the initial capability study.
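Verifying at least the normality assumption takes a few lines. A minimal sketch in Python (the `x` array here is simulated as a stand-in for your raw measurements):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Stand-in for your raw measurements; replace with the real data
x = np.random.default_rng(0).normal(loc=25.000, scale=0.0075, size=150)

stat, p = stats.shapiro(x)                # Shapiro-Wilk normality test
print(f"Shapiro-Wilk p = {p:.3f}")        # a small p-value flags non-normality

stats.probplot(x, dist="norm", plot=plt)  # normal Q-Q plot: inspect the tails
plt.show()
```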
More fundamentally, the frequentist confidence interval answers a question that nobody is actually asking. It says: *"If I repeated this sampling procedure an infinite number of times, 95% of the resulting intervals would contain the true Cpk."*
The quality engineer standing in front of the customer does not have infinite repetitions. They have one dataset and one decision to make right now.
What they actually want to know is: *"Given my data, what is the probability that the true Cpk exceeds 1.33?"*
That is a Bayesian question. And it has a Bayesian answer.
---
## Bayesian Cpk: What You Actually Wanted All Along
A Bayesian approach to process capability replaces the point estimate with a **posterior distribution** — a full probability distribution over the possible values of Cpk, given the observed data and whatever prior knowledge exists about the process.
From this posterior, you can extract (a minimal code sketch follows this list):
- A **credible interval** — the range within which the true Cpk falls with a given probability (e.g., there is a 95% probability that the true Cpk lies between 1.18 and 1.71). Unlike the frequentist confidence interval, this statement means exactly what it sounds like it means.
- **P(Cpk > 1.33)** — the probability that the process is actually capable. Not "the process is capable" or "the process is not capable," but a number between 0 and 1 that reflects the strength of evidence. A process with P(Cpk > 1.33) = 0.92 is telling you something materially different from one with P(Cpk > 1.33) = 0.54, and yet both might report the same point estimate.
- **Decision-relevant risk quantification.** If you are shipping parts to an automotive OEM with a contractual Cpk ≥ 1.67 requirement, you can compute P(Cpk > 1.67) directly from the posterior. If the answer is 0.38, you know exactly where you stand — and you know it *before* the customer's incoming inspection tells you the hard way.
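Here is a minimal sketch of how such a posterior can be computed for normally distributed data, using a conjugate Normal-Inverse-Gamma prior so that no MCMC machinery is needed. The function name, prior family, and hyperparameter defaults are illustrative choices, not a prescription:

```python
import numpy as np
from scipy import stats

def bayesian_cpk(x, lsl, usl, mu0=None,
                 kappa0=1e-3, alpha0=1e-3, beta0=1e-6,
                 n_draws=100_000, seed=0):
    """Posterior draws of Cpk for normal data under a conjugate
    Normal-Inverse-Gamma prior on (mu, sigma^2). The tiny default
    hyperparameters make the prior weakly informative; beta0 should
    stay small relative to the variance of your data."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n, xbar = x.size, x.mean()
    ss = ((x - xbar) ** 2).sum()
    if mu0 is None:
        mu0 = 0.5 * (lsl + usl)        # neutral default: midpoint of the spec
    # Standard conjugate posterior updates
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2 * kappa_n)
    # Joint draws: sigma^2 ~ Inverse-Gamma, then mu | sigma^2 ~ Normal
    sigma2 = stats.invgamma.rvs(a=alpha_n, scale=beta_n,
                                size=n_draws, random_state=rng)
    sigma = np.sqrt(sigma2)
    mu = rng.normal(mu_n, sigma / np.sqrt(kappa_n))
    # Each (mu, sigma) draw implies one Cpk draw
    return np.minimum(usl - mu, mu - lsl) / (3 * sigma)
```

Every posterior summary in this article (the credible interval, P(Cpk > 1.33), the posterior mean) is then one line of arithmetic on the returned array of draws.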
> A Bayesian credible interval says "there is a 95% probability the true value is in this range." A frequentist confidence interval does NOT say this, despite the fact that almost everyone interprets it this way. This distinction is not academic. It is the difference between an answer to the question you asked and an answer to a question you didn't.
---
## The Prior Is Not a Dirty Word
The most common objection to Bayesian methods in manufacturing is the prior. The argument goes: *"Bayesian statistics is subjective because it requires you to specify a prior distribution, and I want my analysis to be objective."*
This argument sounds reasonable and is almost entirely wrong.
First, the notion that a frequentist Cpk analysis is "objective" requires you to ignore the fact that someone chose the sample size, the subgroup structure, the sampling frequency, and the time window. These are all judgment calls that influence the result. The prior makes the judgment **explicit and auditable** rather than buried in decisions nobody documented.
Second, weakly informative priors — priors that express *"the process probably has a Cpk somewhere between 0 and 3"* — have negligible influence on the posterior once you have even a modest amount of data. With 50+ observations, the data overwhelms the prior. The prior is not driving the conclusion. The data is.
Third, and most importantly for practitioners: you already have prior knowledge and you are already using it. When a process engineer looks at a Cpk of 1.47 and says *"that seems about right for this machine"*, they are applying a prior. When a quality manager sees a Cpk of 3.8 and says *"that can't be right, check the data"*, they are applying a prior. The difference is that they are applying it informally, after the analysis, without documentation, and without any mechanism to integrate it with the evidence.
Bayesian capability analysis simply moves that judgment to where it belongs: *before* the analysis, documented, transparent, and subject to the same rigor as the data itself.
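To see the "data overwhelms the prior" claim concretely, here is a sketch comparing a near-flat prior against a deliberately skeptical one on the same simulated data (all numbers are illustrative; `bayesian_cpk` is the sketch from earlier):

```python
rng = np.random.default_rng(1)
x = rng.normal(10.00, 0.05, size=60)      # 60 simulated measurements

# Near-flat prior vs. a skeptical prior that says "sigma is probably
# around 0.06": Inverse-Gamma(3, 2 * 0.06**2) has prior mean sigma^2
# of 0.06**2, and kappa0=5 counts the prior as worth ~5 observations.
flat = bayesian_cpk(x, lsl=9.70, usl=10.30)
skeptical = bayesian_cpk(x, lsl=9.70, usl=10.30,
                         kappa0=5.0, alpha0=3.0, beta0=2 * 0.06**2)

print(f"P(Cpk > 1.33), near-flat prior : {(flat > 1.33).mean():.3f}")
print(f"P(Cpk > 1.33), skeptical prior : {(skeptical > 1.33).mean():.3f}")
# With 60 observations the two probabilities nearly coincide:
# the likelihood dominates both priors.
```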
---
## What This Looks Like in Practice
Consider a process with 25 subgroups of size 5, a specification of 25.000 ± 0.030 mm, and a reported Cpk of 1.34.
The classical analysis says: *"Cpk = 1.34. The process is capable."*
The Bayesian analysis says:
- **Posterior mean Cpk:** 1.31
- **95% credible interval:** [1.06, 1.58]
- **P(Cpk > 1.33):** 0.47
That is a 47% probability of meeting the 1.33 threshold. Not "capable." Not "not capable." Forty-seven percent. A coin flip with a slight bias toward disappointment.
The classical point estimate and the Bayesian posterior mean are close. The difference is that the Bayesian analysis told you **what you didn't know** and the classical analysis hid it behind a single number.
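An analysis of this shape can be reproduced with the sketch from earlier. The data below is simulated as a stand-in for the real measurements, so the exact output will differ from the numbers above, but the structure of the answer is the same:

```python
rng = np.random.default_rng(42)
# Stand-in for 25 subgroups of 5: 125 values near nominal, with sigma
# chosen so the point-estimate Cpk lands around 1.3
data = rng.normal(25.000, 0.0075, size=125)

cpk = bayesian_cpk(data, lsl=24.970, usl=25.030)
lo, hi = np.percentile(cpk, [2.5, 97.5])
print(f"Posterior mean Cpk   : {cpk.mean():.2f}")
print(f"95% credible interval: [{lo:.2f}, {hi:.2f}]")
print(f"P(Cpk > 1.33)        : {(cpk > 1.33).mean():.2f}")
```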
Now consider what happens when you collect more data. With 100 subgroups of size 5:
- **Posterior mean Cpk:** 1.36
- **95% credible interval:** [1.24, 1.49]
- **P(Cpk > 1.33):** 0.64
The interval has tightened. The probability has increased. You are watching your uncertainty shrink in real time as data accumulates. The Bayesian framework rewards you for collecting more evidence and shows you exactly how much each additional sample buys.
This is what rational decision-making looks like. This is what a measurement system for capability should have been doing all along.
---
## The Sample Size Problem Nobody Talks About
Quality professionals routinely argue about minimum sample sizes for capability studies. Some organizations require 30 samples. Some require 50. Automotive PPAP guidelines suggest specific subgroup counts. None of these conventions are derived from a principled analysis of how much uncertainty the organization is willing to accept.
The Bayesian framework dissolves this argument entirely.
With a Bayesian posterior, you can compute the width of the credible interval at any sample size. If your organization requires P(Cpk > 1.33) ≥ 0.90 before shipment, you collect data until the posterior meets that threshold. The "minimum sample size" is not a convention — it is a consequence of your risk tolerance and the variability of your process.
A noisy process requires more data to achieve the same level of certainty. A tight process achieves it faster. The framework adjusts automatically because it is doing what statistics was always supposed to do: **quantifying what you know and what you don't.**
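As a sketch of what that stopping rule looks like in code (the `measure_subgroup` callable is hypothetical, a stand-in for however your measurements actually arrive; `bayesian_cpk` is the earlier sketch):

```python
def collect_until_capable(measure_subgroup, lsl, usl,
                          threshold=1.33, target=0.90, max_n=500):
    """Keep sampling until P(Cpk > threshold) reaches the target
    probability, or until a hard cap on sample size is hit."""
    x = []
    while len(x) < max_n:
        x.extend(measure_subgroup())          # e.g. one subgroup of 5 parts
        if len(x) >= 20:                      # don't decide on almost no data
            cpk = bayesian_cpk(x, lsl, usl)
            if (cpk > threshold).mean() >= target:
                return len(x), True           # evidence threshold met
    return len(x), False                      # cap hit: capability not shown
```

A tight process exits this loop early and a noisy one exits late, which is exactly the adjustment described above.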
> If you cannot articulate the uncertainty in your capability estimate, then you do not have a capability estimate. You have a number.
---
## TL;DR
A Cpk point estimate without a measure of uncertainty is an incomplete analysis being used to make complete decisions. This is not a theoretical complaint. It is a structural deficiency in how the profession operates, enabled by tools that default to hiding the uncertainty and a culture that has never demanded otherwise.
Bayesian process capability analysis provides:
1. A full posterior distribution over Cpk — not a single number, but a quantification of everything the data supports
2. Credible intervals that mean what people think confidence intervals mean
3. Direct computation of P(Cpk > threshold) for any threshold your customer, standard, or contract requires
4. A principled framework for deciding when you have enough data, driven by your risk tolerance rather than arbitrary convention
The math is not new. Bayesian capability assessment has been published in peer-reviewed literature since the early 2000s. What is new is that the tools to perform it no longer require a PhD, a MATLAB license, or a weekend of programming.
The question is not whether this approach is better. The question is how long you intend to keep making decisions on point estimates while pretending the uncertainty doesn't exist.
---
### Try Svend's Bayesian Cpk Calculator
Compute P(Cpk > 1.33) with credible intervals — free, no login required. Or go deeper with the full [Bayesian SPC suite](/app/spc/) in the platform.
[Free Bayesian Cpk Calculator →](/tools/bayesian-cpk-calculator/)
[Classical vs Bayesian SPC — Full Comparison →](/classical-vs-bayesian-spc/)
[Svend vs Minitab →](/svend-vs-minitab/)