# Gage R&R: What It Is, How to Run One, and What the Numbers Mean
Before you can improve a process, you have to be able to measure it. And before you trust your measurements, you need to know how much of the variation you are seeing comes from the measurement system itself.
That is what a Gage R&R study does. It separates measurement variation from process variation so you know whether your numbers mean anything.
## The Two Rs
R&R stands for Repeatability and Reproducibility.
**Repeatability** is variation within one operator. If the same person measures the same part ten times with the same gage, how much do the readings vary? This is the gage's own variation, often called equipment variation (EV). A worn instrument, loose fixtures, or poor resolution all show up here.
**Reproducibility** is variation between operators, often called appraiser variation (AV). If three different people measure the same part with the same gage, do they get the same answer? Differences in technique, training, or interpretation show up here.
Together, these two sources make up your measurement system variation (GRR).
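The two sources do not add directly. Assuming they are independent, their variances add, so the combined standard deviation is a root sum of squares. A minimal sketch; the function name and the example values are hypothetical:

```python
import math

def combined_grr(repeatability_sd: float, reproducibility_sd: float) -> float:
    """Combine the two components as variances (root sum of squares)."""
    return math.sqrt(repeatability_sd**2 + reproducibility_sd**2)

# Hypothetical standard deviations, in the same units as the measurement.
grr_sd = combined_grr(repeatability_sd=0.03, reproducibility_sd=0.04)
print(f"GRR standard deviation: {grr_sd:.3f}")
```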
## Why It Matters
Say your process has a total observed variation of 10 units. Your spec tolerance is 20 units. Looks fine, right?
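But if your Gage R&R shows that the measurement system alone accounts for 6 units of that spread, then your actual process variation is smaller than the 10 you observed. Variation combines as a root sum of squares, not a simple sum, so the process itself contributes about 8 units (6² + 8² = 10²). More importantly, with that much measurement noise you might be rejecting good parts or accepting bad ones.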
On the flip side, if your measurement variation is tiny compared to the process, you can trust the data you are collecting on your control charts, capability studies, and experiments.
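To make the arithmetic in the example concrete: assuming the "units of variation" are spreads (multiples of a standard deviation) and that measurement error is independent of the process, the pieces combine as squares. A minimal sketch using the numbers from the example:

```python
import math

total_spread = 10.0        # observed spread from the example above
measurement_spread = 6.0   # portion attributed to the measurement system

# Variances add, so subtract squares rather than the raw values.
process_spread = math.sqrt(total_spread**2 - measurement_spread**2)
share_of_variance = measurement_spread**2 / total_spread**2

print(process_spread)      # 8.0
print(share_of_variance)   # 0.36
```

So the measurement system is 60% of the observed spread but 36% of the observed variance. Which of the two figures a report quotes changes how alarming it looks.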
## How to Run a Gage R&R
The standard approach (crossed Gage R&R):
**Setup:**
- Select 10 parts that span the range of your process (not 10 identical parts)
- Choose 3 operators (the people who normally do the measuring)
- Each operator measures each part 3 times
- That is 10 x 3 x 3 = 90 measurements total
**Critical details:**
- Randomize the measurement order. Do not let operators measure parts 1 through 10 in sequence three times. Mix them up so the operator does not remember previous readings.
- Use parts from actual production. Do not cherry-pick parts that are all the same size.
- Do not tell operators which part number they are measuring. Label the parts with codes, not sequential numbers.
- Measure under normal conditions. If the study is done in a climate-controlled lab but actual measurements happen on the shop floor, your results will not transfer.
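One way to handle the randomization and blind labeling is to generate the run sheet up front. The sketch below is only an illustration: `build_run_sheet`, the `P###` code format, and the operator names are all hypothetical. It assigns each part a random code and gives every operator a fresh random part order for each trial.

```python
import random

def build_run_sheet(n_parts=10, operators=("A", "B", "C"), n_trials=3, seed=None):
    """Return (part_codes, sheet) for a crossed study with blind part labels."""
    rng = random.Random(seed)
    # Random three-digit codes, so a label reveals nothing about part order.
    codes = rng.sample(range(100, 1000), n_parts)
    part_codes = {part: f"P{code}" for part, code in zip(range(1, n_parts + 1), codes)}

    sheet = []
    for trial in range(1, n_trials + 1):
        for operator in operators:
            order = list(part_codes)   # part numbers 1..n_parts
            rng.shuffle(order)         # new random order for every pass
            sheet.extend((trial, operator, part_codes[p]) for p in order)
    return part_codes, sheet

part_codes, sheet = build_run_sheet(seed=42)
print(len(sheet), "measurements planned")
for row in sheet[:5]:
    print(row)
```

Keep the `part_codes` mapping with whoever runs the study, not with the operators.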
## Reading the Results
The key output is **%GRR**, the measurement system variation expressed as a percentage of a reference value. It is reported as %Study Variation or %Tolerance, depending on what you compare against.
| %GRR | Interpretation |
|-------|---------------|
| Under 10% | Measurement system is excellent |
| 10% to 30% | May be acceptable depending on the application |
| Over 30% | Measurement system needs improvement |
These thresholds come from AIAG guidelines and are widely used in automotive and manufacturing. Some industries use tighter or looser standards.
**%Study Variation vs %Tolerance:**
- **%Study Variation** compares GRR to total observed variation. Use this when you want to understand how much of what you see is measurement vs process.
- **%Tolerance** compares GRR to the spec range. Use this when you want to know if the measurement system can tell good parts from bad.
Both are useful. Report both.
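Both ratios can be computed from the same GRR standard deviation. The sketch below assumes a 6-sigma spread for the %Tolerance calculation (some references use 5.15 instead) and applies the thresholds from the table above. The function names and example values are hypothetical.

```python
def percent_study_variation(grr_sd, total_sd):
    """GRR as a percentage of total observed variation."""
    return 100.0 * grr_sd / total_sd

def percent_tolerance(grr_sd, lsl, usl, k=6.0):
    """GRR spread (k standard deviations) as a percentage of the spec range."""
    return 100.0 * k * grr_sd / (usl - lsl)

def classify(pct):
    """Apply the thresholds from the interpretation table."""
    if pct < 10:
        return "excellent"
    if pct <= 30:
        return "may be acceptable"
    return "needs improvement"

# Hypothetical study results.
pct_sv = percent_study_variation(grr_sd=0.05, total_sd=0.25)
pct_tol = percent_tolerance(grr_sd=0.05, lsl=9.0, usl=11.0)
print(f"%Study Variation: {pct_sv:.1f} ({classify(pct_sv)})")
print(f"%Tolerance:       {pct_tol:.1f} ({classify(pct_tol)})")
```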
## What the ANOVA Table Tells You
If you are using the ANOVA method (recommended over the Average and Range method because it also estimates the operator-by-part interaction), you will get an ANOVA table breaking down variation sources:
- **Part-to-Part:** variation between the parts you selected. This should be the dominant source. If it is not, either your parts did not span enough of the process range or the measurement system is too noisy to see real differences between parts.
- **Repeatability:** gage variation (within operator)
- **Reproducibility:** operator variation (between operators)
- **Operator x Part interaction:** some operators may measure certain parts differently than others. If this is significant, there is a technique or training issue tied to specific part characteristics.
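For readers who want to see where these numbers come from, here is a minimal sketch of the variance-component calculation for a balanced crossed study, using the textbook two-way random-effects ANOVA. It makes several assumptions: the function name and the simulated data are hypothetical, the interaction is counted as part of reproducibility (a common convention), negative estimates are clamped to zero, and a non-significant interaction is not pooled into repeatability as some statistical packages do.

```python
import random
from statistics import mean

def grr_variance_components(data):
    """data[i][j][k] is the reading for part i, operator j, trial k."""
    p, o, r = len(data), len(data[0]), len(data[0][0])
    cell_means = [[mean(cell) for cell in part] for part in data]
    part_means = [mean(row) for row in cell_means]
    oper_means = [mean(cell_means[i][j] for i in range(p)) for j in range(o)]
    grand = mean(part_means)

    # Sums of squares for each source in the ANOVA table.
    ss_part = o * r * sum((m - grand) ** 2 for m in part_means)
    ss_oper = p * r * sum((m - grand) ** 2 for m in oper_means)
    ss_cells = r * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_inter = ss_cells - ss_part - ss_oper
    ss_error = sum((x - cell_means[i][j]) ** 2
                   for i in range(p) for j in range(o) for x in data[i][j])

    # Mean squares: sum of squares divided by degrees of freedom.
    ms_part = ss_part / (p - 1)
    ms_oper = ss_oper / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Solve the expected mean squares; clamp negative estimates to zero.
    repeatability = ms_error
    interaction = max(0.0, (ms_inter - ms_error) / r)
    operator = max(0.0, (ms_oper - ms_inter) / (p * r))
    part = max(0.0, (ms_part - ms_inter) / (o * r))
    reproducibility = operator + interaction
    grr = repeatability + reproducibility
    return {"repeatability": repeatability, "reproducibility": reproducibility,
            "grr": grr, "part": part, "total": grr + part}

# Simulated study: 10 parts, 3 operators, 3 trials (made-up numbers).
rng = random.Random(0)
true_parts = [rng.gauss(10.0, 1.0) for _ in range(10)]
oper_bias = [rng.gauss(0.0, 0.1) for _ in range(3)]
data = [[[true_parts[i] + oper_bias[j] + rng.gauss(0.0, 0.2) for _ in range(3)]
         for j in range(3)] for i in range(10)]

vc = grr_variance_components(data)
for source, variance in vc.items():
    print(f"{source:16s} variance = {variance:.4f}")
```

The results are variances. Take square roots before computing %Study Variation or %Tolerance.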
## Number of Distinct Categories (ndc)
This is the number of non-overlapping confidence intervals that span the process variation. In simpler terms, it tells you how many groups your measurement system can reliably distinguish.
- **ndc >= 5** is the goal
- **ndc < 5** means your measurement system cannot reliably sort parts into enough groups for meaningful process control
If your ndc is 2, your measurement system can basically only tell "big" from "small." It cannot distinguish fine gradations, which limits what your SPC charts and capability studies can show you.
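ndc is commonly computed as 1.41 times the ratio of part-to-part standard deviation to GRR standard deviation, truncated to a whole number. A minimal sketch under that assumption; the function name and example values are hypothetical:

```python
import math

def ndc(part_sd: float, grr_sd: float) -> int:
    """Number of distinct categories, truncated to a whole number."""
    return math.floor(1.41 * part_sd / grr_sd)

print(ndc(part_sd=0.24, grr_sd=0.05))  # 6: meets the ndc >= 5 goal
print(ndc(part_sd=0.08, grr_sd=0.05))  # 2: can only tell "big" from "small"
```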
## When It Fails
If your %GRR is too high, here is where to look:
**High repeatability (gage problem):**
- Gage resolution too coarse (rule of thumb: the smallest increment the gage can read should be no more than 1/10th of the tolerance)
- Worn or damaged instrument
- Fixturing does not hold the part consistently
- Environmental factors (temperature, vibration)
**High reproducibility (operator problem):**
- Operators using different techniques
- Unclear measurement instructions
- Different interpretation of where to measure
- Training gaps
**Fix the measurement system before you try to improve the process.** If you cannot measure it right, you cannot improve it right. Running DOE or capability studies on top of a bad measurement system gives you garbage data dressed up in fancy statistics.
## Practical Tips
Start with the measurement systems that matter most: your critical-to-quality characteristics, the ones tied to customer specs or safety requirements. Do not try to do Gage R&R on everything at once.
Run the study annually, or whenever you change gages, train new operators, or modify the measurement procedure.
Keep the raw data. If a question comes up later about measurement capability, you want to be able to pull up the actual study, not just a pass/fail summary.