
How to choose the right sample size for your research
Sample size is one of the most common points of confusion in research planning. Get it wrong in either direction and your findings suffer. Here is how to think about it across surveys, qualitative interviews, and usability testing.
The question that comes up on every project
At some point in almost every research project, someone asks: how many do we need?
For quantitative work, the answer involves statistics. For qualitative work, it involves theory. For usability testing, it involves a body of empirical research that most practitioners have encountered but few have read in full. In all three cases, the reasoning is different, and treating them the same produces either over-engineered studies or findings that cannot support the conclusions drawn from them.
This article lays out how to think about sample size across the three most common research contexts, and why the logic differs between them.
Why sample size matters in both directions
The instinct, usually from clients or stakeholders, is to assume bigger is always better. More data, more confidence. That is true up to a point, but the relationship is not linear.
For a quantitative survey, doubling your sample size does not double your precision. The margin of error shrinks with the square root of sample size; to halve your margin of error, you need to quadruple your sample. At a certain point, the incremental gain in precision is negligible and the cost is not.
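A minimal sketch of that relationship, assuming a 95% confidence level (z ≈ 1.96) and worst-case variability (p = 0.5); the function is illustrative, not from any statistics library:

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level
P = 0.5      # worst-case variability assumption

def margin_of_error(n: int) -> float:
    """Approximate margin of error for a proportion estimated from n responses."""
    return Z_95 * math.sqrt(P * (1 - P) / n)

for n in (100, 400, 1600):
    print(f"n = {n:>5}: margin of error = ±{margin_of_error(n):.1%}")

# Quadrupling the sample roughly halves the margin: ~±9.8%, ~±4.9%, ~±2.5%
```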
In the other direction, an undersized sample produces findings that cannot be defended. A survey of 50 people claiming to represent a population of 50,000 is not a survey. It is an anecdote with a spreadsheet attached.
The goal is not the largest possible sample. It is the smallest sample that will produce findings you can stand behind.
Quantitative surveys: the statistical foundation
For surveys, sample size is a function of three things: how confident you want to be in your results, how much error you are willing to accept, and how variable you expect the responses to be.
Confidence level is typically set at 95%, meaning that if you repeated the same survey many times, about 95% of the resulting intervals would contain the true population value. Some studies use 90% (less stringent, smaller sample) or 99% (more stringent, larger sample).
Margin of error is how much your result can differ from the true population value. A ±5% margin of error means that if 60% of your sample say they prefer option A, the true population figure is somewhere between 55% and 65%. A tighter margin requires a larger sample.
Variability is usually handled conservatively by assuming the maximum (p = 0.5), which produces the largest required sample, unless you have strong prior evidence that responses will cluster at one end. This is the most common approach and the most defensible.
The formula that ties these together is the Cochran formula, which produces the required sample size for an infinite or unknown population. When your population is small and well-defined (say, 500 enterprise IT decision-makers in a specific sector), a finite population correction applies, which reduces the required sample somewhat.
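As an illustration of the arithmetic, here is a sketch of Cochran's formula with an optional finite population correction; the function name and structure are ours, not from any particular library:

```python
import math

# z-scores for common confidence levels
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def cochran_sample_size(confidence: float, margin_of_error: float,
                        p: float = 0.5, population: int | None = None) -> int:
    """Required completed responses via Cochran's formula,
    optionally adjusted with the finite population correction."""
    z = Z_SCORES[confidence]
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)  # infinite-population size
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite population correction
    return math.ceil(n0)

# 95% confidence, ±5% margin, unknown population -> 385 completed responses
print(cochran_sample_size(0.95, 0.05))
# Same settings, but a defined population of 500 decision-makers -> about 218
print(cochran_sample_size(0.95, 0.05, population=500))
```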
One additional factor that is often underestimated: response rate. The sample size formula tells you how many completed responses you need. If you expect 30% of the people you invite to respond, you need to invite roughly three times as many as your target sample. Building this adjustment into planning avoids the common situation where fieldwork closes with too few responses to support the analysis.
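A short follow-on sketch for the gross-up from completed responses to invitations, assuming a single expected response rate:

```python
import math

def invitations_needed(target_completes: int, expected_response_rate: float) -> int:
    """How many invitations are needed to end up with the target number of completes."""
    return math.ceil(target_completes / expected_response_rate)

print(invitations_needed(385, 0.30))  # -> 1284 invitations for 385 completes at a 30% response rate
```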
Qualitative interviews: theoretical saturation, not statistics
Qualitative research does not use statistical formulas to determine sample size. The concept of statistical significance does not apply to unstructured interviews or focus groups. The goal is not to estimate a population parameter but to understand a phenomenon in depth.
The relevant concept is theoretical saturation: the point at which new interviews stop generating new themes or insights. Once you are hearing the same ideas repeatedly and new conversations are confirming rather than extending your understanding, you have reached saturation.
The practical implication is that qualitative sample sizes are determined by research design, not by formulas. The key variables are:
Research goal. Exploratory research (generating hypotheses, mapping a new territory) requires a broader range of perspectives and typically more interviews before saturation is reached. Confirmatory research (testing a specific proposition) reaches saturation faster because you are looking for evidence on a defined question rather than building a picture from scratch.
Audience homogeneity. A homogeneous audience (people with similar roles, backgrounds, and experiences) tends to reach saturation with fewer interviews because the range of perspectives is narrower. A heterogeneous audience requires more interviews to capture the full range of relevant experience.
Research literature (Guest, Bunce & Johnson, 2006; Creswell, 2013) provides evidence-based ranges rather than fixed numbers. The table below summarises typical ranges by research goal and audience type.
| Research goal | Homogeneous audience | Mixed audience | Heterogeneous audience |
|---|---|---|---|
| Exploratory | 6–12 | 12–20 | 20–30 |
| Descriptive | 10–15 | 15–25 | 25–40 |
| Confirmatory | 8–12 | 12–18 | 18–25 |
These are guidelines, not rules. The right approach is to plan for the lower end of the range and continue until you are confident saturation has been reached, not to hit a number and stop regardless of what the data is telling you.
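If it helps to have the table in a machine-readable form for planning, a minimal lookup sketch (ranges copied straight from the table above; names and structure are illustrative) could look like this:

```python
# (research goal, audience type) -> (low, high) typical interview counts
SATURATION_RANGES = {
    ("exploratory", "homogeneous"): (6, 12),
    ("exploratory", "mixed"): (12, 20),
    ("exploratory", "heterogeneous"): (20, 30),
    ("descriptive", "homogeneous"): (10, 15),
    ("descriptive", "mixed"): (15, 25),
    ("descriptive", "heterogeneous"): (25, 40),
    ("confirmatory", "homogeneous"): (8, 12),
    ("confirmatory", "mixed"): (12, 18),
    ("confirmatory", "heterogeneous"): (18, 25),
}

def planned_interviews(goal: str, audience: str) -> tuple[int, int]:
    """Return the typical (low, high) range; plan for the low end and
    keep interviewing until saturation, not until the number is hit."""
    return SATURATION_RANGES[(goal.lower(), audience.lower())]

print(planned_interviews("exploratory", "heterogeneous"))  # (20, 30)
```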
Usability testing: Nielsen’s finding and its limits
Usability testing has its own body of evidence on sample size, most prominently Nielsen’s 2000 finding that five users uncover approximately 85% of usability problems in a typical interface. This has become one of the most cited and most misapplied findings in research practice.
The five-user figure applies to obvious usability problems in a single interface tested with a single user group. It does not apply equally to subtle issues, edge cases, complex systems, or studies that span multiple distinct user groups.
The relevant variables for usability sample size are:
Study goal. Discovery studies (finding problems for the first time) can start lean. Five participants will surface the most critical issues. Validation studies (confirming that previously identified problems have been fixed) benefit from a slightly larger sample to give confidence that the fix holds across a broader range of users.
Problem visibility. Obvious problems (those that most users encounter immediately) are found quickly with small samples. Subtle problems (edge cases encountered by a minority of users) require more sessions to surface reliably.
| Study goal | Obvious problems | Moderate problems | Subtle problems |
|---|---|---|---|
| Discovery | 3–5 | 5–8 | 8–12 |
| Validation | 5–8 | 8–12 | 12–20 |
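The reasoning behind these ranges can be sketched with the problem-discovery model behind Nielsen's figure, where each participant independently encounters a given problem with probability p; Nielsen and Landauer's average estimate was around p = 0.31, and subtle problems have much lower p, which is why the right-hand columns grow:

```python
def problems_found(n_users: int, p: float) -> float:
    """Expected share of problems found by n users if each user
    hits a given problem with probability p: 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n_users

# Typical interface: p around 0.31 (Nielsen & Landauer's average estimate)
print(f"{problems_found(5, 0.31):.0%}")   # roughly 84% of such problems
# A subtle problem hit by far fewer users, e.g. p = 0.10
print(f"{problems_found(5, 0.10):.0%}")   # roughly 41%
print(f"{problems_found(12, 0.10):.0%}")  # roughly 72%
```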
A practical note on usability testing that the numbers alone do not capture: iterative rounds are more efficient than a single large study. Five users, fix the issues, five more users, fix again. Each round targets the problems the previous round surfaced. This approach finds more issues with fewer total sessions than a single large round.
The three logics side by side
```mermaid
flowchart TD
    A[What kind of research?] --> B[Quantitative survey]
    A --> C[Qualitative interviews]
    A --> D[Usability testing]
    B --> B1["Confidence level<br/>Margin of error<br/>Population size"]
    B1 --> B2["Statistical formula<br/>Cochran + FPC"]
    B2 --> B3["Fixed number<br/>of completed responses"]
    C --> C1["Research goal<br/>Audience homogeneity"]
    C1 --> C2["Evidence-based ranges<br/>from saturation theory"]
    C2 --> C3["Range to aim for<br/>stop at saturation"]
    D --> D1["Study goal<br/>Problem visibility"]
    D1 --> D2["Nielsen research<br/>and empirical benchmarks"]
    D2 --> D3["Start lean<br/>iterate in rounds"]
```
What goes wrong in practice
The most common failure modes are worth naming directly.
Applying survey logic to qualitative work. A client who asks for “statistical significance” in a qualitative study is asking for something the methodology cannot provide. Qualitative research produces depth, not precision. The question to answer is whether saturation was reached, not whether the sample is representative in a statistical sense.
Treating Nielsen’s five-user finding as a universal rule. Five users is a starting point for obvious problems in a simple interface. For a complex system with multiple user groups, or for subtle accessibility issues, five users is not enough. The finding is real but it has conditions.
Ignoring response rate in survey planning. A study designed for 400 completed responses that invites 500 people, expecting 80% response, will fall short if the actual response rate is 40%. Response rate assumptions need to be built into the invitation count, not treated as a post-hoc adjustment.
Stopping at a number rather than a state. In qualitative work, the goal is saturation, not a headcount. Stopping at 12 interviews because that was the plan, when the data is still producing new themes at interview 10, produces findings that are thinner than they should be.
A note on what sample size cannot fix
Sample size is a necessary condition for good research, not a sufficient one. A well-sized survey with a poorly designed questionnaire produces precise answers to the wrong questions. A qualitative study with the right number of interviews but a non-purposive sample produces saturated findings that do not represent the relevant population.
Sampling quality (who you include and how you reach them) matters as much as sampling quantity. The two questions to answer before finalising sample size are: how many do I need, and how do I ensure the people I recruit are the right ones?
If you want to work through sample size for a specific project, our sample size calculator covers all three research types with plain-English explanations of the reasoning behind each recommendation.