The word “theory” means something very different in science than it does in everyday speech. In casual use, a theory is a guess, an untested hunch. In science, a theory is the highest category of explanation: a well-tested framework that accounts for a broad range of observations, makes specific predictions, and has survived repeated attempts at falsification. Confusing these two meanings creates one of the most persistent misunderstandings about how science works.
When someone says “evolution is just a theory,” they mean “just a guess.” Scientists who study evolution hear something different: evolution is a well-tested explanatory framework supported by converging evidence from paleontology, genetics, comparative anatomy, direct observation, and molecular biology. The word “theory” in that sentence is an endorsement, not a doubt.
What, then, separates a scientific theory from a non-scientific one? The answer involves several overlapping criteria, none of which is absolutely decisive on its own, but which together define the practice of science.
Testability and Falsifiability
The most important criterion for a scientific claim is that it must be testable, and specifically, it must be possible to test it in a way that could show it to be false. This principle is most associated with philosopher Karl Popper, who introduced the concept of falsifiability in the 1930s.
Falsifiability does not mean a claim has been falsified. It means that specific observations, if they occurred, would count as evidence against it. The theory of general relativity predicts that light bends around massive objects by a specific amount. If astronomers had measured a different amount during the 1919 solar eclipse observations, that would have been evidence against general relativity. It is falsifiable.
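To make the “specific amount” concrete, here is a minimal sketch in Python using the standard textbook formula for the deflection of a light ray grazing the Sun, 4GM/(c²R). The constants are rounded reference values. The `is_consistent` helper and the sample measurement are hypothetical, included only to show what a falsifying result would look like; they are not the 1919 data.

```python
import math

# Rounded physical constants in SI units (assumed reference values).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg
R_SUN = 6.957e8      # solar radius, m
C = 2.998e8          # speed of light, m/s

RAD_TO_ARCSEC = 180 / math.pi * 3600


def deflection_arcsec(mass_kg, impact_radius_m, factor=4.0):
    """Deflection angle of light passing a mass at the given distance.

    factor=4 gives the general-relativistic result; factor=2 gives the
    older Newtonian-style estimate, which is half as large.
    """
    angle_rad = factor * G * mass_kg / (C**2 * impact_radius_m)
    return angle_rad * RAD_TO_ARCSEC


def is_consistent(predicted, measured, uncertainty, n_sigma=2.0):
    """Hypothetical check: does a measurement agree with a prediction?"""
    return abs(measured - predicted) <= n_sigma * uncertainty


gr_prediction = deflection_arcsec(M_SUN, R_SUN)                   # about 1.75 arcsec
newtonian_estimate = deflection_arcsec(M_SUN, R_SUN, factor=2.0)  # about 0.88 arcsec

# Made-up measurement, for illustration only: 0.9 +/- 0.1 arcsec
# would have counted as evidence against general relativity.
print(is_consistent(gr_prediction, measured=0.9, uncertainty=0.1))  # False
```

The point of the sketch is that the prediction is a number, so there are possible measurements that disagree with it. That is what makes the claim falsifiable.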
A claim like “everything happens for a reason” or “the universe was created by an undetectable intelligence” is not falsifiable because no observation could, even in principle, count as evidence against it. Any result can be explained as consistent with the claim. This makes such claims non-scientific: not necessarily wrong, but not operating by the rules of science.
Importantly, falsifiability is not a perfect boundary. Some scientific theories are difficult to test directly. String theory and certain multiverse models make predictions that may not be empirically accessible with current or foreseeable technology. Popper himself acknowledged that the boundary between science and non-science is not perfectly sharp. The demarcation problem (drawing the exact line between science and non-science) remains a genuine philosophical challenge.
Predictive Power
A strong scientific theory does not merely explain what we already know; it predicts what we have not yet observed. When a theory correctly predicts novel phenomena, it provides much stronger support than when it merely accommodates existing data.
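General relativity accounted for the anomalous precession of Mercury’s perihelion, a discrepancy known since the mid-nineteenth century that Newtonian gravity could not explain. More strikingly, it predicted phenomena that no one had yet observed. It predicted gravitational waves, which were not directly detected until 2015, a century after Einstein’s prediction. It implied the existence of black holes, an idea that was widely resisted until the observational evidence became overwhelming.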
The germ theory of disease predicted that infections could be transmitted between organisms and prevented by interrupting the transmission chain. These predictions were not only confirmed; they became the basis for modern medicine, surgery, and public health.
A theory that can predict new observations, and whose predictions are confirmed, earns credibility that accumulates with each successful test. A theory that is revised to accommodate every new piece of data after the fact (without ever having predicted the data in advance) is weaker, even if it fits all known observations.
Explanatory Scope and Parsimony
Scientific theories are evaluated partly by how much they explain with how little they assume. Two theories that both fit all known observations are not equally good if one requires far more ad hoc assumptions than the other. This is the principle of parsimony, often called Occam’s Razor: prefer the simpler explanation when the evidence does not compel the more complex one.
The heliocentric model of the solar system was not accepted solely because it fit the data. Early versions of the heliocentric model, as proposed by Copernicus, were not significantly better at predicting planetary positions than the geocentric model with epicycles. It was the subsequent development of Kepler’s elliptical orbits and Newton’s gravitational theory that made heliocentrism clearly superior, not just as a fit to observations, but as a physically coherent framework that explained why orbits have the shapes they have.
A scientific theory should ideally explain diverse phenomena under a single unifying framework. Newton’s laws of motion and gravitation explained both falling apples and planetary orbits. Darwin’s theory of natural selection explained both the fossil record and the distribution of species across geography. Einstein’s general relativity accounts for planetary orbits, the bending of light by massive objects, and the large-scale expansion of the universe.
Evidence and Reproducibility
Scientific claims must be supported by evidence, specifically, by evidence that is objective (not dependent on who is doing the observation), reproducible (the same results can be obtained by independent investigators), and proportionate to the claim. Extraordinary claims require extraordinary evidence.
Reproducibility is a foundation of scientific trust. If an experiment cannot be replicated by independent researchers, its results are unreliable regardless of how impressive they appeared initially. The replication crisis that emerged in psychology and social science in the 2010s revealed that many published findings could not be reproduced, prompting a reexamination of statistical practices and publication incentives across sciences.
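One way to see why statistical practices matter for reproducibility is a simplified calculation in the spirit of Ioannidis (2005): the share of statistically significant findings that reflect a real effect. The Python sketch below is only an illustration. It ignores bias and multiple testing, the function name is hypothetical, and the input values are assumptions chosen for the example, not estimates for any real field.

```python
def positive_predictive_value(prior, power, alpha):
    """Fraction of statistically significant findings that are true.

    prior: assumed share of tested hypotheses that are actually true
    power: probability that a study detects a true effect
    alpha: false-positive rate (the significance threshold)
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Illustrative inputs only: 10% of tested hypotheses true, 5% threshold.
for power in (0.2, 0.5, 0.8):
    ppv = positive_predictive_value(prior=0.1, power=power, alpha=0.05)
    print(f"power={power:.1f}  PPV={ppv:.2f}")
```

Under these assumed inputs, fewer than a third of significant results are true at 20% power, and about 64% at 80% power. Low-powered studies of unlikely hypotheses can therefore yield many findings that fail to replicate even when every individual study is conducted honestly.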
Evidence must be distinguished from anecdote. A single observation, however compelling, is weak evidence for a general claim. Pattern recognition across multiple independent lines of evidence (converging on the same conclusion) is much stronger. The evidence for evolution, for example, includes the fossil record, comparative genomics, observed speciation events, geographic distribution of species, molecular phylogenetics, and direct experimental evolution. These lines of evidence are independent of each other and yet converge on the same explanation.
The Difference Between a Theory and a Law
Scientific laws and theories are often confused, with laws portrayed as somehow more certain or fundamental than theories. This is a misconception. Laws describe what happens under specific conditions, usually expressed mathematically. Theories explain why it happens.
Newton’s law of gravitation describes the mathematical relationship between mass, distance, and gravitational force. General relativity is the theory that explains why that relationship holds, to a very good approximation, and what gravity actually is. The law is a description; the theory is an explanation. Neither is more “proven” than the other; both are scientific claims supported by evidence.
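As a small illustration of what “a description” means here, Newton’s law can be written as a one-line function: it returns how strong the force is and is silent about why the force exists. This is a minimal Python sketch; the constants are rounded reference values and the function name is chosen for the example.

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2 (rounded)


def gravitational_force(m1_kg, m2_kg, distance_m):
    """Newton's law of gravitation: F = G * m1 * m2 / r^2, in newtons."""
    return G * m1_kg * m2_kg / distance_m**2


# Rounded values for Earth, with a 1 kg test mass at the surface.
EARTH_MASS = 5.972e24    # kg
EARTH_RADIUS = 6.371e6   # m

surface_force = gravitational_force(EARTH_MASS, 1.0, EARTH_RADIUS)
print(surface_force)  # about 9.8 N
```

Nothing in the function says what gravity is or why the force falls off with the square of the distance. Supplying that explanation is the job of the theory.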
What Science Cannot Do
Scientific methods are extraordinarily powerful within their domain, but that domain has boundaries. Science addresses empirical questions: questions answerable by observation and experiment. It cannot directly settle questions of ultimate purpose, moral obligation, aesthetic value, or fundamental metaphysics.
What Makes a Theory Scientific: A Working Definition
No single criterion fully captures what makes a theory scientific, but the combination of falsifiability, empirical support, predictive power, and coherence with established science comes closest to a working definition. The boundary is not perfectly sharp; philosophers of science have debated it for over a century, with no answer satisfying everyone. But the core is clear: a scientific theory must make predictions that could, in principle, be tested and falsified.
A common mistake is to conflate “not scientific” with “false.” Many important questions (why suffering matters morally, what makes a life meaningful, whether mathematical objects exist independently of minds) are not scientific questions. Science does not dismiss them. It simply does not have tools for them. Grasping these boundaries is not just an academic exercise. It is the basis for evaluating empirical claims in every field, from medicine to climate science to cosmology.
What is a scientific theory?
In science, a theory is a well-tested explanatory framework that accounts for a wide range of observations, makes specific testable predictions, and has survived repeated attempts at falsification. A scientific theory is not a guess; it is the highest explanatory category in science. Evolution, germ theory, and general relativity are all scientific theories in this sense.
What is falsifiability and why does it matter?
Falsifiability, introduced by philosopher Karl Popper, is the principle that a scientific claim must be testable in a way that could, in principle, show it to be false. A claim is falsifiable if there exists some possible observation that would count as evidence against it. If no possible observation could challenge a claim, it is unfalsifiable and therefore not scientific: not necessarily wrong, but not operating within the methods of empirical science.
Is there a difference between a scientific theory and a scientific law?
Yes. A scientific law describes a regular relationship or pattern, often expressed mathematically (e.g., Newton’s law of gravitation). A scientific theory explains why that pattern exists and provides a mechanistic framework. Laws describe; theories explain. Neither is more certain than the other; both are supported by evidence and can in principle be revised.
Can science prove anything?
Science does not prove claims in the mathematical sense. It tests claims by gathering evidence, which can increase or decrease confidence in a claim. No amount of positive evidence can make a scientific claim certain, because future evidence could in principle falsify it. This is sometimes called fallibilism, the recognition that even well-established scientific knowledge is in principle open to revision. This is a strength of science, not a weakness.
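One common way to formalize “evidence can increase or decrease confidence” is Bayesian updating. That framing is an assumption brought in for illustration, not something the discussion above depends on, and Popper himself was skeptical of probabilistic accounts of confirmation. In the Python sketch below, the starting confidence and all the likelihood values are made-up numbers.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: confidence in a claim after one new piece of evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


confidence = 0.5  # assumed starting confidence

# Ten successful predictions; each result is assumed to be three times
# more likely if the theory is true (0.9) than if it is false (0.3).
for _ in range(10):
    confidence = update(confidence, 0.9, 0.3)
after_successes = confidence
print(after_successes < 1.0, round(after_successes, 5))  # True 0.99998

# One failed prediction, assumed far more likely if the theory is false.
after_failure = update(after_successes, 0.1, 0.7)
print(after_failure < after_successes)  # True
```

Under these assumptions, confidence climbs toward 1 with each success but never reaches it, and a single failed prediction pulls it back down.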
What is the demarcation problem?
The demarcation problem is the philosophical challenge of drawing a precise boundary between science and non-science. While falsifiability is the most commonly cited criterion, some genuine science (like aspects of cosmology or string theory) is difficult to test directly, and some non-science is technically falsifiable but does not operate as science in practice. Many philosophers of science now favor a family-resemblance approach: scientific claims share a cluster of features (testability, explanatory power, reproducibility, parsimony) rather than meeting a single necessary condition.
Sources
Popper, K.R. (1959). The Logic of Scientific Discovery. Hutchinson.
Kuhn, T.S. (1962). The Structure of Scientific Revolutions. University of Chicago Press.
Lakatos, I. (1978). The Methodology of Scientific Research Programmes. Cambridge University Press.
Sagan, C. (1995). The Demon-Haunted World: Science as a Candle in the Dark. Random House.
Ioannidis, J.P.A. (2005). Why Most Published Research Findings Are False. PLOS Medicine, 2(8), e124. doi:10.1371/journal.pmed.0020124
Soler, L., Zwart, S., Lynch, M., & Israel-Jost, V. (Eds.) (2014). Science After the Practice Turn in the Philosophy, History, and Social Studies of Science. Routledge.
