Introduction/Background We have all been there. As a PhD student, you complete a field season measuring as much as was humanly possible, only for one of your committee members to sagely raise an eyebrow and say, “Why didn’t you measure that? It could confound your entire story!” As a postdoc, you are handed a novel data set with which to make your career, but you swiftly notice it is missing a key confounding variable and you lack a time machine to go back and measure it. As a PI, you are ready to test whether your theories are applicable across large spatial and temporal scales, but measurement of key elements of the system is simply not possible at scale. In each of these cases, unobserved variables can bias estimates from analyses of observational data, and so lead to incorrect conclusions, whether we are dealing with known unknowns or unknown unknowns. The challenges posed by confounding variables are one of the primary reasons for the primacy of experiments over observations as valid pathways to causal knowledge in ecology. Can we derive clean, clear causal inference from observational data in the face of such problems? Here, we both demonstrate how omitted variables can create biased estimates in analyses of observational data and discuss methods developed in other fields to eliminate the problem of omitted variable bias.
Results/Discussion We show how to use Directed Acyclic Graphs (DAGs) to identify possible confounding variables and how to remove their influence. We show how techniques such as group mean centering, Mundlak devices, fixed effects (which deserve renewed appreciation), differencing, and more can flexibly account for omitted variables in any study design. Rather than presenting these as specialized techniques, we demonstrate how these solutions grow naturally from DAGs, leading to a more flexible way of handling omitted variable bias. Finally, we show how these readily available approaches can resolve problems caused by omitted variable bias in both a simulated intertidal data set and a real-world kelp forest data set. We hope that this will introduce ecologists to new techniques for their arsenal, helping them make the most of hard-won observational data in causal analysis.
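To make the idea concrete, the following is a minimal sketch of omitted variable bias and two of the remedies named above, group mean centering and a Mundlak device. It is not the simulated intertidal data set or the analysis from this work. The variable names (`site`, `u`, `x`, `y`), the effect sizes, and the sample sizes are all hypothetical, and the sketch assumes the unmeasured confounder varies only among sites, not within them.

```python
import numpy as np

# Hypothetical simulation: all names and parameter values are assumptions.
rng = np.random.default_rng(42)
n_sites, n_per_site = 50, 20
true_effect = 1.0

site = np.repeat(np.arange(n_sites), n_per_site)

# Unobserved site-level confounder: drives both the predictor and the response.
u = rng.normal(0, 1, n_sites)[site]
x = 1.0 * u + rng.normal(0, 1, site.size)
y = true_effect * x + 2.0 * u + rng.normal(0, 1, site.size)


def ols(columns, response):
    """Least-squares coefficients, intercept first, then one per column."""
    design = np.column_stack([np.ones(len(response))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


# Naive model: y ~ x, with u omitted. The slope absorbs part of u's effect.
naive_slope = ols([x], y)[1]

# Group mean centering: split x into a site mean and a within-site deviation.
x_site_mean = (np.bincount(site, weights=x) / np.bincount(site))[site]
x_within = x - x_site_mean
gmc_slope = ols([x_within, x_site_mean], y)[1]

# Mundlak device: keep raw x, but add its site mean as an extra covariate.
mundlak_slope = ols([x, x_site_mean], y)[1]

print(f"true effect:         {true_effect:.2f}")
print(f"naive slope:         {naive_slope:.2f}")
print(f"group-mean-centered: {gmc_slope:.2f}")
print(f"Mundlak device:      {mundlak_slope:.2f}")
```

Under these assumptions, the within-site deviation of `x` cannot be correlated with anything that is constant within a site, so the centered and Mundlak slopes are shielded from `u`, while the naive slope is not. The two corrected models are reparameterizations of one another, which is why they return the same coefficient on `x`.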