Confounding

Short description: Bias in causal inference

In causal inference, a confounder is traditionally understood to be a variable that (1) independently predicts the outcome (or dependent variable), (2) is associated with the exposure (or independent variable), and (3) is not on the causal pathway between the exposure and the outcome.^[1]^[2]^[3] Failure to control for a confounder results in a spurious association between exposure and outcome.

Confounding is a causal concept rather than a purely statistical one, and therefore cannot be fully described by correlations or associations alone^[4]. The presence of confounders helps explain why correlation does not imply causation, and why careful study design and analytical methods (such as randomization, statistical adjustment, or causal diagrams) are required to distinguish causal effects from spurious associations.

Several notation systems and formal frameworks, such as causal directed acyclic graphs (DAGs), have been developed to represent and detect confounding, making it possible to identify when a variable must be controlled for in order to obtain an unbiased estimate of a causal effect.

Confounders are threats to internal validity^[5].

Definition

Confounding is defined in terms of the data generating model. Let X be an exposure (or independent variable), and let Y be the outcome (or dependent variable). Traditionally, a variable Z was considered to confound the relationship between X and Y if Z (1) independently predicts Y, (2) is associated with X, and (3) is not on the causal pathway between X and Y^[1]^[2]^[3]. Not controlling for Z introduces a spurious relationship between X and Y.

However, several developments in causal inference over the past decades have shown that this definition of confounding is inadequate^[6]^[7]. This is because there can be pre-exposure variables associated with the outcome that, when controlled for, introduce rather than eliminate bias.

Modern causal inference therefore typically defines a confounder in terms of a minimally sufficient adjustment set.^[8]^[9] Formally, a set of variables Z is a sufficient adjustment set for the effect of X on Y if, conditional on Z, the potential outcomes are independent of X. That is, after adjusting for Z, the exposed and unexposed groups are exchangeable with respect to the outcome. A minimally sufficient adjustment set is a sufficient adjustment set Z in which every member is required to control for confounding. Under this framework, a confounder is defined as a member of a minimally sufficient adjustment set.
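
The sufficiency condition can be written compactly in potential-outcome notation. The symbol $Y^{x}$, denoting the outcome that would be observed if the exposure were set to level x, is an assumption introduced here for illustration:

$Y^{x} \perp\!\!\!\perp X \mid Z \quad \text{for every exposure level } x$

Read as a sketch of the definition rather than a full formal statement: within each level of Z, knowing which exposure a unit actually received carries no further information about what its outcome would have been under exposure x.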

In the language of directed acyclic graphs, confounding corresponds to the presence of one or more open backdoor paths between X and Y ^[10]. A set of variables Z is a sufficient adjustment set if conditioning on Z blocks all backdoor paths from X to Y. The set is minimally sufficient if no proper subset of Z satisfies this property. Removing any variable from a minimally sufficient set reopens at least one backdoor path.
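
The path-blocking part of this definition can be sketched in code. The function below is a minimal illustration, not a reference implementation: the name `backdoor_blocked`, the edge-set representation, and both example graphs are hypothetical. It enumerates simple paths that start with an arrow into X and applies the usual blocking rules (a non-collider blocks when it is in Z; a collider blocks unless it, or one of its descendants, is in Z). It does not check the additional requirement that Z contain no descendants of X.

```python
def backdoor_blocked(edges, x, y, z):
    """True if conditioning on z blocks every backdoor path from x to y.

    edges is a set of (parent, child) pairs describing a DAG.
    """
    z = set(z)
    nodes = {n for edge in edges for n in edge}
    children = {n: {c for p, c in edges if p == n} for n in nodes}
    parents = {n: {p for p, c in edges if c == n} for n in nodes}

    def descendants(node):
        seen, stack = set(), [node]
        while stack:
            for child in children[stack.pop()]:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    def path_open(path):
        # Inspect every intermediate node together with its two neighbours.
        for prev, node, nxt in zip(path, path[1:], path[2:]):
            is_collider = prev in parents[node] and nxt in parents[node]
            if is_collider:
                # A collider blocks unless it or a descendant is conditioned on.
                if node not in z and not (descendants(node) & z):
                    return False
            elif node in z:
                # A conditioned non-collider blocks the path.
                return False
        return True

    def simple_paths(path):
        last = path[-1]
        if last == y:
            yield path
            return
        for nxt in parents[last] | children[last]:
            if nxt not in path:
                yield from simple_paths(path + [nxt])

    # Backdoor paths are those whose first edge points into x.
    for parent in parents[x]:
        for path in simple_paths([x, parent]):
            if path_open(path):
                return False
    return True


# Hypothetical graph 1: Z is a common cause of X and Y.
common_cause = {("Z", "X"), ("Z", "Y"), ("X", "Y")}
# Hypothetical graph 2: M is a pre-exposure collider on the path X <- U1 -> M <- U2 -> Y.
m_graph = {("U1", "X"), ("U1", "M"), ("U2", "M"), ("U2", "Y"), ("X", "Y")}

print(backdoor_blocked(common_cause, "X", "Y", set()))   # path X <- Z -> Y is open
print(backdoor_blocked(common_cause, "X", "Y", {"Z"}))   # adjusting for Z blocks it
print(backdoor_blocked(m_graph, "X", "Y", set()))        # blocked by the collider M
print(backdoor_blocked(m_graph, "X", "Y", {"M"}))        # adjusting for M opens it
```

The second graph illustrates the earlier point that a pre-exposure variable can introduce bias when controlled for: the empty set is sufficient there, and adding M reopens the path.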

Examples

Simple Example

A trucking company compares the fuel economy of trucks from two manufacturers (“A” and “B”) by measuring miles per gallon over one month. They find that A trucks appear more fuel-efficient. However, A trucks are more often assigned highway routes while B trucks are more often assigned city routes. Here, truck make is the independent variable, MPG is the dependent variable, and route type (or proportion of city driving) is the confounder. Because route type affects MPG and the route type differs across truck make, it confounds the comparison. Thus the observed difference likely reflects highway vs. city driving rather than truck make.
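
A small numeric sketch of this example follows. All trip figures are invented for illustration, and they are chosen so that the two makes are equally efficient on each route type; the only difference is how the routes are allocated.

```python
# Hypothetical trip totals: (make, route, miles, gallons). Numbers are invented.
trips = [
    ("A", "highway", 8000, 1000),
    ("A", "city",    1000,  200),
    ("B", "highway", 1600,  200),
    ("B", "city",    5000, 1000),
]

def mpg(rows):
    """Miles per gallon pooled over the given rows."""
    miles = sum(r[2] for r in rows)
    gallons = sum(r[3] for r in rows)
    return miles / gallons

# Crude comparison: ignore route type entirely.
crude = {m: mpg([r for r in trips if r[0] == m]) for m in ("A", "B")}

# Stratified comparison: compare makes within each route type.
by_route = {
    (m, route): mpg([r for r in trips if r[0] == m and r[1] == route])
    for m in ("A", "B")
    for route in ("highway", "city")
}

print(crude)
print(by_route)
```

With these made-up figures the crude comparison favors make A, while the within-route comparison shows no difference between the makes, which is the pattern the example describes.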

Relationship between birth order and Down syndrome

A scientist is studying the relationship between birth order (1st child, 2nd child, etc.) and the presence of Down syndrome in the child. However, it is known that:

Higher maternal age is directly associated with Down syndrome in the child.
This association holds regardless of birth order (a mother having her 1st vs 3rd child at age 50 confers the same risk).
Maternal age is directly associated with birth order (the 2nd child, except in the case of twins, is born when the mother is older than she was for the birth of the 1st child).
Maternal age is not a consequence of birth order (having a 2nd child does not change the mother's age).

In this scenario, maternal age is a confounding variable, as it influences both the independent variable (birth order), and the dependent variable (Down syndrome).

Relationship between smoking and lung disease

A scientist is studying the relationship between smoking status (smoker vs. non-smoker) and the presence of lung disease. However, it is known that:

Alcohol consumption and diet are directly associated with lung disease and overall health.
Alcohol consumption and diet affect health regardless of smoking status (a smoker and non-smoker with similar alcohol use and diet may have similar health risks).
Alcohol consumption and diet are associated with smoking status (smokers are, on average, more likely to drink alcohol or have less healthy diets than non-smokers).
Alcohol consumption and diet are not consequences of smoking itself (smoking does not necessarily cause higher alcohol intake or poor diet, even if they are correlated).

In this scenario, alcohol consumption or diet is a confounding variable, because it influences both the independent variable (smoking status) and the dependent variable (health outcome). If these factors are not controlled, the observed association between smoking and lung disease may be partly or entirely due to differences in alcohol use or diet rather than smoking itself.

Control

Consider a researcher attempting to assess the effectiveness of drug X, from population data in which drug usage was a patient's choice. The data shows that gender (Z) influences a patient's choice of drug as well as their chances of recovery (Y). In this scenario, gender Z confounds the relation between X and Y since Z is a cause of both X and Y:

[Figure: causal diagram showing gender (Z) as a common cause of drug use (X) and recovery (Y)]

We have that

$P(y \mid do(x)) \neq P(y \mid x)$ (2)

because the observational quantity contains information about the correlation between X and Z, and the interventional quantity does not (since X is not correlated with Z in a randomized experiment). It can be shown^[11] that, in cases where only observational data is available, an unbiased estimate of the desired quantity $P (y ∣ do (x))$ , can be obtained by "adjusting" for all confounding factors, namely, conditioning on their various values and averaging the result. In the case of a single confounder Z, this leads to the "adjustment formula":

$P(y \mid do(x)) = \sum_{z} P(y \mid x, z) P(z)$ (3)

which gives an unbiased estimate for the causal effect of X on Y. The same adjustment formula works when there are multiple confounders except that, in this case, the set Z of variables that would guarantee unbiased estimates must be chosen with caution. The criterion for a proper choice of variables is called the Back-Door criterion^[11]^[12] and requires that the chosen set Z "blocks" (or intercepts) every path between X and Y that contains an arrow into X. Such sets are called "Back-Door admissible" and may include variables which are not common causes of X and Y, but merely proxies thereof.
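
To see where the observational and interventional quantities part ways, it can help to expand the observational one by the law of total probability. This is a standard identity rather than a result specific to the cited sources:

$P(y \mid x) = \sum_{z} P(y \mid x, z) P(z \mid x)$

The adjustment formula has the same stratum-specific terms $P(y \mid x, z)$ but weights them by $P(z)$ instead of $P(z \mid x)$. The two expressions coincide when Z is independent of X, that is, when $P(z \mid x) = P(z)$, which is what randomization of X achieves.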

Returning to the drug use example, since Z complies with the Back-Door requirement (i.e., it intercepts the one Back-Door path $X \leftarrow Z \to Y$), the Back-Door adjustment formula is valid:

$\begin{aligned} P(Y = \text{recovered} \mid do(x = \text{give drug})) = {} & P(Y = \text{recovered} \mid X = \text{give drug}, Z = \text{male}) \, P(Z = \text{male}) \\ & + P(Y = \text{recovered} \mid X = \text{give drug}, Z = \text{female}) \, P(Z = \text{female}) \end{aligned}$ (4)

In this way the physician can predict the likely effect of administering the drug from observational studies in which the conditional probabilities appearing on the right-hand side of the equation can be estimated by regression.
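
As a rough numeric sketch of this calculation, the snippet below applies the adjustment formula to a made-up table of counts. The counts, the 0/1 coding of drug use and recovery, and the helper names are all assumptions for illustration; here the conditional probabilities are estimated by simple frequencies rather than by regression.

```python
# Hypothetical counts (invented): key is (gender z, drug x, recovered y).
counts = {
    ("male", 1, 1): 81,    ("male", 1, 0): 6,
    ("male", 0, 1): 234,   ("male", 0, 0): 36,
    ("female", 1, 1): 192, ("female", 1, 0): 71,
    ("female", 0, 1): 55,  ("female", 0, 0): 25,
}

def count(z=None, x=None, y=None):
    """Number of patients matching the given values (None means 'any')."""
    return sum(
        n for (zz, xx, yy), n in counts.items()
        if (z is None or zz == z)
        and (x is None or xx == x)
        and (y is None or yy == y)
    )

def naive(x):
    """Observational quantity P(recovered | X = x)."""
    return count(x=x, y=1) / count(x=x)

def adjusted(x):
    """Adjustment formula: sum over z of P(recovered | x, z) * P(z)."""
    total = count()
    return sum(
        count(z=z, x=x, y=1) / count(z=z, x=x) * count(z=z) / total
        for z in ("male", "female")
    )

print("naive:   ", naive(1), naive(0))
print("adjusted:", adjusted(1), adjusted(0))
```

In this invented table the naive comparison makes the drug look harmful, while the gender-adjusted comparison makes it look beneficial, because the weights $P(z \mid x)$ and $P(z)$ differ substantially.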

Contrary to common beliefs, adding covariates to the adjustment set Z can introduce bias.^[13] A typical counterexample occurs when Z is a common effect of X and Y,^[14] a case in which Z is not a confounder (i.e., the null set is Back-door admissible) and adjusting for Z would create bias known as "collider bias" or "Berkson's paradox." Controls that are not good confounders are sometimes called bad controls.
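
Collider bias can be illustrated with a short simulation. This is a sketch under assumed, invented parameters: X and Y are generated independently (so X has no effect on Y), Z is generated as their noisy sum, and "adjusting" is approximated by restricting the sample to units with Z above a threshold.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]                      # independent of x
z = [xi + yi + rng.gauss(0, 0.5) for xi, yi in zip(x, y)]    # common effect of x and y

# Correlation in the full sample (no adjustment for z).
r_all = pearson(x, y)

# Correlation after selecting on the collider (z above an arbitrary threshold).
kept = [(xi, yi) for xi, yi, zi in zip(x, y, z) if zi > 1.0]
r_selected = pearson([k[0] for k in kept], [k[1] for k in kept])

print(r_all, r_selected)
```

In the full sample the correlation is close to zero, while in the selected subsample X and Y become clearly negatively correlated even though neither causes the other.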

In general, confounding can be controlled by adjustment if and only if there is a set of observed covariates that satisfies the Back-Door condition. Moreover, if Z is such a set, then the adjustment formula of Eq. (3) is valid.^[11]^[12] Pearl's do-calculus provides all possible conditions under which $P (y ∣ do (x))$ can be estimated, not necessarily by adjustment.^[15]

History

According to Morabia (2011),^[16] the word confounding derives from the Medieval Latin verb "confundere", which meant "mixing", and was probably chosen to represent the confusion (from Latin: con=with + fusus=mix or fuse together) between the cause one wishes to assess and other causes that may affect the outcome and thus confuse, or stand in the way of the desired assessment. Greenland, Robins and Pearl^[17] note an early use of the term "confounding" in causal inference by John Stuart Mill in 1843.

Fisher introduced the word "confounding" in his 1935 book "The Design of Experiments"^[18] to refer specifically to a consequence of blocking (i.e., partitioning) the set of treatment combinations in a factorial experiment, whereby certain interactions may be "confounded with blocks". This popularized the notion of confounding in statistics, although Fisher was concerned with the control of heterogeneity in experimental units, not with causal inference.

According to Vandenbroucke (2004)^[19] it was Kish^[20] who used the word "confounding" in the sense of "incomparability" of two or more groups (e.g., exposed and unexposed) in an observational study. Formal conditions defining what makes certain groups "comparable" and others "incomparable" were later developed in epidemiology by Greenland and Robins (1986)^[21] using the counterfactual language of Neyman (1935)^[22] and Rubin (1974).^[23] These were later supplemented by graphical criteria such as the Back-Door condition (Pearl 1993; Greenland, Robins and Pearl 1999).^[17]^[11]

Graphical criteria were shown to be formally equivalent to the counterfactual definition^[24] but more transparent to researchers relying on process models.

Types

In the case of risk assessments evaluating the magnitude and nature of risk to human health, it is important to control for confounding to isolate the effect of a particular hazard such as a food additive, pesticide, or new drug. For prospective studies, it is difficult to recruit and screen for volunteers with the same background (age, diet, education, geography, etc.), and in historical studies, there can be similar variability. Due to the inability to control for variability of volunteers and human studies, confounding is a particular challenge. For these reasons, experiments offer a way to avoid most forms of confounding.

In some disciplines, confounding is categorized into different types. In epidemiology, one type is "confounding by indication",^[25] which relates to confounding from observational studies. Because prognostic factors may influence treatment decisions (and bias estimates of treatment effects), controlling for known prognostic factors may reduce this problem, but it is always possible that a forgotten or unknown factor was not included or that factors interact complexly. Confounding by indication has been described as the most important limitation of observational studies. Randomized trials are not affected by confounding by indication due to random assignment.

Confounding variables may also be categorised according to their source: the choice of measurement instrument (operational confound), situational characteristics (procedural confound), or inter-individual differences (person confound).

An operational confound can occur in both experimental and non-experimental research designs. This type of confound occurs when a measure designed to assess a particular construct inadvertently measures something else as well.^[26]
A procedural confound can occur in a laboratory experiment or a quasi-experiment. This type of confound occurs when the researcher mistakenly allows another variable to change along with the manipulated independent variable.^[26]
A person confound occurs when two or more groups of units are analyzed together (e.g., workers from different occupations), despite varying according to one or more other (observed or unobserved) characteristics (e.g., gender).^[27]

Decreasing the potential for confounding

A reduction in the potential for the occurrence and effect of confounding factors can be obtained by increasing the types and numbers of comparisons performed in an analysis. If measures or manipulations of core constructs are confounded (i.e. operational or procedural confounds exist), subgroup analysis may not reveal problems in the analysis. Additionally, increasing the number of comparisons can create other problems (see multiple comparisons).

Peer review is a process that can assist in reducing instances of confounding, either before study implementation or after analysis has occurred. Peer review relies on collective expertise within a discipline to identify potential weaknesses in study design and analysis, including ways in which results may depend on confounding. Similarly, replication can test for the robustness of findings from one study under alternative study conditions or alternative analyses (e.g., controlling for potential confounds not identified in the initial study).

Confounding effects may be less likely to occur and act similarly at multiple times and locations. In selecting study sites, the environment can be characterized in detail at the study sites to ensure sites are ecologically similar and therefore less likely to have confounding variables. Lastly, the relationship between the environmental variables that possibly confound the analysis and the measured parameters can be studied. The information pertaining to environmental variables can then be used in site-specific models to identify residual variance that may be due to real effects.^[28]

Depending on the type of study design in place, there are various ways to modify that design to actively exclude or control confounding variables:^[29]

Case-control studies assign confounders to both groups, cases and controls, equally. For example, if somebody wanted to study the cause of myocardial infarct and thinks that the age is a probable confounding variable, each 67-year-old infarct patient will be matched with a healthy 67-year-old "control" person. In case-control studies, matched variables most often are the age and sex. Drawback: Case-control studies are feasible only when it is easy to find controls, i.e. persons whose status vis-à-vis all known potential confounding factors is the same as that of the case's patient: Suppose a case-control study attempts to find the cause of a given disease in a person who is 1) 45 years old, 2) African-American, 3) from Alaska, 4) an avid football player, 5) vegetarian, and 6) working in education. A theoretically perfect control would be a person who, in addition to not having the disease being investigated, matches all these characteristics and has no diseases that the patient does not also have—but finding such a control would be an enormous task.
Cohort studies: A degree of matching is also possible, and it is often done by admitting only certain age groups or a certain sex into the study population, creating a cohort of people who share similar characteristics so that all cohorts are comparable with regard to the possible confounding variable. For example, if age and sex are thought to be confounders, only males aged 40 to 50 would be involved in a cohort study assessing the myocardial infarct risk in cohorts that are either physically active or inactive. Drawback: In cohort studies, the overexclusion of input data may lead researchers to define too narrowly the set of similarly situated persons for whom they claim the study to be useful, such that other persons to whom the causal relationship does in fact apply may lose the opportunity to benefit from the study's recommendations. Similarly, "over-stratification" of input data within a study may reduce the sample size in a given stratum to the point where generalizations drawn by observing the members of that stratum alone are not statistically significant.
Double blinding: conceals from the trial population and the observers the experiment group membership of the participants. By preventing the participants from knowing if they are receiving treatment or not, the placebo effect should be the same for the control and treatment groups. By preventing the observers from knowing of their membership, there should be no bias from researchers treating the groups differently or from interpreting the outcomes differently.
Randomized controlled trial: A method where the study population is divided randomly in order to mitigate the chances of self-selection by participants or bias by the study designers. Before the experiment begins, the testers will assign the members of the participant pool to their groups (control, intervention, parallel), using a randomization process such as the use of a random number generator. For example, in a study on the effects of exercise, the conclusions would be less valid if participants were given a choice if they wanted to belong to the control group which would not exercise or the intervention group which would be willing to take part in an exercise program. The study would then capture other variables besides exercise, such as pre-experiment health levels and motivation to adopt healthy activities. From the observer's side, the experimenter may choose candidates who are more likely to show the results the study wants to see or may interpret subjective results (more energetic, positive attitude) in a way favorable to their desires.
Stratification: As in the example above, physical activity is thought to be a behaviour that protects from myocardial infarct, and age is assumed to be a possible confounder. The data sampled is then stratified by age group, meaning that the association between activity and infarct is analyzed within each age group. If the risk ratios within the age groups (or age strata) differ markedly from the crude risk ratio computed without stratification, age is acting as a confounding variable; if the stratum-specific risk ratios also differ markedly from one another, age modifies the effect as well. There exist statistical tools, among them Mantel–Haenszel methods, that account for stratification of data sets.
Multivariable analysis: Confounding can be controlled by measuring the known confounders and including them as covariates in a multivariable analysis such as regression analysis. Such analyses reveal much less information about the strength or polarity of the confounding variable than do stratification methods. For example, if a multivariable analysis controls for antidepressant use but does not stratify antidepressants into TCAs and SSRIs, it will ignore that these two classes of antidepressant have opposite effects on myocardial infarction, and that one effect is much stronger than the other.
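
The stratification item above mentions Mantel–Haenszel methods. A minimal sketch of the Mantel–Haenszel pooled risk ratio is given below, using the activity and infarct example. The age strata and all counts are invented, and "exposed" is taken to mean physically active; the numbers are chosen so that the risk ratio is the same in both strata while older participants are both less active and at higher risk.

```python
# Hypothetical strata (invented counts). Each value is
# (exposed cases, exposed total, unexposed cases, unexposed total),
# where "exposed" = physically active and a "case" = myocardial infarct.
strata = {
    "40-49": (10, 1000, 4, 200),
    "60-69": (10, 200, 100, 1000),
}

def risk_ratio(a, n1, c, n0):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / n1) / (c / n0)

def crude_risk_ratio(strata):
    """Risk ratio after collapsing all strata into one table."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return risk_ratio(a, n1, c, n0)

def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio across strata."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
    return num / den

per_stratum = {age: risk_ratio(*s) for age, s in strata.items()}
crude = crude_risk_ratio(strata)
pooled = mantel_haenszel_rr(strata)

print(per_stratum, crude, pooled)
```

Here the crude risk ratio exaggerates the protective association because the active group is mostly younger, whereas the pooled estimate agrees with the stratum-specific ratios.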

All these methods have their drawbacks:

The best available defense against the possibility of spurious results due to confounding is often to dispense with efforts at stratification and instead conduct a randomized study of a sufficiently large sample taken as a whole, such that all potential confounding variables (known and unknown) will be distributed by chance across all study groups and hence will be uncorrelated with the binary variable for inclusion/exclusion in any group.
Ethical considerations: In double-blind and randomized controlled trials, participants are not aware that they are recipients of sham treatments and may be denied effective treatments.^[30] There is a possibility that patients only agree to invasive surgery (which carries real medical risks) under the understanding that they are receiving treatment. Although this is an ethical concern, it is not a complete account of the situation. For surgeries that are currently being performed regularly, but for which there is no concrete evidence of a genuine effect, there may be ethical issues in continuing such surgeries. In such circumstances, many people are exposed to the real risks of surgery, yet these treatments may offer no discernible benefit. Sham-surgery control is a method that may allow medical science to determine whether a surgical procedure is efficacious or not. Given that there are known risks associated with medical operations, it is questionably ethical to allow unverified surgeries to be conducted ad infinitum into the future.

Criticism

Concerns have been raised that confounding in medical research can produce false null results due to decreasing exposure reliability and increasing sibling-correlations.^[31]^[32]

Artifacts

Artifacts are variables that should have been systematically varied, either within or across studies, but that were accidentally held constant. Artifacts are thus threats to external validity. Artifacts are factors that covary with the treatment and the outcome. Campbell and Stanley^[33] identify several artifacts. The major threats to internal validity are history, maturation, testing, instrumentation, statistical regression, selection, experimental mortality, and selection-history interactions.

One way to minimize the influence of artifacts is to use a pretest-posttest control group design. Within this design, "groups of people who are initially equivalent (at the pretest phase) are randomly assigned to receive the experimental treatment or a control condition and then assessed again after this differential experience (posttest phase)".^[34] Thus, any effects of artifacts are (ideally) equally distributed in participants in both the treatment and control conditions.

References

[1] Rothman, Kenneth J.; Lash, Timothy L.; VanderWeele, Tyler J.; Haneuse, Sebastien (2021). Modern epidemiology (Fourth ed.). Philadelphia: Wolters Kluwer / Lippincott Williams & Wilkins. ISBN 978-1-4511-9328-2.
[2] Miettinen, Olli (1995-06-15). "Confounding and Effect-Modification" (in en). American Journal of Epidemiology 141 (12): 1113–1116. doi:10.1093/oxfordjournals.aje.a117384. ISSN 1476-6256. PMID 7771449. https://academic.oup.com/aje/article/148335/CONFOUNDING.
[3] Weinberg, Clarice R. (1993-01-01). "Toward a Clearer Definition of Confounding" (in en). American Journal of Epidemiology 137 (1): 1–8. doi:10.1093/oxfordjournals.aje.a116591. ISSN 1476-6256. PMID 8434568. https://academic.oup.com/aje/article/303163/Toward.
[4] Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). New York: Cambridge University Press. Chapter: "Simpson's Paradox, Confounding, and Collapsibility".
[5] Shadish, W. R.; Cook, T. D.; Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton-Mifflin.
[6] Pearl, Judea (2022). Causality: models, reasoning, and inference (Second edition, reprinted with corrections ed.). Cambridge New York, NY Port Melbourne New Delhi Singapore: Cambridge University Press. ISBN 978-0-521-89560-6.
[7] Greenland, S.; Pearl, J.; Robins, J. M. (January 1999). "Causal diagrams for epidemiologic research". Epidemiology (Cambridge, Mass.) 10 (1): 37–48. doi:10.1097/00001648-199901000-00008. ISSN 1044-3983. PMID 9888278.
[8] VanderWeele, Tyler J.; Shpitser, Ilya (2013-02-01). "On the definition of a confounder". The Annals of Statistics 41 (1): 196–220. doi:10.1214/12-AOS1058. ISSN 0090-5364. PMID 25544784.
[9] Rothman, Kenneth J.; Lash, Timothy L.; VanderWeele, Tyler J.; Haneuse, Sebastien (2021). Modern epidemiology (Fourth ed.). Philadelphia: Wolters Kluwer / Lippincott Williams & Wilkins. ISBN 978-1-4511-9328-2.
[10] Pearl, Judea (December 1995). "Causal Diagrams for Empirical Research". Biometrika 82 (4): 669–688. doi:10.2307/2337329. https://www.jstor.org/stable/2337329.
[11] Pearl, J. (1993). "Aspects of Graphical Models Connected With Causality", In Proceedings of the 49th Session of the International Statistical Science Institute, pp. 391–401.
[12] Pearl, J. (2009). Causal Diagrams and the Identification of Causal Effects In Causality: Models, Reasoning and Inference (2nd ed.). New York, NY, US: Cambridge University Press.
[13] Cinelli, C.; Forney, A.; Pearl, J. (March 2022). "A Crash Course in Good and Bad Controls". http://ftp.cs.ucla.edu/pub/stat_ser/r493.pdf.
[14] Lee, P. H. (2014). "Should We Adjust for a Confounder if Empirical and Theoretical Criteria Yield Contradictory Results? A Simulation Study". Sci Rep 4. doi:10.1038/srep06085. PMID 25124526. Bibcode: 2014NatSR...4.6085L.
[15] Shpitser, I.; Pearl, J. (2008). "Complete identification methods for the causal hierarchy". The Journal of Machine Learning Research 9: 1941–1979.
[16] Morabia, A (2011). "History of the modern epidemiological concept of confounding". Journal of Epidemiology and Community Health 65 (4): 297–300. doi:10.1136/jech.2010.112565. PMID 20696848. http://jech.bmj.com/content/jech/65/4/297.full.pdf.
[17] Greenland, S.; Robins, J. M.; Pearl, J. (1999). "Confounding and Collapsibility in Causal Inference". Statistical Science 14 (1): 31. doi:10.1214/ss/1009211805. Bibcode: 1999StaSc..1411805G.
[18] Fisher, R. A. (1935). The design of experiments (pp. 114–145).
[19] Vandenbroucke, J. P. (2004). "The history of confounding". Soz Praventivmed 47 (4): 216–224. doi:10.1007/BF01326402. PMID 12415925.
[20] Kish, L (1959). "Some statistical problems in research design". Am Sociol 26 (3): 328–338. doi:10.2307/2089381.
[21] Greenland, S.; Robins, J. M. (1986). "Identifiability, exchangeability, and epidemiological confounding". International Journal of Epidemiology 15 (3): 413–419. doi:10.1093/ije/15.3.413. PMID 3771081.
[22] Neyman, J., with cooperation of K. Iwaskiewics and St. Kolodziejczyk (1935). Statistical problems in agricultural experimentation (with discussion). Suppl J Roy Statist Soc Ser B 2 107-180.
[23] Rubin, D. B. (1974). "Estimating causal effects of treatments in randomized and nonrandomized studies". Journal of Educational Psychology 66 (5): 688–701. doi:10.1037/h0037350.
[24] Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). New York, NY, US: Cambridge University Press.
[25] Johnston, S. C. (2001). "Identifying Confounding by Indication through Blinded Prospective Review". American Journal of Epidemiology 154 (3): 276–284. doi:10.1093/aje/154.3.276. PMID 11479193.
[26] Pelham, Brett (2006). Conducting Research in Psychology. Belmont: Wadsworth. ISBN 978-0-534-53294-9.
[27] Steg, L.; Buunk, A. P.; Rothengatter, T. (2008). "Chapter 4". Applied Social Psychology: Understanding and managing social problems. Cambridge, UK: Cambridge University Press.
[28] Calow, Peter P. (2009). Handbook of Environmental Risk Assessment and Management. Wiley.
[29] Mayrent, Sherry L. (1987). Epidemiology in Medicine. Lippincott Williams & Wilkins. ISBN 978-0-316-35636-7. https://archive.org/details/epidemiologyinme00henn.
[30] Emanuel, Ezekiel J; Miller, Franklin G (Sep 20, 2001). "The Ethics of Placebo-Controlled Trials—A Middle Ground". New England Journal of Medicine 345 (12): 915–9. doi:10.1056/nejm200109203451211. PMID 11565527.
[31] Gustavson, K.; Torvik, F. A.; Davey Smith, G.; Røysamb, E.; Eilertsen, E. M. (2024). "Familial confounding or measurement error? How to interpret findings from sibling and co-twin control studies". European Journal of Epidemiology 39 (6): 587–603. doi:10.1007/s10654-024-01132-6. PMID 38879863.
[32] Madrid-Valero, J. J.; Verhulst, B.; López-López, J. A.; Ordoñana, J. R. (2024). "Calculating Within-Pair Difference Scores in the Co-twin Control Design. Effects of Alternative Strategies". Behavior Genetics 54 (5): 426–435. doi:10.1007/s10519-024-10196-9. PMID 39177736.
[33] Campbell, D. T.; Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
[34] Crano, W. D.; Brewer, M. B. (2002). Principles and methods of social research (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates. p. 28.

Original source: https://en.wikipedia.org/wiki/Confounding.

[:0-1] 1.0 ^1.1 Rothman, Kenneth J.; Lash, Timothy L.; VanderWeele, Tyler J.; Haneuse, Sebastien (2021). Modern epidemiology (Fourth ed.). Philadelphia: Wolters Kluwer / Lippincott Williams & Wilkins. ISBN 978-1-4511-9328-2.

[:1-2] 2.0 ^2.1 Miettinen, Olli (1995-06-15). "Confounding and Effect-Modification" (in en). American Journal of Epidemiology 141 (12): 1113–1116. doi:10.1093/oxfordjournals.aje.a117384. ISSN 1476-6256. PMID 7771449. https://academic.oup.com/aje/article/148335/CONFOUNDING.

[:2-3] 3.0 ^3.1 Weinberg, Clarice R. (1993-01-01). "Toward a Clearer Definition of Confounding" (in en). American Journal of Epidemiology 137 (1): 1–8. doi:10.1093/oxfordjournals.aje.a116591. ISSN 1476-6256. PMID 8434568. https://academic.oup.com/aje/article/303163/Toward.

[Pearl_2009-4] Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). New York: Cambridge University Press. Chapter: "Simpson's Paradox, Confounding, and Collapsibility".

[Shadish2002-5] Shadish, W. R.; Cook, T. D.; Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton-Mifflin.

[6] Pearl, Judea (2022). Causality: models, reasoning, and inference (Second edition, reprinted with corrections ed.). Cambridge New York, NY Port Melbourne New Delhi Singapore: Cambridge University Press. ISBN 978-0-521-89560-6.

[7] Greenland, S.; Pearl, J.; Robins, J. M. (January 1999). "Causal diagrams for epidemiologic research". Epidemiology (Cambridge, Mass.) 10 (1): 37–48. doi:10.1097/00001648-199901000-00008. ISSN 1044-3983. PMID 9888278.

[8] VanderWeele, Tyler J.; Shpitser, Ilya (2013-02-01). "On the definition of a confounder". The Annals of Statistics 41 (1): 196–220. doi:10.1214/12-AOS1058. ISSN 0090-5364. PMID 25544784.

[9] Rothman, Kenneth J.; Lash, Timothy L.; VanderWeele, Tyler J.; Haneuse, Sebastien (2021). Modern epidemiology (Fourth ed.). Philadelphia: Wolters Kluwer / Lippincott Williams & Wilkins. ISBN 978-1-4511-9328-2.

[10] Pearl, Judea (December 1995). "Causal Diagrams for Empirical Research". Biometrika 82 (4): 669–688. doi:10.2307/2337329. https://www.jstor.org/stable/2337329.

[Pearl_1993-11] 11.0 ^11.1 ^11.2 ^11.3 Pearl, J., (1993). "Aspects of Graphical Models Connected With Causality", In Proceedings of the 49th Session of the International Statistical Science Institute, pp. 391–401.

[pearl09-causal-diagrams-12] 12.0 ^12.1 Pearl, J. (2009). Causal Diagrams and the Identification of Causal Effects In Causality: Models, Reasoning and Inference (2nd ed.). New York, NY, US: Cambridge University Press.

[13] Cinelli, C.; Forney, A.; Pearl, J. (March 2022). "A Crash Course in Good and Bad Controls". http://ftp.cs.ucla.edu/pub/stat_ser/r493.pdf.

[14] Lee, P. H. (2014). "Should We Adjust for a Confounder if Empirical and Theoretical Criteria Yield Contradictory Results? A Simulation Study". Sci Rep 4. doi:10.1038/srep06085. PMID 25124526. Bibcode: 2014NatSR...4.6085L.

[15] Shpitser, I.; Pearl, J. (2008). "Complete identification methods for the causal hierarchy". The Journal of Machine Learning Research 9: 1941–1979.

[16] Morabia, A (2011). "History of the modern epidemiological concept of confounding". Journal of Epidemiology and Community Health 65 (4): 297–300. doi:10.1136/jech.2010.112565. PMID 20696848. http://jech.bmj.com/content/jech/65/4/297.full.pdf.

[Greenland_Robins_Pearl_1999-17] 17.0 ^17.1 Greenland, S.; Robins, J. M.; Pearl, J. (1999). "Confounding and Collapsibility in Causal Inference". Statistical Science 14 (1): 31. doi:10.1214/ss/1009211805. Bibcode: 1999StaSc..1411805G.

[18] Fisher, R. A. (1935). The design of experiments (pp. 114–145).

[19] Vandenbroucke, J. P. (2002). "The history of confounding". Soz Praventivmed 47 (4): 216–224. doi:10.1007/BF01326402. PMID 12415925.

[20] Kish, L. (1959). "Some statistical problems in research design". American Sociological Review 24 (3): 328–338. doi:10.2307/2089381.

[21] Greenland, S.; Robins, J. M. (1986). "Identifiability, exchangeability, and epidemiological confounding". International Journal of Epidemiology 15 (3): 413–419. doi:10.1093/ije/15.3.413. PMID 3771081.

[22] Neyman, J.; with cooperation of K. Iwaszkiewicz and St. Kolodziejczyk (1935). "Statistical problems in agricultural experimentation (with discussion)". Suppl J Roy Statist Soc Ser B 2: 107–180.

[23] Rubin, D. B. (1974). "Estimating causal effects of treatments in randomized and nonrandomized studies". Journal of Educational Psychology 66 (5): 688–701. doi:10.1037/h0037350.

[24] Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). New York, NY, US: Cambridge University Press.

[25] Johnston, S. C. (2001). "Identifying Confounding by Indication through Blinded Prospective Review". American Journal of Epidemiology 154 (3): 276–284. doi:10.1093/aje/154.3.276. PMID 11479193.

[26] Pelham, Brett (2006). Conducting Research in Psychology. Belmont: Wadsworth. ISBN 978-0-534-53294-9.

[27] Steg, L.; Buunk, A. P.; Rothengatter, T. (2008). "Chapter 4". Applied Social Psychology: Understanding and managing social problems. Cambridge, UK: Cambridge University Press.

[28] Calow, Peter P. (2009). Handbook of Environmental Risk Assessment and Management. Wiley.

[29] Mayrent, Sherry L. (1987). Epidemiology in Medicine. Lippincott Williams & Wilkins. ISBN 978-0-316-35636-7. https://archive.org/details/epidemiologyinme00henn.

[30] Emanuel, Ezekiel J; Miller, Franklin G (Sep 20, 2001). "The Ethics of Placebo-Controlled Trials—A Middle Ground". New England Journal of Medicine 345 (12): 915–9. doi:10.1056/nejm200109203451211. PMID 11565527.

[31] Gustavson, K.; Torvik, F. A.; Davey Smith, G.; Røysamb, E.; Eilertsen, E. M. (2024). "Familial confounding or measurement error? How to interpret findings from sibling and co-twin control studies". European Journal of Epidemiology 39 (6): 587–603. doi:10.1007/s10654-024-01132-6. PMID 38879863.

[32] Madrid-Valero, J. J.; Verhulst, B.; López-López, J. A.; Ordoñana, J. R. (2024). "Calculating Within-Pair Difference Scores in the Co-twin Control Design. Effects of Alternative Strategies". Behavior Genetics 54 (5): 426–435. doi:10.1007/s10519-024-10196-9. PMID 39177736.

[33] Campbell, D. T.; Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.

[34] Crano, W. D.; Brewer, M. B. (2002). Principles and methods of social research (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates. p. 28.
