Speaking Notes
PADM 5324
November 10, 2009
Dr. Neubauer
In the News item:
http://www.cnn.com/2009/HEALTH/11/09/anti.nicotine.vaccine/index.html
By Val Willingham, CNN Medical Producer
November 9, 2009 9:15 a.m. EST
CHAPTER 15 -- More on Causal Inferences: Bias, Confounding, and Interaction
Chapter 14 was about the meaning of causation in epidemiology and the social sciences generally.
Both domains are VERY COMPLEX for similar reasons. The UNIT OF ANALYSIS is in either case very complex and the ENVIRONMENT is also very complex.
The INTERACTION between two complex "things" is very complex.
The human brain/mind is not yet understood, although we try to program (educate) and reprogram (counsel) it.
BIAS means a tendency in one direction or another. A biased argument is a "slanted" argument. A car that wants to steer to the right or to the left can be said to be "biased."
In research, bias can be the product of WHO was asked to participate and of HOW the data they produce is INTERPRETED.
SELECTION BIAS
Non-response almost always introduces some bias into the findings. People who cooperate are almost always "different" from people who refuse to participate IN SOME WAY THAT IS RELEVANT TO WHAT YOU ARE TRYING TO STUDY.
Very few studies have attempted to report the attitudes and characteristics of people who refuse to participate.
In Epidemiology, the bias may be relevant to EXPOSURES or to DISEASES, or both.
If the intent is to INFER from a sample population to an actual population, a low RESPONSE RATE opens up the likelihood that those who participate are not an ideal sample of the population.
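A small sketch of how differential non-response can distort an estimate. Every number here is made up for illustration (the population size, the true smoking rate, and both response rates are assumptions, not data from the textbook).

```python
# Hypothetical population; all figures are invented for illustration.
smokers = 300                     # assumed true number of smokers
nonsmokers = 700                  # assumed true number of non-smokers
population = smokers + nonsmokers
true_smoking_rate = smokers / population

# Assumption: smokers are less willing to take part than non-smokers.
response_rate_smokers = 0.40
response_rate_nonsmokers = 0.70

responding_smokers = smokers * response_rate_smokers
responding_nonsmokers = nonsmokers * response_rate_nonsmokers
respondents = responding_smokers + responding_nonsmokers

# What the survey would "see" if it only counts people who responded.
observed_smoking_rate = responding_smokers / respondents
overall_response_rate = respondents / population

print(f"Overall response rate: {overall_response_rate:.0%}")
print(f"True smoking rate:     {true_smoking_rate:.0%}")
print(f"Observed smoking rate: {observed_smoking_rate:.1%}")
```

Under these assumptions the sample understates smoking because the people who refused differ from those who cooperated in a way that is relevant to the question being studied.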
INFORMATION BIAS
Mistaken information is only a source of bias if it SYSTEMATICALLY skews the findings in one way or another.
If some participants are placed into the wrong group, any apparent relationship between exposures and disease will be diluted.
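The dilution described above can be sketched with a 2x2 case-control table. The counts and the 20% error rate are hypothetical, and the sketch assumes the same error rate applies to cases and controls alike (misclassification that does not depend on disease status).

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def misclassify(exposed, unexposed, error_rate):
    """Put the same fraction of each exposure group into the wrong group."""
    wrongly_unexposed = exposed * error_rate
    wrongly_exposed = unexposed * error_rate
    return (exposed - wrongly_unexposed + wrongly_exposed,
            unexposed - wrongly_exposed + wrongly_unexposed)


# Hypothetical "true" counts: (exposed, unexposed)
true_cases = (60, 40)
true_controls = (30, 70)
true_or = odds_ratio(*true_cases, *true_controls)

# Assumption: 20% of every group is recorded in the wrong exposure category.
seen_cases = misclassify(*true_cases, error_rate=0.20)
seen_controls = misclassify(*true_controls, error_rate=0.20)
observed_or = odds_ratio(*seen_cases, *seen_controls)

print(f"True odds ratio:     {true_or:.2f}")
print(f"Observed odds ratio: {observed_or:.2f}")
```

With these made-up numbers the observed odds ratio falls between 1 and the true value: the association is still there, but it looks weaker than it really is.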
If a survey is used, the way the questions are worded (or asked by interviewers) may create a bias.
When people get some dreadful disease they often want to believe that they in no way contributed to getting it. This may affect how they answer questions, or even how they remember the past.
If the subject has died, relatives may be interviewed. They are likely to remember his or her exposures differently, or just flat out not know.
There may be SURVEILLANCE BIAS in that physicians are likely to monitor their diseased patients more closely than the other patients who become controls in a study.
REPORTING BIAS -- people's reluctance to report exposures that are illegal or otherwise not accepted in society. In very conservative communities women are likely not to report having had an elective abortion.
CONFOUNDING
http://en.wikipedia.org/wiki/Confounding
In statistics, a confounding variable (also confounding factor, lurking variable, a confound, or confounder) is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable. The methodologies of scientific studies therefore need to control for these factors to avoid a type 1 error; an erroneous 'false positive' conclusion that the dependent variables are in a causal relationship with the independent variable. Such a relation between two observed variables is termed a spurious relationship. Thus, confounding is a major threat to the validity of inferences made about cause and effect, i.e. internal validity, as the observed effects should be attributed to the confounder rather than the independent variable.
In the social sciences, the two independent variables Education and Income are likely to be positively correlated and to both be related to the dependent variable. SO, HOW DO YOU TEST TO SEE IF INCOME (ALONE) IS RELATED TO THE DEPENDENT VARIABLE?
You CONTROL FOR EDUCATION BY DIVIDING THE SAMPLE into two or more groups based on education.
ENTIRE SAMPLE (n=800)
Both education and income are positively correlated with life satisfaction. But is one of them "in the driver's seat" and the other one "along for the ride"? If so, which is the one that really matters and which one is the "confounder"?
So, let us control for education and look to see if there is a relationship between income and life satisfaction.
TWO YEARS OF COLLEGE OR LESS (n=500): There is a strong positive association between income and life satisfaction.
MORE THAN TWO YEARS OF COLLEGE (n=300): There is a strong positive association between income and life satisfaction.
What we see above is that after controlling for years of education, THE RELATIONSHIP BETWEEN INCOME AND LIFE SATISFACTION is apparent in both (or all) the subdivisions of the control variable. It looks like what really matters in determining life satisfaction is income.
Now, let's control for income and test using education as the independent variable.
ANNUAL INCOME LESS THAN $40,000 A YEAR: There is only a weak association between years of education and life satisfaction among members of this group.
ANNUAL INCOME $40,000 A YEAR OR MORE: There is only a weak association between years of education and life satisfaction among members of this group.
We see that after controlling for income, the relationship between education and life satisfaction is weak. BOTTOM LINE (according to this made-up data): INCOME IS IN THE DRIVER'S SEAT and EDUCATION is the confounding variable, which is "just along for the ride."
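The stratified comparison can be sketched in a few lines. The cell counts below are invented to match the made-up example (800 people, 500 with two years of college or less and 300 with more); the "satisfied" yes/no outcome and the names `cells` and `satisfied_rate` are assumptions for illustration only.

```python
# (education, income) -> (number of people, number who report being satisfied)
# All counts are hypothetical.
cells = {
    ("low_edu", "low_inc"): (350, 105),
    ("low_edu", "high_inc"): (150, 105),
    ("high_edu", "low_inc"): (90, 30),
    ("high_edu", "high_inc"): (210, 153),
}


def satisfied_rate(education=None, income=None):
    """Share satisfied among the cells matching the given education and/or income."""
    people = satisfied = 0
    for (edu, inc), (n, s) in cells.items():
        if education in (None, edu) and income in (None, inc):
            people += n
            satisfied += s
    return satisfied / people


# Crude comparisons using the entire sample.
crude_edu_gap = satisfied_rate(education="high_edu") - satisfied_rate(education="low_edu")
crude_inc_gap = satisfied_rate(income="high_inc") - satisfied_rate(income="low_inc")

# Control for income, then compare education groups inside each income group.
edu_gap_within_income = {
    inc: satisfied_rate("high_edu", inc) - satisfied_rate("low_edu", inc)
    for inc in ("low_inc", "high_inc")
}

# Control for education, then compare income groups inside each education group.
inc_gap_within_education = {
    edu: satisfied_rate(edu, "high_inc") - satisfied_rate(edu, "low_inc")
    for edu in ("low_edu", "high_edu")
}

print(f"Crude education gap: {crude_edu_gap:.1%}")
print(f"Crude income gap:    {crude_inc_gap:.1%}")
print("Education gap within income groups:", edu_gap_within_income)
print("Income gap within education groups:", inc_gap_within_education)
```

In this invented table the education gap is large in the entire sample but nearly vanishes inside each income group, while the income gap stays large inside each education group. That is the pattern the notes describe: income in the driver's seat, education along for the ride.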
The example used in our textbook is that smoking and coffee drinking tend to be fairly highly correlated and both are related to instances of pancreatic cancer. But upon examination, it is the smoking that is in the driver's seat and coffee drinking that is "just along for the ride." I think we are assuming here that there is no INTERACTION between the CAUSAL VARIABLE (SMOKING) and the CONFOUNDING VARIABLE (coffee drinking).
THERE IS ANOTHER POSSIBILITY CALLED INTERACTION (page 256)
If the incidence of pancreatic cancer were higher among those who both smoke and drink coffee than among those who only smoke, then coffee drinking would be more than just a confounding variable "along for the ride." The book explains this using a series of tables regarding ATTRIBUTABLE RISKS.
I think the point is that if A and B both cause Y and if A and B TOGETHER are more likely than A or B alone to cause Y, then the relationship between A and B is INTERACTION.
I think you could also have a CATALYTIC factor (A), that does not cause Y alone but INCREASES THE POTENTIAL OF "B" to cause Y. I don't know of an example of this. The notion of catalyst is common in chemistry.
The only example of INTERACTION that I am aware of is exposure to asbestos and smoking as related to lung cancer. Either one of them can cause lung cancer. Together their effect is greater than the sum of the two separate risks. My guess is that the tobacco smoke tends to cause more of the asbestos dust to be drawn into the lungs.
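One way to check for "greater than the sum" is to compare the observed joint incidence with what we would expect if the attributable risks simply added up. This is only a sketch of that additive comparison: the incidence figures are hypothetical (they are not the asbestos and smoking data) and the function name is made up.

```python
def expected_if_additive(background, a_only, b_only):
    """Joint incidence expected if the two attributable risks simply add."""
    attributable_to_a = a_only - background
    attributable_to_b = b_only - background
    return background + attributable_to_a + attributable_to_b


# Hypothetical incidence per 100,000 people per year.
background = 10   # exposed to neither A nor B
a_only = 40       # exposed to A only
b_only = 70       # exposed to B only
both = 250        # exposed to both A and B

expected = expected_if_additive(background, a_only, b_only)
excess = both - expected

print(f"Expected with no interaction: {expected}")
print(f"Observed with both exposures: {both}")
print(f"Excess beyond the sum:        {excess}")
```

If the observed figure for people with both exposures were close to the expected figure, there would be no sign of interaction on this additive scale. Here the made-up joint incidence is well above it, which is the pattern described for asbestos and smoking.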
SO, WHAT IS THE SIGNIFICANCE OF ALL THIS?