Wikipedia definition:
In probability and statistics, Simpson's paradox (or the Yule-Simpson effect) is an apparent paradox in which a correlation (trend) present in different groups is reversed when the groups are combined. This result is often encountered in social-science and medical-science statistics, and it occurs when frequency data are hastily given causal interpretations. Simpson's Paradox disappears when any causal relations are derived systematically – i.e. through formal analysis.
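Examples make it easy to see.
1) David Justice had a higher batting average than Derek Jeter in both 1995 and 1996, yet his average over the two years combined is lower.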
| Player | 1995 | 1996 | Combined |
|---|---|---|---|
| Derek Jeter | 12/48 (.250) | 183/582 (.314) | 195/630 (.310) |
| David Justice | 104/411 (.253) | 45/140 (.321) | 149/551 (.270) |
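The reversal in the table can be checked directly. The following is a minimal Python sketch, not part of the original post; the names `records`, `average`, and `combined` are chosen here for illustration only. It uses exact fractions so that the comparisons do not depend on rounding.

```python
from fractions import Fraction

# (hits, at-bats) per season, copied from the table above
records = {
    "Derek Jeter": {"1995": (12, 48), "1996": (183, 582)},
    "David Justice": {"1995": (104, 411), "1996": (45, 140)},
}

def average(hits, at_bats):
    """Batting average for one season as an exact fraction."""
    return Fraction(hits, at_bats)

def combined(seasons):
    """Pool hits and at-bats over all seasons, then divide once."""
    hits = sum(h for h, _ in seasons.values())
    at_bats = sum(ab for _, ab in seasons.values())
    return Fraction(hits, at_bats)

for year in ("1995", "1996"):
    jeter = average(*records["Derek Jeter"][year])
    justice = average(*records["David Justice"][year])
    print(f"{year}: Jeter {float(jeter):.3f}, Justice {float(justice):.3f}")

jeter_all = combined(records["Derek Jeter"])
justice_all = combined(records["David Justice"])
print(f"combined: Jeter {float(jeter_all):.3f}, Justice {float(justice_all):.3f}")
```

The combined average is not the mean of the two yearly averages: each year is weighted by its at-bats. Most of Jeter's at-bats fall in 1996, when both players hit better, and most of Justice's fall in 1995, when both hit worse.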
2) A comparison of two treatments for kidney stones.
| | Treatment A | Treatment B |
|---|---|---|
| Success rate | 78% (273/350) | 83% (289/350) |
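Looking only at the overall success rate, Treatment B appears to be better.
But when the patients are split by stone size, the result reverses: Treatment A has the higher success rate for both small and large stones.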
| Stone size | Treatment A | Treatment B |
|---|---|---|
| Small Stones | 93% (81/87) | 87% (234/270) |
| Large Stones | 73% (192/263) | 69% (55/80) |
| Both | 78% (273/350) | 83% (289/350) |
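One way to see where the reversal comes from is to write each overall rate as a weighted average of the per-size rates, with weights equal to the share of patients in each size group. The sketch below is illustrative Python, not from the original post; `data` and `overall_as_weighted_average` are names assumed here.

```python
# (successes, patients) per treatment and stone size, copied from the table above
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def overall_as_weighted_average(strata):
    """Return (overall rate, {stratum: (weight, rate)}).

    weight = share of this treatment's patients in the stratum
    rate   = success rate inside the stratum
    """
    total = sum(n for _, n in strata.values())
    parts = {name: (n / total, s / n) for name, (s, n) in strata.items()}
    overall = sum(w * r for w, r in parts.values())
    return overall, parts

for treatment, strata in data.items():
    overall, parts = overall_as_weighted_average(strata)
    detail = ", ".join(
        f"{name}: weight {w:.2f} x rate {r:.2f}" for name, (w, r) in parts.items()
    )
    print(f"Treatment {treatment}: overall {overall:.2f} ({detail})")
```

Treatment A was used mostly on large stones (263 of 350 patients), which have lower success rates under both treatments, while Treatment B was used mostly on small stones (270 of 350). The two overall rates therefore mix the same kind of per-size rates with very different weights, which is enough to flip the comparison.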
At best, Simpson's Paradox is used to argue that association is not causation.
At worst, Simpson's Paradox is used to argue that induction is impossible in observational studies.
References.
http://en.wikipedia.org/wiki/Simpson's_paradox
web.augsburg.edu/~schield/MiloPapers/99ASA.pdf
It's been a while since I last used Tistory..
Writing posts here is really inconvenient.. ;;