Aim: To calculate the sample size needed to estimate the odds ratio (OR) with a specified precision in a case-control study.

Formula Used

To calculate P1, P2 or OR when the other two are known:

P1 = OR * P2 / (1 - P2 + OR * P2)

P2 = P1 / (OR - OR * P1 + P1)

OR = P1 * (1 - P2) / (P2 * (1 - P1))
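
The three conversions above can be sketched in Python as follows. The function names are illustrative choices, not part of the original calculator.

```python
def p1_from_or(odds_ratio, p2):
    """Proportion of cases exposed, given the OR and the exposure rate in controls."""
    return odds_ratio * p2 / (1 - p2 + odds_ratio * p2)


def p2_from_or(odds_ratio, p1):
    """Proportion of controls exposed, given the OR and the exposure rate in cases."""
    return p1 / (odds_ratio - odds_ratio * p1 + p1)


def or_from_proportions(p1, p2):
    """Odds ratio from the exposure rates in cases (p1) and controls (p2)."""
    return p1 * (1 - p2) / (p2 * (1 - p1))


# Any two of the three quantities determine the third.
print(or_from_proportions(0.20, 0.05))
print(p1_from_or(4.75, 0.05))
print(p2_from_or(4.75, 0.20))
```

Proportions are passed as fractions of 1 (for example 0.20, not 20).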

OR = Anticipated Odds Ratio

P1 = Estimated Proportion of Cases exposed to risk factor (out of 1) (e.g. 25% = 0.25)

P2 = Estimated Proportion of Controls exposed to risk factor (out of 1) (e.g. 15% = 0.15)

ε = Precision / Allowable error (Out of 1) (e.g. 25% = 0.25)

r = Ratio of controls to cases

Z(1-α/2) = The standard normal deviate corresponding to the confidence level (e.g. 1.96 for 95%)
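
The sample size formula itself is not reproduced in the text above. As a minimal sketch, the code below assumes a commonly used form for estimating an OR to within a relative precision ε: n = Z² * [1/(P1 * (1 - P1)) + 1/(r * P2 * (1 - P2))] / [ln(1 - ε)]². Both this form and the way r enters it are assumptions, although the form does reproduce the figure given in the example that follows. The function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_or(p1, p2, eps, conf=0.95, r=1.0):
    """Number of cases needed (unrounded), under the ASSUMED formula
    n = Z^2 * [1/(P1(1-P1)) + 1/(r*P2(1-P2))] / ln(1-eps)^2.
    The number of controls would then be r times this value.
    """
    # Standard normal deviate for a two-sided confidence level.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    variance_term = 1 / (p1 * (1 - p1)) + 1 / (r * p2 * (1 - p2))
    return z ** 2 * variance_term / math.log(1 - eps) ** 2


n_cases = sample_size_for_or(p1=0.20, p2=0.05, eps=0.25, conf=0.95, r=1.0)
print(n_cases)
```

With P1 = 0.20, P2 = 0.05, ε = 0.25, a 95% confidence level and r = 1, this gives approximately 1267 cases.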


Example:

A case-control study is planned to estimate the OR for the association between low attendance (risk factor) and failure in the final examination (outcome). A pilot study found that the proportion of students with low attendance was 20% among students who failed and 5% among students who passed. What sample size is required to estimate the odds ratio to within 25% of its true value at a 95% confidence level, with a controls-to-cases ratio of 1?

Solution:

Here

P1 = 0.20, P2 = 0.05, confidence level = 95% (Z = 1.96), ε = 0.25, r = 1

Substituting these values gives a required sample size of 1267 in each group.

The estimated OR is 0.20 * (1 - 0.05) / (0.05 * (1 - 0.20)) = 4.75.


What happens with this sample size if the observed exposure rates and OR match the anticipated values?

If we take 1267 students in each group and the exposure rates are 20% among failures (cases) and 5% among passes (controls), we expect the results below. (Real data cannot contain fractions, but the counts are left unrounded here to preserve the exact exposure rates.)

                                  | Failures (Cases) | Pass (Controls) | Total
Low Attendance (Exposed)          | 253.4 (a)        | 63.35 (b)       | 316.75
Adequate Attendance (Non-exposed) | 1013.6 (c)       | 1203.65 (d)     | 2217.25
Total                             | 1267             | 1267            | 2534

OR = a * d / (b * c) = 253.4 * 1203.65 / (63.35 * 1013.6) = 4.75

ln(OR) = 1.56

SE of ln(OR) = SQRT(1/a + 1/b + 1/c + 1/d) = 0.15

95% CI of ln(OR) = 1.56 ± 1.96 * 0.15 = 1.27 to 1.85

95% CI of OR = 3.5625 to 6.3339
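
The calculation above can be checked with a short Python sketch. It rebuilds the expected 2x2 table from the group size and exposure rates, then computes the OR and its confidence interval on the log scale. The function name is illustrative, and Z = 1.96 is used as in the text. Because the code works with unrounded intermediate values, its last digits may differ slightly from the figures quoted above.

```python
import math


def expected_or_ci(n_cases, n_controls, p1, p2, z=1.96):
    """OR and its CI for the expected (unrounded) 2x2 table."""
    a = n_cases * p1            # exposed cases
    c = n_cases * (1 - p1)      # non-exposed cases
    b = n_controls * p2         # exposed controls
    d = n_controls * (1 - p2)   # non-exposed controls

    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    # The interval is symmetric around ln(OR), then transformed back.
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, se_log_or, lower, upper


odds_ratio, se, lower, upper = expected_or_ci(1267, 1267, 0.20, 0.05)
print(f"OR = {odds_ratio:.2f}, SE of ln(OR) = {se:.4f}")
print(f"95% CI of OR = {lower:.4f} to {upper:.4f}")
```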

We wanted the odds ratio to be within 25% of its true value at the 95% confidence level. That is, for OR = 4.75, the interval should extend no more than 1.1875 (25% of 4.75) on either side, i.e. from 3.5625 to 5.9375.

Please note that this sample size achieves only one bound of the intended interval: the lower limit (3.5625) matches, but the upper limit (6.3339) lies beyond the intended 5.9375. To bring the upper bound within range, the following formula can be used. However, it gives a larger sample size, which satisfies the upper bound and makes the estimate more precise than required at the lower bound. This asymmetry is expected, because ln(OR), not the OR itself, is approximately normally distributed, so the confidence interval is symmetric only on the log scale.
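
The formula for the upper bound is not shown in the text. One plausible reading, offered here only as an assumption, is that it replaces ln(1 - ε) with ln(1 + ε) in the sample size formula, so that the upper confidence limit equals OR * (1 + ε). The sketch below compares the two sample sizes under that assumption. The function name is hypothetical, and the base formula is the assumed form n = Z² * [1/(P1 * (1 - P1)) + 1/(r * P2 * (1 - P2))] / [ln(bound ratio)]².

```python
import math


def n_for_bound(p1, p2, bound_ratio, z=1.96, r=1.0):
    """Cases needed (unrounded) so that the CI limit sits at OR * bound_ratio.
    ASSUMED form: Z^2 * [1/(P1(1-P1)) + 1/(r*P2(1-P2))] / ln(bound_ratio)^2.
    """
    variance_term = 1 / (p1 * (1 - p1)) + 1 / (r * p2 * (1 - p2))
    return z ** 2 * variance_term / math.log(bound_ratio) ** 2


eps = 0.25
n_lower = n_for_bound(0.20, 0.05, 1 - eps)   # lower limit at 0.75 * OR
n_upper = n_for_bound(0.20, 0.05, 1 + eps)   # upper limit at 1.25 * OR (assumption)

print(f"n for lower bound: {n_lower:.1f}")
print(f"n for upper bound: {n_upper:.1f}")
```

Under this assumption the upper-bound requirement needs roughly 2106 subjects per group instead of 1267. The interval is symmetric on the log scale, so the limits become OR / 1.25 and OR * 1.25, that is 3.8 to 5.9375 for OR = 4.75, which is tighter than required at the lower bound.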


@ Sachin Mumbare