Example:
A researcher conducted a study to predict the marks obtained by students in the final examination. The predictor (independent) variables were physical attendance (in percentage), score in the pre-final examination, gender, and score in a separate IQ test. The study revealed R² = 0.24. The researcher then wanted to add two more predictors: mother's education and father's education. A pilot study revealed that, after adding these two new predictors, R² was 0.40 (an increase of 0.16). What sample size is required to achieve 80% power to detect a significant increase in R² after adding the 2 independent variables, at a confidence level of 95%?
Solution:
Here
Confidence level = 95%, power = 80%, change in R² = 0.16, K1 = 4, K2 = 2
Entering these values gives a required sample size of 54.
How is the sample size calculated? (Exclusively for advanced users)
1. Calculation of sample size for Multiple Linear Regression, when additional predictors are added, requires a complex iterative approach.
Let K1 be the number of previous predictors and K2 the number of additional predictors. The new number of predictors is KN = K1 + K2.
2. The iteration starts at a provisional, predetermined lowest sample size of N = KN + 2.
3. Starting with the lowest sample size (N) of KN + 2, the degrees of freedom for the numerator (K2) and the denominator (N - KN - 1) are calculated. Then, using the given confidence level and these degrees of freedom, the critical F value is calculated.
For example, if the number of previous predictors (K1) is 4 and the number of additional predictors (K2) is 2, we start with a sample size of 4 + 2 + 2 = 8. In this case, the degrees of freedom for the numerator (d1) will be K2 = 2, and the degrees of freedom for the denominator (d2) will be N - KN - 1 = 1. The critical F value (x) for a 95% confidence level (alpha = 0.05) with 2 and 1 degrees of freedom is then calculated using the inverse F distribution function (it is the F table value for the given alpha and degrees of freedom). In this case, the critical F value is 199.5.
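The critical value in this example can be checked with any inverse F distribution function. The following is a minimal sketch, assuming SciPy is available (`scipy.stats.f.ppf` is its inverse cumulative distribution function); the variable names are illustrative and not taken from the calculator itself.

```python
from scipy import stats

k1, k2 = 4, 2          # previous and additional predictors
kn = k1 + k2           # new number of predictors
n = kn + 2             # lowest starting sample size
alpha = 0.05           # 95% confidence level

d1 = k2                # numerator degrees of freedom
d2 = n - kn - 1        # denominator degrees of freedom

# Inverse F distribution: the value x with P(F <= x) = 1 - alpha
f_crit = stats.f.ppf(1 - alpha, d1, d2)
print(f"N = {n}, d1 = {d1}, d2 = {d2}, critical F = {f_crit:.1f}")
# critical F = 199.5
```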
4. The non-centrality parameter λ (lambda) is calculated using the following equation:
λ = f² * N
If the effect size input is R², η² or f, then f² is calculated as follows:
f² = R² / (1 - R²)
f² = η² / (1 - η²)
f² = f * f
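These conversions are simple enough to express directly. The sketch below uses hypothetical helper names, and it assumes that the change in R² from the worked example (0.16) is the value entered into the R² formula; the page does not state this explicitly.

```python
def f2_from_r2(r2):
    """Convert R-squared to f-squared: f2 = R2 / (1 - R2)."""
    if not 0 <= r2 < 1:
        raise ValueError("R2 must be in [0, 1)")
    return r2 / (1 - r2)

def f2_from_eta2(eta2):
    """Convert eta-squared to f-squared: f2 = eta2 / (1 - eta2)."""
    if not 0 <= eta2 < 1:
        raise ValueError("eta2 must be in [0, 1)")
    return eta2 / (1 - eta2)

def f2_from_f(f):
    """Convert f to f-squared."""
    return f * f

def noncentrality(f2, n):
    """Non-centrality parameter: lambda = f2 * N."""
    return f2 * n

# Assumption: the change in R2 (0.16) is used as the R2 input.
f2 = f2_from_r2(0.16)
lam = noncentrality(f2, 8)   # N = 8 is the starting sample size from step 3
print(f"f2 = {f2:.4f}, lambda = {lam:.4f}")
# f2 = 0.1905, lambda = 1.5238
```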
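5. Using the values of the non-centrality parameter (λ), d1, d2 and x, the power of the test is calculated from the non-central F cumulative distribution function:
Power = 1 - F(x | d1, d2, λ)
F(x | d1, d2, λ) = Σ (j = 0 to ∞) [ (λ/2)^j / j! ] * e^(-λ/2) * I( d1*x / (d2 + d1*x) | d1/2 + j, d2/2 )
where I(q | a, c) is the regularized incomplete beta function.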
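The non-central F cumulative distribution function is commonly written as a Poisson-weighted sum of regularized incomplete beta functions. The sketch below implements that series, assuming SciPy's `scipy.special.betainc(a, b, x)` for the regularized incomplete beta function; the function names and the fixed number of terms are illustrative choices, not the calculator's own code.

```python
import math
from scipy.special import betainc

def noncentral_f_cdf(x, d1, d2, lam, terms=500):
    """Non-central F CDF as a Poisson-weighted sum of incomplete beta terms."""
    q = d1 * x / (d2 + d1 * x)
    if lam == 0:
        # With no non-centrality this reduces to the ordinary F CDF
        return float(betainc(d1 / 2, d2 / 2, q))
    half = lam / 2
    total = 0.0
    for j in range(terms):
        # Poisson weight (lam/2)^j * exp(-lam/2) / j!, computed in log space
        log_w = -half + j * math.log(half) - math.lgamma(j + 1)
        total += math.exp(log_w) * float(betainc(d1 / 2 + j, d2 / 2, q))
    return total

def power_from_series(x, d1, d2, lam):
    """Power = probability that the non-central F statistic exceeds x."""
    return 1.0 - noncentral_f_cdf(x, d1, d2, lam)

print(power_from_series(3.2, 2, 47, 10.0))
```

The fixed number of terms is adequate for the moderate λ values that occur in this kind of calculation; a production implementation would stop once the terms become negligible.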
6. If the calculated power is less than desired, the process is repeated, increasing the sample size by 1 each time, until the desired power is achieved.
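Steps 2 to 6 can be combined into a short loop. This is a minimal sketch, not the calculator's actual implementation: it assumes SciPy (`scipy.stats.f.ppf` for the critical value, `scipy.stats.ncf.sf` for the upper tail of the non-central F distribution), the function name and the `max_n` safety limit are hypothetical, and it assumes the change in R² from the worked example is converted with f² = R² / (1 - R²).

```python
from scipy import stats

def required_sample_size(f2, k1, k2, alpha=0.05, target_power=0.80, max_n=100_000):
    """Increase N from KN + 2 until the target power is reached."""
    kn = k1 + k2
    n = kn + 2                                # step 2: lowest sample size
    while n <= max_n:
        d1 = k2                               # numerator degrees of freedom
        d2 = n - kn - 1                       # denominator degrees of freedom
        x = stats.f.ppf(1 - alpha, d1, d2)    # step 3: critical F value
        lam = f2 * n                          # step 4: non-centrality parameter
        power = stats.ncf.sf(x, d1, d2, lam)  # step 5: P(non-central F > x)
        if power >= target_power:             # step 6: stop when power suffices
            return n, float(power)
        n += 1
    raise ValueError("target power not reached within max_n")

# Worked example inputs (assumption: change in R2 = 0.16 is the R2 input)
f2 = 0.16 / (1 - 0.16)
n, power = required_sample_size(f2, k1=4, k2=2)
print(f"required N = {n}, achieved power = {power:.3f}")
```

The printed N can be compared with the sample size of 54 reported in the worked example. A linear search is used here because it mirrors the steps as described.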
@ Sachin Mumbare