Meta-analysis of prospective studies and randomized trials. Outcome Variable: Continuous (effect size = Cohen's d, Hedges' g*, Glass's Delta)

The meta-analysis can be performed using either of the following models.

1. Fixed effect model
2. Random effects model

Selection of the appropriate model depends on the heterogeneity among the included studies (see below for details).

Fixed effect meta-analysis: Steps

1. Calculating / extracting effect size:

(The formulae used for effect sizes in the available literature are remarkably inconsistent. Users are advised to review the following formulae, which this software uses to calculate effect sizes and their confidence intervals.)

A. Cohen's d (Hedges' g or Standardized Mean Difference : SMD)

Cohen's d = (M₁ - M₂) / SD_Pooled
Where, M₁ and M₂ are the means of the outcome variable in the treated and control groups, respectively.
N₁ and N₂ are the sample sizes in the treated and control groups, respectively.
SD_Pooled = sqrt[((N₁ - 1) * SD₁² + (N₂ - 1) * SD₂²) / (N₁ + N₂ - 2)]
SD₁ and SD₂ are the standard deviations of the outcome variable in the treated and control groups, respectively.
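
The calculation of Cohen's d can be sketched in Python as follows. This is a minimal illustration, not the software's own code: the function names and the sample numbers are hypothetical, and the pooled SD uses the usual pooled-variance form (squared SDs weighted by N - 1 inside the square root).

```python
import math

def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation: variances weighted by (N - 1)."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference: (M1 - M2) / SD_Pooled."""
    return (m1 - m2) / pooled_sd(n1, sd1, n2, sd2)

# Hypothetical study: treated mean 10 (SD 2, N 20), control mean 12 (SD 2, N 20)
d = cohens_d(10.0, 2.0, 20, 12.0, 2.0, 20)
print(round(d, 3))  # -1.0
```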

B. Hedges' g* (Bias corrected)

Hedges' g* = Hedges' g x (Correction factor)
Where,
Correction factor = Exp[lgamma(df/2) - log (sqrt(df/2)) - lgamma((df-1)/2)]
Exp = exponential function
lgamma = log gamma function
df= N₁ + N₂ - 2

This exact factor is cumbersome to calculate by hand. (This software uses the exact method above to calculate Hedges' g*.)
However, the following approximation can also be used to calculate Hedges' g*.

Hedges' g* ≈ g x [1 - 3 / (4 * df - 1)]
Where, g = uncorrected Hedges' g and df = N₁ + N₂ - 2
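
With a log-gamma function available, the exact correction factor is straightforward to compute. The sketch below compares it with the commonly cited approximation 1 - 3 / (4 * df - 1). The function names and the sample value of g are hypothetical.

```python
import math

def correction_exact(df):
    """Exact factor: Gamma(df/2) / (sqrt(df/2) * Gamma((df-1)/2)), via log-gamma."""
    return math.exp(
        math.lgamma(df / 2) - math.log(math.sqrt(df / 2)) - math.lgamma((df - 1) / 2)
    )

def correction_approx(df):
    """Commonly cited approximation to the exact factor."""
    return 1 - 3 / (4 * df - 1)

df = 20 + 20 - 2          # N1 + N2 - 2 for a hypothetical study
g = -1.0                  # uncorrected Hedges' g (hypothetical)
g_star = g * correction_exact(df)
print(correction_exact(df), correction_approx(df))
```

For moderate or large df the two factors agree closely; the difference matters mainly for very small studies.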

C. Glass's Delta (Glass's Δ)

Glass's Delta = (M₁ - M₂) / SD_Control

2. Calculating Variance (V) for selected effect size

Variance (V) for Cohen's d = 1/N₁ + 1/N₂ + d² / (2* N₁ + 2 * N₂)
Variance (V) for Hedges' g* = 1/N₁ + 1/N₂ + (g*)² / (2 * N₁ + 2 * N₂)
Variance (V) for Glass's Delta (Δ) = 1/N₁ + 1/N₂ + Δ² / (2 * N₂)

SE = sqrt(V)
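
As a small illustration, the variance formulae above translate directly into Python. The function names and sample numbers are hypothetical; the formula for Cohen's d is reused for Hedges' g* by passing g* in place of d.

```python
import math

def var_cohens_d(d, n1, n2):
    """Variance of Cohen's d (also used here for Hedges' g*, passing g* as d)."""
    return 1 / n1 + 1 / n2 + d**2 / (2 * n1 + 2 * n2)

def var_glass_delta(delta, n1, n2):
    """Variance of Glass's Delta; n2 is the control group size."""
    return 1 / n1 + 1 / n2 + delta**2 / (2 * n2)

# Hypothetical study with d = -1.0 and 20 participants per group
v = var_cohens_d(-1.0, 20, 20)   # 0.05 + 0.05 + 1/80 = 0.1125
se = math.sqrt(v)
print(round(v, 4), round(se, 4))
```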

3. Weight (W) of each included study under the fixed effect model (inverse variance method)

W_i = 1/V_i

4. Summary effect size (MA_ES) under the fixed effect model is calculated as follows

MA_ES = Σ (W_i * Y_i) / Σ W_i
W_i= Weight for i^th included study
Y_i= Effect Size for i^th included study

5. Standard Error of MA_ES = sqrt(1 / Σ W_i)

(1 / Σ W_i is the variance of MA_ES; its square root is the standard error.)

Once the summary ES and its standard error are calculated, the confidence interval of the summary ES is MA_ES ± Z_{1−α/2} * SE.

6. Z value and p value.

Once the summary ES and its standard error are calculated, the Z value and the p value are obtained from the standard normal distribution.
Z = MA_ES / SE
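
Steps 3 to 6 can be sketched together as follows. This is an illustrative sketch with hypothetical data and a hypothetical function name. The standard error is taken as the square root of 1 / Σ W_i, because 1 / Σ W_i itself is the variance of the summary estimate.

```python
import math
from statistics import NormalDist

def fixed_effect(effects, variances, alpha=0.05):
    """Inverse-variance fixed effect summary with CI, Z and two-tailed p."""
    weights = [1 / v for v in variances]            # W_i = 1 / V_i
    sum_w = sum(weights)
    ma_es = sum(w * y for w, y in zip(weights, effects)) / sum_w
    se = math.sqrt(1 / sum_w)                       # SE = sqrt(variance)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    z = ma_es / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))          # two-tailed p value
    return {"es": ma_es, "se": se, "lb": ma_es - z_crit * se,
            "ub": ma_es + z_crit * se, "z": z, "p": p}

# Hypothetical effect sizes and variances for three studies
result = fixed_effect([-1.0, -0.5, -1.5], [0.1, 0.2, 0.1])
print(result)
```

With these inputs the weights are 10, 5 and 10, so the summary ES is -27.5 / 25 = -1.1 with SE = sqrt(1/25) = 0.2.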

If the available data are an effect size (Cohen's d, Hedges' g, Glass's Delta or Hedges' g*) and its confidence interval, then the SE of the ES is calculated from the upper bound (UB) and lower bound (LB) of the confidence interval, as follows.

SE = (UB - ES) / Z_{1−α/2}
It can also be calculated as follows
SE = (ES - LB) / Z_{1−α/2}
Variance (V) = SE²
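
A minimal sketch of this back-calculation is given below. The function name is hypothetical. As a design choice (an assumption, not necessarily what this software does), it uses the full width of the interval, which equals the average of the two expressions above and is less sensitive to rounding in the reported bounds.

```python
from statistics import NormalDist

def se_from_ci(lb, ub, alpha=0.05):
    """Recover SE and variance from a symmetric confidence interval."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = (ub - lb) / (2 * z_crit)   # average of (UB - ES)/z and (ES - LB)/z
    return se, se**2

# Interval taken from the worked example: -1.235 (-1.385 to -1.085)
se, v = se_from_ci(-1.385, -1.085)
print(round(se, 4))  # 0.0765
```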

Measures of heterogeneity

The included studies are not always homogeneous with respect to characteristics of participants such as age, gender, geo-environmental factors, other socio-demographic factors, selection criteria for cases etc. When these differences are present, the variability in the study results is not just random, and the studies are considered heterogeneous. This heterogeneity is quantified by the following measures.

1. Cochran's Q statistic

Q = Σ W_i * (Y_i - M)²
Y_i = effect size for i^th study
M = Summary ES (MA_ES)

Under the hypothesis of homogeneity, the Q statistic follows a chi-square distribution with k - 1 degrees of freedom (k = number of included studies).
So, the p value for Cochran's Q statistic can be calculated from the chi-square distribution.
A significant p value indicates that significant heterogeneity is present, and we should consider other methods of meta-analysis such as the random effects model, sub-group analysis or meta-regression.

2. Tau squared (τ²) statistic

τ² = (Q - (k - 1)) / C (taken as 0 when Q is less than k - 1)
where,
C = Σ W_i - (Σ W_i²) / (Σ W_i)
Σ W_i = Sum total of weights of all included studies
Σ W_i² = Sum total of squared weights of all included studies

3. I² statistic

I² = 100 * (Q - (k - 1)) / Q (taken as 0 when Q is less than k - 1)
The I² statistic is expressed as a percentage. An I² value of < 25% is considered low heterogeneity, between 25% and 50% moderate, and above 50% significant heterogeneity.

All of the above measures quantify heterogeneity, but they cannot identify the factors causing it.
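
The three measures can be computed together, as in the following sketch. The function name and data are hypothetical. Truncating negative values of τ² and I² at zero is the usual convention and is assumed here.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, DerSimonian-Laird tau squared and I squared (%)."""
    k = len(effects)
    w = [1 / v for v in variances]                       # fixed effect weights
    sum_w = sum(w)
    m = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w   # fixed effect MA_ES
    q = sum(wi * (yi - m) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)                   # assumed: truncate at 0
    i2 = max(0.0, 100 * (q - (k - 1)) / q) if q > 0 else 0.0
    return q, tau2, i2

# Hypothetical effect sizes and variances for three studies
q, tau2, i2 = heterogeneity([-1.0, -0.5, -1.5], [0.1, 0.2, 0.1])
print(round(q, 3), round(tau2, 5), round(i2, 2))
```

For these inputs Q = 3.5 with 2 degrees of freedom, C = 16, τ² = 1.5 / 16 and I² = 100 * 1.5 / 3.5. The p value for Q needs a chi-square survival function, which the Python standard library does not provide; if SciPy is installed, `scipy.stats.chi2.sf(q, k - 1)` is one option.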

Random effects model

If the fixed effect meta-analysis shows significant heterogeneity (by Cochran's Q or I², as explained above), then the random effects model is one way to deal with it.
In the random effects model (DerSimonian‐Laird), the weight of each study is revised as follows.
W_RE = 1 / (V + τ²)

Other methods are also used for the random effects model, such as maximum likelihood, restricted maximum likelihood (REML), Paule‐Mandel, Knapp‐Hartung etc. However, the DerSimonian‐Laird method is the most widely used and robust.

Then, the summary ES under the random effects model is calculated as follows

MA_ES = Σ (W_RE.i * Y_i) / Σ W_RE.i

MA_ES = Summary Effect Size
W_RE.i= Revised weight for i^th included study
Y_i= Effect Size for i^th included study

Standard Error of MA_ES = sqrt(1 / Σ W_RE.i)

Once the summary ES and its standard error are calculated, the confidence interval of the summary ES is MA_ES ± Z_{1−α/2} * SE.

Z value and p value.

Once the summary ES and its standard error are calculated, the Z value and the p value are obtained from the standard normal distribution.
Z = MA_ES / SE
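
The random effects calculation can be sketched as follows, assuming τ² has already been estimated (for example by the DerSimonian‐Laird formula above). The function name and data are hypothetical. The standard error is the square root of 1 / Σ W_RE.i, since 1 / Σ W_RE.i is the variance of the summary estimate.

```python
import math
from statistics import NormalDist

def random_effects(effects, variances, tau2, alpha=0.05):
    """Random effects summary using revised weights 1 / (V + tau squared)."""
    w_re = [1 / (v + tau2) for v in variances]
    sum_w = sum(w_re)
    ma_es = sum(w * y for w, y in zip(w_re, effects)) / sum_w
    se = math.sqrt(1 / sum_w)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z = ma_es / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))           # two-tailed p value
    return {"es": ma_es, "se": se, "lb": ma_es - z_crit * se,
            "ub": ma_es + z_crit * se, "z": z, "p": p}

effects, variances = [-1.0, -0.5, -1.5], [0.1, 0.2, 0.1]   # hypothetical data
fixed = random_effects(effects, variances, tau2=0.0)       # tau2 = 0 gives fixed effect
rand = random_effects(effects, variances, tau2=0.09375)
print(fixed["es"], fixed["se"])
print(rand["es"], rand["se"])
```

Adding τ² to every study variance flattens the weights and widens the confidence interval, which is why the random effects SE is never smaller than the fixed effect SE for the same data.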

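The prediction interval is calculated using the following formula.

Prediction interval = MA_ES ± t_{1-α/2, k-2} * sqrt(τ² + SE²)

Where, t_{1-α/2, k-2} is the 100(1−α/2)th percentile of the t distribution with k−2 degrees of freedom, α is the significance level, SE is the standard error of MA_ES under the random effects model, and k is the number of studies included in the meta-analysis.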

Interpretation: There is a 95% (or any other specified) probability that a newly conducted study will have an effect size within this interval.
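
A minimal sketch of the prediction interval is given below, assuming the commonly used form MA_ES ± t * sqrt(τ² + SE²). The function name and inputs are hypothetical. The Python standard library has no t distribution, so the critical value is passed in from a t table.

```python
import math

def prediction_interval(ma_es, se, tau2, t_crit):
    """Prediction interval: MA_ES +/- t_crit * sqrt(tau squared + SE squared)."""
    half_width = t_crit * math.sqrt(tau2 + se**2)
    return ma_es - half_width, ma_es + half_width

# Hypothetical inputs; t_crit for 95% and k - 2 = 13 df is about 2.160
lb, ub = prediction_interval(ma_es=-1.0, se=0.1, tau2=0.09, t_crit=2.160)
print(round(lb, 3), round(ub, 3))
```

Because τ² enters the half-width directly, the prediction interval is wider than the confidence interval whenever between-study variance is present.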

Galbraith Plot

The Galbraith plot provides a graphical way to assess heterogeneity among the included studies.
It is a scatter plot of the Z value against the precision (1/SE) of each included study (Z value of each study = ES / SE).
The central horizontal blue line represents the line of null effect. Studies above this line have a positive effect size, indicating that the Group 1 mean is greater than the Group 2 mean. Studies below this line have a negative effect size, indicating that the Group 1 mean is less than the Group 2 mean.
The middle red line represents the MA_ES (its slope is equal to the MA_ES). Studies above this line have a larger effect size than the summary ES. Studies below this line have a smaller effect size than the summary ES.
The two green lines (above and below the middle red line) represent the confidence interval of the MA_ES.
In the absence of significant heterogeneity, we expect about 95% (or other specified level) of the studies to lie between the two green lines, and about 5% to lie outside them. If a larger proportion of studies lie outside these lines, it indicates significant heterogeneity.
In this Galbraith plot, three studies are above the top green line and one study is below the bottom green line. So a total of 4 studies out of the 15 included studies are not within the zone bounded by the two green lines, suggesting heterogeneity.
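
The coordinates of a Galbraith plot, and a count of studies outside the band, can be sketched as follows. This assumes the conventional construction in which the two outer lines run parallel to the MA_ES line at a vertical distance of ± Z_{1−α/2}; the software's exact band may differ. The function name and data are hypothetical.

```python
from statistics import NormalDist

def galbraith_points(effects, ses, ma_es, alpha=0.05):
    """Return (precision, z, outside_band) for each study."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    points = []
    for y, se in zip(effects, ses):
        precision = 1 / se                 # x axis
        z = y / se                         # y axis
        # Assumed band: within z_crit of the line z = MA_ES * precision
        outside = abs(z - ma_es * precision) > z_crit
        points.append((precision, z, outside))
    return points

# Hypothetical effect sizes and standard errors for three studies
points = galbraith_points([-1.0, -0.5, -1.5], [0.2, 0.25, 0.1], ma_es=-1.1)
n_outside = sum(1 for _, _, outside in points if outside)
print(n_outside, "of", len(points), "studies fall outside the band")
```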

Forest Plot

The forest plot represents the effect size of each included study with its confidence interval. It provides a collective and comprehensive graphical view of all included studies as well as the meta-analysis summary ES.
Each horizontal line represents the confidence interval of the corresponding study. The square at the centre of each line represents the ES reported by the study. The size of the square is proportional to the weight of the study.
The vertical line at ES = 0 represents the line of null effect. If the horizontal line for an included study does not cross this vertical line, the corresponding study has reported a significant ES.

The forest plot also shows the degree of overlap between included studies (to judge heterogeneity) and the precision of each included study (as judged by the length of its confidence interval).
The bottom horizontal line with a central diamond represents the MA_ES and its confidence interval.
For the random effects model, an additional horizontal line is shown, which represents the prediction interval. It can be used to predict the result of a newly conducted study: about 95% (or other specified level) of newly conducted studies are expected to show an effect size within this interval.

Publishers tend to publish "significant" studies. In other words, if the result of a study is statistically non-significant, the study may never be published (the file drawer problem). A meta-analysis includes mainly published studies, so it is often biased towards positive results. This bias is called publication bias. It can be assessed by the asymmetry of the funnel plot, the Begg-Mazumdar rank correlation test, Egger's regression test, Fail-Safe N, and the Duval and Tweedie trim and fill method.

Another method to assess publication bias is to calculate the Fail-Safe N. Publication bias results from the non-publication of certain studies, mainly those with non-significant results. It tends to bias the results positively, or away from the null effect. If we could find all such "missing" studies and add them to the meta-analysis, the summary effect size would be reduced, perhaps to the extent that it is no longer significant. But finding such "missing" or "unpublished" studies is practically impossible. Alternatively, we can calculate the number of additional studies with an average null effect that would have to be added to the meta-analysis to make the summary effect size non-significant. If this number is small, publication bias is a concern: adding only a few missing or unpublished studies would make the summary effect size non-significant. On the other hand, if this number is large, we can safely conclude that there is no serious problem of publication bias. (To decide whether the Fail-Safe N is small or large, it should be judiciously compared with the number of studies available in the literature or included in the meta-analysis.)
Classical Fail-Safe N (described by Rosenthal)
It is the number of such "missing" or "unpublished" studies, with an average null effect size, that would have to be added to the meta-analysis to make the summary effect size non-significant.
In this example, the p value provided by the meta-analysis is < 0.001. We would need an additional 1042 studies, with an average null effect size, to bring the p value above 0.05 (two tailed).
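
Rosenthal's calculation can be sketched as follows, assuming the usual form N = (Σ Z_i)² / Z_α² - k, where Z_i are the Z values of the individual studies. The function name and Z values are hypothetical, and rounding the result up to the next whole study is an assumption.

```python
import math
from statistics import NormalDist

def rosenthal_fail_safe_n(z_values, alpha=0.05, two_tailed=True):
    """Number of null-effect studies needed to make the combined result non-significant."""
    k = len(z_values)
    tail = alpha / 2 if two_tailed else alpha
    z_crit = NormalDist().inv_cdf(1 - tail)      # 1.96 two tailed, 1.6449 one tailed
    n = sum(z_values) ** 2 / z_crit**2 - k
    return max(0, math.ceil(n))                  # assumed: round up, never negative

# Hypothetical Z values from three studies
z_values = [-3.0, -2.5, -4.0]
print(rosenthal_fail_safe_n(z_values))                    # 21
print(rosenthal_fail_safe_n(z_values, two_tailed=False))  # 31
```

The one tailed figure is always at least as large as the two tailed one, because the critical Z value in the denominator is smaller.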
Orwin's Fail-Safe N
It is an alternative to the classical Fail-Safe N. The classical Fail-Safe N gives the number of studies required to bring the results down to a non-significant level. Orwin's Fail-Safe N gives the number of additional studies required to bring the summary effect size down to a specified level (the trivial criterion). Additionally, we can specify the average effect size of the missing studies (M).
In this example, the summary ES was -1.2349. If the trivial ES value is taken as -0.1, and the average ES in "missing" or "unpublished" studies is taken as -0.05, then 341 more such studies would have to be added to the meta-analysis to bring the summary ES down to -0.1. (You can change these criteria and re-calculate Orwin's Fail-Safe N. The (unsigned) trivial effect size criterion must be less than the (unsigned) MA summary effect size, and the (unsigned) average effect size of the missing studies must be less than the trivial criterion.)
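
The figure of 341 can be reproduced with the usual form of Orwin's formula, N = k * (MA_ES - trivial ES) / (trivial ES - average ES of missing studies), with k = 15 included studies. The function name is hypothetical, and rounding up to the next whole study is an assumption.

```python
import math

def orwin_fail_safe_n(k, ma_es, trivial_es, missing_es=0.0):
    """Studies with mean effect missing_es needed to pull ma_es to trivial_es."""
    return math.ceil(k * (ma_es - trivial_es) / (trivial_es - missing_es))

# Values from the worked example: 15 studies, summary ES -1.2349
print(orwin_fail_safe_n(15, -1.2349, -0.1, -0.05))  # 341
```

The unrounded value is 15 * (-1.1349) / (-0.05) = 340.47, which rounds up to 341.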

While all the above methods can suggest or detect publication bias, none of them gives the actual impact of probable publication bias. The Trim and Fill method detects publication bias through asymmetry of the funnel plot, identifies the studies causing the asymmetry, trims (removes) those studies, and finally adds the trimmed studies back along with hypothetical studies that are "mirror" images of the trimmed studies. A mirror image study is a hypothetical study on the opposite side, at an equal distance from the central MA summary measure. This makes the funnel plot symmetrical. This "fill" of mirror images rests on the assumption that these are the probable studies that missed publication. That is why the method is called the "Trim and Fill" method. Finally, after filling in the probable missed studies, the meta-analysis is done again.

This revised estimate, obtained after adding the hypothetical, probably missed studies, gives the summary measure we would have obtained had there been no publication bias. If the difference between the original meta-analysis and the result after "Trim and Fill" is large, we can conclude that the impact of publication bias is significant. If the difference is minimal, we can conclude that the impact is minimal.
We will learn this method using the same funnel plot given above. On careful inspection of the funnel plot, there is one "extra" study on the left side at the bottom (with a large SE). The corresponding study is missing on the right side. (There is another "extra" study on the right side, but it is not at the bottom.) But visual inspection is subjective. So, let's use the Trim and Fill method.

The results of Trim and Fill show 4 "missing" studies on the right side (as against one study identified on inspection), which cause the funnel plot asymmetry. So, the method has added 4 hypothetical studies (red dots) on the right side, which are mirror images of the "extra" studies identified on the left side. After adding these hypothetical studies, the meta-analysis is revised, and the revised estimates of the summary measures are provided in the table. The original estimate of ES was -1.235 (-1.385 to -1.085), and the revised estimate is -1.003 (-1.138 to -0.869). This change in the overall estimate is minimal.

Final remarks on publication bias:

In the above example, the funnel plot suggested the possibility of publication bias. The Begg-Mazumdar test and Egger's regression test showed significant bias. The Fail-Safe N values are 1042 (Rosenthal) and 341 (Orwin). Trim and Fill identified publication bias and estimated its possible impact. Now, the question is how to interpret these findings. Here, we should compare the Fail-Safe N with the number of studies available in the meta-analysis (1042 or 341 versus 15). When a thorough search detected 15 studies, is it possible that there are more than 300 other "missing" studies? The answer is, not likely! The required number (> 300) is huge compared with the 15 available studies. Secondly, "Trim and Fill" identified 4 missing studies, but even after adding hypothetical studies to make the funnel plot symmetrical, there is no substantial change, quantitative or qualitative, in the final results of the meta-analysis. So, we can safely conclude that although there is evidence of publication bias, its impact is too low to cause any qualitative or substantial quantitative change in the results of the meta-analysis.

Begg and Mazumdar rank correlation
Kendall's S Statistics	-37
Kendall's tau without continuity correction (* significant p value indicates significant publication bias)
Kendall's Tau	-0.352
Z	-1.831
p (one tailed)	0.034
Kendall's tau with continuity correction (* significant p value indicates significant publication bias)
Kendall's Tau	-0.343
Z	-1.782
p (one tailed)	0.037

Egger's regression test (* significant p value indicates significant publication bias)
Intercept	-13.403
Standard error of intercept	5.552
t value	-2.414
df	13
p (one tailed)	0.0156
LB of Confidence interval	-25.3968
UB of confidence interval	-1.4091

Fail-Safe N (Rosenthal)
Z value for observed studies	-16.094
P value for observed studies (two tailed)	< 0.001
Alpha	0.05
Z for given alpha (one tailed)	1.6449
Z for given alpha (two tailed)	1.96
Fail-Safe N (One tailed)	1486
Fail-Safe N (Two tailed)	1042

Orwin's Fail-Safe N
Effect Size in meta-analysis	-1.2349
Criteria for trivial effect Size	-0.1
Average effect size for missing studies	-0.05
Orwin's Fail-Safe N	341

Duval and Tweedie's Trim and Fill
	Number of studies trimmed and filled	Effect Size	LB	UB
Meta-analysis Results		-1.235	-1.385	-1.085
Left sided missing studies adjusted	0	-1.2349	-1.3853	-1.0845
Right sided missing studies adjusted	4	-1.003	-1.138	-0.869