Prognostic factors (PFs) are patient characteristics (e.g. age, biomarker levels) that are associated with future clinical outcomes in patients with a disease or health condition. Evidence-based PF results are essential, and individual patient data (IPD) meta-analysis is considered the 'gold-standard' approach for obtaining them, as it synthesises the raw data across related studies (in contrast to an aggregate data meta-analysis, which uses only reported summary data).
In this Ph.D. thesis, I investigate statistical issues and develop methodological recommendations for individual patient data meta-analysis of prognostic factor studies (IMPF) projects. First, I investigate the benefits and limitations of IPD meta-analyses of PF studies through a systematic review of existing IPD meta-analyses of PFs. The review found 48 IMPF articles, and a random sample of 20 of these was evaluated in depth to identify how such projects are initiated, conducted, and reported, and to identify the benefits and challenges of the IPD approach. I found that although IMPF projects have many advantages, they still face a number of challenges and pitfalls: different methods of measurement across studies, ignoring the clustering of patients within studies, missing data, potential publication bias, unmet linearity assumptions for PFs, poor reporting, and potentially not being protocol driven. To improve IMPF articles and projects, guidelines were developed and an array of methodological research questions was identified.
Secondly, I undertook an empirical study comparing the IPD and aggregate data approaches for assessing PFs in breast cancer. I showed that the IPD approach is preferable to the aggregate data approach, as it allows one to adjust the PF effect for other confounding factors, examine PFs in subgroups of patients, and assess the interaction between two PFs as an additional PF. It also allowed more studies and more patients to be included. However, the IPD approach still faced challenges, such as potential publication bias, missing data, and failed model assumptions in some studies.
Thirdly, I developed eleven IPD meta-analysis models to investigate whether the clustering of patients within studies should be accounted for and, if so, which approach is best. The models differed in whether they used a one-step or two-step approach, and in whether they accounted for parameter correlation and residual variation. An IPD meta-analysis of 4 studies of age as a PF for 6-month mortality in traumatic brain injury was used as an applied example. Surprisingly, I found no difference between the eleven models, because there was little variation in baseline risk across studies. A simulation study was therefore undertaken to examine whether a one-step or two-step model performs best, and whether accounting for the clustering of patients within studies is important. I found that clustering should be accounted for, and that a one-step model accounting for the clustering of patients within studies performed best, as it yielded the lowest bias and coverage of around 95%. Ignoring clustering can produce downward bias and coverage that is too low; occasionally the two-step approach produces coverage that is too high.
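As a minimal sketch of the second stage of a two-step approach (not one of the eleven models fitted in this thesis), the code below pools study-specific PF effects with a DerSimonian-Laird random-effects estimator. It assumes the first stage has already produced one log odds ratio and standard error per study; the function name `pool_random_effects` and the four input values are hypothetical and chosen only for illustration.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool per-study estimates with DerSimonian-Laird random effects.

    Returns (pooled estimate, its standard error, between-study variance).
    """
    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1

    # Method-of-moments estimate of the between-study variance, floored at 0.
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))
    return pooled, pooled_se, tau2


# Hypothetical first-stage results: log odds ratio per year of age, 4 studies.
log_odds_ratios = [0.030, 0.035, 0.028, 0.040]
standard_errors = [0.005, 0.008, 0.006, 0.010]
pooled, pooled_se, tau2 = pool_random_effects(log_odds_ratios, standard_errors)
print(f"pooled log OR = {pooled:.4f} (SE {pooled_se:.4f}), tau^2 = {tau2:.6f}")
```

A one-step approach would instead fit a single model to all patients at once, with study-level terms (for example a separate intercept per study) to account for clustering; that requires a regression library and is not sketched here.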
Fourthly, I examined the linearity assumption for the relationship between age and the risk of 6-month mortality in the traumatic brain injury dataset. I found that a linear trend was not the best fit in all studies. I therefore developed three non-linear fractional polynomial IPD meta-analysis models, which differ in whether a one-step or two-step approach is used and whether first-order or second-order fractional polynomial functions are fitted. I found that the one-step fractional polynomial meta-analysis model that accounts for the clustering of patients within studies is again the best model, as it is easier to fit and forces the IPD studies to have the same polynomial powers. This model revealed that age has a quadratic relationship with mortality risk.
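To illustrate what first-order and second-order fractional polynomial functions look like, the sketch below builds the transformed covariate terms using the conventional power set {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, where power 0 denotes the natural logarithm. The helper names `fp_term` and `fp2_terms` are hypothetical, the covariate is assumed to be strictly positive, and any rescaling or centring of age before transformation is omitted.

```python
import math

# Conventional candidate powers; power 0 is read as log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, power):
    """Single fractional polynomial term for a strictly positive x."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if power == 0 else x ** power


def fp2_terms(x, p1, p2):
    """Pair of terms for a second-order fractional polynomial.

    When the two powers are equal, the second term is the first
    multiplied by log(x), following the usual repeated-power rule.
    """
    first = fp_term(x, p1)
    if p1 == p2:
        return first, first * math.log(x)
    return first, fp_term(x, p2)


# Powers (1, 2) give the ordinary quadratic terms x and x squared.
age = 40.0
print(fp2_terms(age, 1, 2))
```

Model selection would then compare the fit obtained with each candidate power (or pair of powers) from `FP_POWERS`; in a one-step model the same powers apply to every study by construction.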
Fifthly, I assessed whether small-study effects (i.e. potential publication bias) exist in 6 IPD prognostic factor articles, using tools such as contour-enhanced funnel plots, cumulative meta-analysis, the trim and fill method, and regression tests. I found that small-study effects are not a major concern, in contrast to aggregate data meta-analysis of PFs. Only in the breast cancer data of Look et al. was there substantial evidence of small-study effects. However, results adjusted to account for this gave a smaller PF effect but suggested that the original conclusions are unlikely to change.
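One widely used regression test for small-study effects is Egger's test, which regresses the standardised effect (estimate divided by its standard error) on precision (one over the standard error) and examines whether the intercept differs from zero. The sketch below is an assumption about which regression test might be meant, not a record of the analysis in this thesis; the function name `egger_intercept` and the input values are hypothetical.

```python
import math


def egger_intercept(estimates, std_errors):
    """Egger-style regression of standardised effect on precision.

    Returns (intercept, standard error of intercept, residual df).
    An intercept far from zero suggests funnel plot asymmetry.
    """
    n = len(estimates)
    if n < 3:
        raise ValueError("need at least 3 studies")
    z = [y / se for y, se in zip(estimates, std_errors)]  # standardised effects
    precision = [1.0 / se for se in std_errors]

    mean_x = sum(precision) / n
    mean_y = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, z))

    # Ordinary least squares fit: z = intercept + slope * precision.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [y - intercept - slope * x for x, y in zip(precision, z)]
    residual_var = sum(r ** 2 for r in residuals) / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept, n - 2


# Hypothetical data in which smaller studies report larger effects.
effects = [0.2, 0.4, 0.6, 0.9]
ses = [0.1, 0.2, 0.3, 0.5]
intercept, se_intercept, df = egger_intercept(effects, ses)
print(f"Egger intercept = {intercept:.3f} (SE {se_intercept:.3f}), df = {df}")
```

The ratio of the intercept to its standard error can be compared against a t distribution with the returned degrees of freedom; with only a handful of studies the test has low power, so it is best read alongside the funnel plot.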
To sum up, this thesis highlights a number of challenges of IMPF projects and discusses possible approaches to dealing with some of them. However, numerous challenges remain for