eTheses Repository

Investigation of over-fitting and optimism in prognostic models

Richardson, Matthew (2010)
Ph.D. thesis, University of Birmingham.



This work seeks to develop a high quality prognostic model for the CARE-HF data (see Richardson et al. 2007). The CARE-HF trial was a major study into the effects of cardiac resynchronization, which has been shown to reduce mortality in patients suffering from heart failure due to electrical problems in the heart. The prognostic model presented in this work was motivated by a question of great importance to clinicians: which patient characteristics may modify the effect of cardiac resynchronization? Efforts are made to produce a high quality prognostic model, in part through the application of methods that reduce the risk of over-fitting. One method discussed in this work is the strategy proposed by Frank Harrell Jr. The various aspects of Harrell’s approach are discussed, and an attempt is made to extend his strategy to frailty models. Key issues such as missing data and imputation, specification of the functional form of the model, and validation are examined in relation to the prognostic model for the CARE-HF data. Material is presented covering survival analysis, maximum likelihood methods, model selection criteria (AIC, BIC), specification of functional form (cubic splines and fractional polynomials) and validation methods (cross-validation, bootstrap methods). The concepts of over-fitting and optimism are examined. The author concludes that, whilst Harrell’s strategy is valuable, it is still quite possible to produce models that are over-fitted. MDL (Minimum Description Length) is suggested as a potentially useful method for obtaining statistical models that have an in-built resistance to over-fitting. The author also recommends that concepts such as over-fitting, optimism and model validation be introduced earlier, in more elementary courses on statistical modelling.

Type of Work: Ph.D. thesis
Supervisor(s): Freemantle, Nick
School/Faculty: Colleges (2008 onwards) > College of Medical & Dental Sciences
Department: School of Health and Population Sciences
Subjects: R Medicine (General); RC Internal medicine
Institution: University of Birmingham
ID Code: 754
This unpublished thesis/dissertation is copyright of the author and/or third parties. The intellectual property rights of the author or third parties in respect of this work are as defined by The Copyright Designs and Patents Act 1988 or as modified by any successor legislation. Any use made of information contained in this thesis/dissertation must be in accordance with that legislation and must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the permission of the copyright holder.