Professor Paul Burton

Paul is Professor of Data Science for Health in the Institute of Health and Society, Newcastle University. Over his career his research has encompassed three broad themes: 1) methods research in biostatistics and genetic epidemiology, focusing particularly on generalized linear models (GLMs), generalized linear mixed models (GLMMs) and Bayesian approaches to modelling and statistical inference, particularly when data exhibit a strong correlation structure; 2) applied research in genetic epidemiology and complex disease epidemiology; and 3) Health Data Science, with a focus on the design, set-up and harmonization of biobanks/cohort studies, and on facilitating research access to national and international repositories of data and biosamples in ways that fully respect governance constraints and yet make the necessary procedures fast and efficient.

Within the D2K Research Group, which he co-leads with Madeleine Murtagh (PEALS Centre, Newcastle University), Paul now focuses primarily on Health Data Science, with a particular interest in the management and exploitation of ‘BIG’ and ‘complex’ data. How we rise to challenges in these domains will determine whether health science benefits as effectively as it might from the ‘data revolution’. Full advantage must therefore be taken of rapid developments in the methods and technology underpinning the generation, integration and interpretation of data, and the translation of evidence into policy and clinical practice. The D2K view is that most of the key challenges in these domains require solutions that combine technical innovation with comprehensive understanding of, and rigorous accountability to, societal needs and perspectives. A transdisciplinary approach is therefore essential, and D2K comprises a diverse team spanning infrastructural informatics through to social science and ethnography via epidemiology and biostatistics.

Paul’s current research program encompasses biostatistics, epidemiology and informatics, including their associated social and ethico-legal dimensions. It focuses on the development, evaluation and application of systems and tools to facilitate: 1) streamlined, well-governed access to health-related data (whether intended for frontline care, health care planning or research); and 2) privacy-protected co-analysis, including federated co-analysis, of appropriately harmonized data from multiple sources. He views his most innovative work as centring on two major projects:

  • DataSHIELD addresses the commonly encountered need in the biomedical and social sciences to analyse/co-analyse individual patient data (microdata) without physically sharing those data. Paul leads the DataSHIELD project as a whole, with lead collaborators at McGill University, Montreal (Vincent Ferretti and Yannick Marcon), and the MRC Epidemiology Unit, Cambridge (Tom Bishop).

  • Connected Health Cities (CHC). There are four CHC projects across the North of England. Each CHC region aims to unite health and social care services by sharing and linking data in order to improve the health of local people. Paul is Principal Investigator (PI) for CHC (NENC) – North East and North Cumbria – which is based in Newcastle. The project team is led by Joe McDonald, Nick Booth and Mark Walsh. The overall CHC program is led by John Ainsworth, University of Manchester.

Paul is also PI of 58-FORWARDS, a grant jointly awarded by MRC and Wellcome Trust (WT) that funds maintenance and enhancement of the infrastructure underpinning access to data and biosamples from the Biomedical Resource of the 1958 Birth Cohort.
