Subgroup Analysis SIG


Subgroup analysis is routinely conducted in drug development, in various settings. One key aspect is the regulatory requirement to demonstrate consistency of treatment effect across a pre-defined set of subgroups (e.g., ICH E5, E9, E17). This is performed as a risk-benefit assessment, aiming to identify the right patient population to treat; here, the set of subgroups is agreed with regulators before the trial is conducted. Another key aspect is subgroup selection, where the aim is to estimate the effect in the most promising subpopulation (typically for planning another trial). Subgroup selection can be done either with respect to the same fixed set of pre-specified subgroups or in a data-driven fashion (e.g., biomarker subgroup detection).

There are well-known inherent statistical difficulties with all of the above. In consistency assessment, limited data in the subgroups lead to a high risk of false positives (random highs) as well as low power to detect true differential effects (since trials are seldom sized for this). In the subgroup selection setting, it is of key importance to provide an honest estimate, discounted for the number of subgroups inspected, so as not to overstate the real effect. Even in a consistency assessment setting, there may well be a tendency to focus on the most deviating subgroup results, which can introduce bias even though it is not formally a 'selection' problem.
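The "random high" problem can be made concrete with a small simulation. The sketch below is illustrative only and rests on simplifying assumptions that are not taken from the SIG's work: the subgroups are disjoint and equally sized, the outcome has unit variance, and the true effect is identical in every subgroup. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_max_subgroup_effect(n_subgroups=10, n_per_arm=50, true_effect=0.3,
                                 n_trials=2000, seed=1):
    """Average of the largest observed subgroup effect across simulated trials.

    Assumption: every subgroup has the SAME true effect, so any spread
    between subgroup estimates is pure sampling noise.
    """
    rng = random.Random(seed)
    # Standard error of a difference in means with unit variance per patient
    se = (2.0 / n_per_arm) ** 0.5
    largest = []
    for _ in range(n_trials):
        # Observed effect in each subgroup: truth plus sampling noise
        estimates = [rng.gauss(true_effect, se) for _ in range(n_subgroups)]
        largest.append(max(estimates))
    return statistics.mean(largest)


mean_of_best = simulate_max_subgroup_effect()
print(f"true effect in every subgroup: 0.30")
print(f"average estimate in the best-looking subgroup: {mean_of_best:.2f}")
```

Under these assumptions the best-looking subgroup overstates the true effect on average, even though no subgroup truly differs from the others. This is the bias that an honest, selection-adjusted estimate has to discount.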

The PSI Subgroup SIG is devoted to methodologies and aspects around these questions, with a focus on the assessment of a fixed, pre-specified set of subgroups. As such, the SIG has not yet explored data-driven biomarker subgroup detection (data mining), although some of the approaches investigated could be used for this purpose.

The work aims at providing as much guidance and clarity as possible on the inherent issues and possible approaches to analysis and subsequent detailed investigation of pre-specified subgroups in order to provide context for any subgroup findings.

Numerous methods have been suggested in the literature, ranging from interaction testing, permutation-based ordered statistics, Bayesian shrinkage and bootstrap bias reduction to model averaging and graphical methods. One inherent difficulty, which makes the analysis less straightforward than it might first appear, is that many subgroups overlap. Non-trivial complications also arise when some subgroup factors are prognostic.
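To give a flavour of the shrinkage idea, the following sketch pulls each subgroup estimate towards a common mean using a simple normal-normal model, with the between-subgroup variance estimated by the method of moments. It is a generic empirical-Bayes illustration, not a method evaluated by the SIG. It assumes disjoint subgroups (for example, the levels of a single factor), at least two subgroups, and known standard errors, so it sidesteps the overlap issue noted above. The function name and the input numbers are hypothetical.

```python
def shrink_subgroup_estimates(estimates, std_errors):
    """Shrink subgroup effect estimates towards a common mean.

    Returns (overall_mean, between_subgroup_variance, shrunken_estimates).
    Assumes disjoint subgroups and at least two of them.
    """
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights and pooled mean
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Heterogeneity statistic and method-of-moments between-subgroup variance
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Overall mean re-estimated with weights that include tau2
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    overall = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    shrunk = []
    for y, se in zip(estimates, std_errors):
        # Fraction of the way each estimate moves towards the overall mean
        b = se ** 2 / (se ** 2 + tau2)
        shrunk.append(b * overall + (1.0 - b) * y)
    return overall, tau2, shrunk


# Hypothetical subgroup estimates and standard errors, for illustration only
raw = [0.1, 0.3, 0.9]
ses = [0.2, 0.2, 0.2]
overall, tau2, shrunk = shrink_subgroup_estimates(raw, ses)
for y, s in zip(raw, shrunk):
    print(f"raw {y:.2f} -> shrunken {s:.2f}")
```

The noisier a subgroup estimate is relative to the estimated between-subgroup variance, the more it is pulled towards the overall mean. When the observed spread is no larger than sampling noise would explain, the variance estimate is zero and all subgroups collapse to the overall estimate.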

The PSI Subgroup SIG submitted a White Paper in May 2018 on some of these aspects, containing an overview of the inherent problems, recommendations for the planning stage, a novel permutation-based approach for assessing expected deviations under a null assumption, and some simulation-based conclusions from a comparison of various methods.
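The general idea of using permutations to gauge how much subgroup results are expected to deviate under a null assumption can be sketched as follows. This is a generic illustration and not the approach described in the White Paper: it permutes subgroup labels within each treatment arm and records the largest absolute deviation of a subgroup effect from the overall effect. It assumes disjoint subgroups and no prognostic effect of the subgroup factor, since permuting labels would also erase such an effect. All names and the simulated data are hypothetical.

```python
import random
import statistics


def subgroup_effects(outcomes, treated, labels):
    """Treated-minus-control difference in mean outcome for each subgroup."""
    effects = {}
    for g in set(labels):
        trt = [y for y, t, s in zip(outcomes, treated, labels) if s == g and t]
        ctl = [y for y, t, s in zip(outcomes, treated, labels) if s == g and not t]
        effects[g] = statistics.mean(trt) - statistics.mean(ctl)
    return effects


def max_deviation(outcomes, treated, labels):
    """Largest absolute gap between a subgroup effect and the overall effect."""
    trt = [y for y, t in zip(outcomes, treated) if t]
    ctl = [y for y, t in zip(outcomes, treated) if not t]
    overall = statistics.mean(trt) - statistics.mean(ctl)
    effects = subgroup_effects(outcomes, treated, labels)
    return max(abs(e - overall) for e in effects.values())


def permutation_null(outcomes, treated, labels, n_perm=200, seed=0):
    """Reference distribution of max_deviation with labels shuffled within arm."""
    rng = random.Random(seed)
    arms = [[i for i, t in enumerate(treated) if t],
            [i for i, t in enumerate(treated) if not t]]
    null = []
    for _ in range(n_perm):
        permuted = list(labels)
        for idx in arms:
            # Shuffling within arm keeps arm-by-subgroup counts unchanged
            shuffled = [labels[i] for i in idx]
            rng.shuffle(shuffled)
            for i, s in zip(idx, shuffled):
                permuted[i] = s
        null.append(max_deviation(outcomes, treated, permuted))
    return null


# Simulated trial: four subgroups, homogeneous true effect of 0.5
rng = random.Random(42)
labels = [g for g in "ABCD" for _ in range(40)]
treated = [i % 2 == 0 for i in range(len(labels))]
outcomes = [0.5 * t + rng.gauss(0.0, 1.0) for t in treated]

observed = max_deviation(outcomes, treated, labels)
null_devs = permutation_null(outcomes, treated, labels)
# Share of permutations at least as extreme as the observed deviation
p_value = (1 + sum(d >= observed for d in null_devs)) / (1 + len(null_devs))
print(f"observed max deviation: {observed:.2f}, permutation p-value: {p_value:.2f}")
```

Comparing the observed maximum deviation with this reference distribution gives context for whether the most deviating subgroup is surprising or simply what chance alone would produce among the subgroups inspected.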

Due to the complexity, not all available methods were initially studied (e.g., the Bayesian ones), and further work is being conducted. The aim is to provide an updated document once these methods have been developed and evaluated.

Who we are:

Björn Bornkamp, Aaron Dane, Christine Fletcher, Ilia Lipkovich, Henrik Loft, Brian Millen, Heiko Goette, Necdet Gunsoy, Tom Parke, Arne Ring, Gerd Rosenkranz, Amy Spencer and David Svensson.

The SIG is currently led by David Svensson.
(Up to May 2018, the lead was Aaron Dane.)

How to get in touch


Experimental Nested Shrinkage Approach to Multi-regional data


  • 2018 APRIL: Update on progress of the White Paper and remaining work for 2018. Ideas include further work on Bayesian shrinkage, model averaging, simulations under the null when prognostic factors are present, SEAMOS development for non-linear models, and bootstrap bias reduction.
  • 2018 JUNE: PSI presentation by David on some aspects of shrinkage, multi-level hierarchical models, and model averaging. Key aspect: many variations exist, and some unknowns remain regarding performance.
  • 2018 JULY: Discussed simulation of an RCT with underlying prognostic and predictive continuous variables, dichotomized into subgroup factors, with some preliminary illustrations of the methods listed under APRIL.
  • 2018 SEPT: Further simulations on the performance of BIC model averaging.
  • 2018 NOV: Visualisation using the novel R package SubgrPlots (e.g., UpSet graph). Amy Spencer presented further work on SEAMOS (a modification to increase power). Updates on the PSI 2019 Subgroup Section.
  • 2019 JAN: Discussing content for SIG subgroup session at PSI, presentation by David Svensson on SEAMOS (some simulation results in a non-linear case with prognostic factors). 

