Power and sample size calculations for a cluster randomized trial

CRTpower carries out power and sample size calculations for cluster randomized trials.

Usage

CRTpower(
  trial = NULL,
  locations = NULL,
  alpha = 0.05,
  desiredPower = 0.8,
  effect = NULL,
  yC = NULL,
  outcome_type = "d",
  sigma2 = NULL,
  denominator = 1,
  N = 1,
  ICC = NULL,
  cv_percent = NULL,
  c = NULL,
  sd_h = 0,
  spillover_interval = 0,
  contaminate_pop_pr = 0,
  distance_distribution = "normal"
)

Arguments

trial: dataframe or 'CRTsp' object: optional list of locations
locations: numeric: total number of units available for randomization (required if trial is not specified)
alpha: numeric: significance level (2-sided)
desiredPower: numeric: desired power
effect: numeric: required effect size
yC: numeric: baseline (control) value of outcome
outcome_type: character: with options - 'y': continuous; 'n': count; 'e': event rate; 'p': proportion; 'd': dichotomous.
sigma2: numeric: variance of the outcome (required for outcome_type = 'y')
denominator: numeric: rate multiplier (for outcome_type = 'n' or outcome_type = 'e')
N: numeric: mean of the denominator for proportions (for outcome_type = 'p')
ICC: numeric: Intra-cluster correlation
cv_percent: numeric: Coefficient of variation of the outcome (expressed as a percentage)
c: integer: number of clusters in each arm (required if trial is not specified)
sd_h: numeric: standard deviation of number of units per cluster (required if trial is not specified)
spillover_interval: numeric: 95% spillover interval (km)
contaminate_pop_pr: numeric: Proportion of the locations within the 95% spillover interval.
distance_distribution: character: algorithm for computing distribution of spillover, with options - 'empirical': empirical distribution; 'normal': normal distribution.

Value

A list of class 'CRTsp' comprising the input data, cluster and arm assignments, trial description, and results of power calculations.

Details

Power and sample size calculations are for an unmatched two-arm trial. For count or event rate data the formula of Hayes & Bennett (1999), Int. J. Epi., 28(2), pp. 319–326 is used. This requires the between-cluster coefficient of variation (cv_percent) as an input. For continuous outcomes and proportions the formulae of Hemming et al. (2011) are used. These take the intra-cluster correlation of the outcome (ICC) as an input. If the coefficient of variation is supplied but the ICC is not, the intra-cluster correlation is computed from the coefficient of variation using the formulae of Hayes & Moulton. If incompatible values of ICC and cv_percent are supplied, the value of ICC is used.
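
As a rough illustration of the rate calculation, the Hayes & Bennett (1999) formula for the number of clusters per arm in an unmatched design can be sketched in base R. The helper name hb_clusters_per_arm and its argument names are hypothetical and not part of the package. The sketch assumes that effect is the proportional reduction in the rate and that the person-time per cluster is the mean cluster size multiplied by denominator; CRTpower itself may differ in detail.

```r
# Hypothetical helper (not part of the package): clusters per arm for an
# event rate outcome, following Hayes & Bennett (1999), unmatched design.
hb_clusters_per_arm <- function(rate_control, effect, person_time, cv_percent,
                                alpha = 0.05, power = 0.8) {
  k <- cv_percent / 100                              # between-cluster coefficient of variation
  rate_intervention <- rate_control * (1 - effect)   # assumed: effect is a proportional reduction
  z <- qnorm(1 - alpha / 2) + qnorm(power)           # two-sided test
  within  <- (rate_control + rate_intervention) / person_time
  between <- k^2 * (rate_control^2 + rate_intervention^2)
  1 + z^2 * (within + between) / (rate_control - rate_intervention)^2
}

# Inputs of examplePower2 in the Examples: 2000 locations in 40 clusters
# (50 per cluster) and denominator = 2.5, so person_time is assumed to be 50 * 2.5.
c_per_arm <- hb_clusters_per_arm(rate_control = 0.35, effect = 0.4,
                                 person_time = 50 * 2.5, cv_percent = 40)
total_clusters <- 2 * ceiling(c_per_arm)
print(total_clusters)
```

Under these assumptions the sketch gives 28 clusters in total, the figure reported in the summary of examplePower2.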

The calculations do not consider any loss in power due to loss to follow-up and by default there is no adjustment for effects of spillover.

Spillover bias can be allowed for using a diffusion model of mosquito movement. If no location or arm assignment information is available then contaminate_pop_pr is used to parameterize the model using a normal approximation for the distribution of distance to discordant locations.

If a trial data frame or 'CRTsp' object is input then this is used to determine the number of locations. If this input object contains cluster assignments then the numbers and sizes of clusters in the input data are used to estimate the power. If spillover_interval > 0 and distance_distribution = 'empirical' then effects of spillover are incorporated into the power calculations based on the empirical distribution of distances to the nearest discordant location. (If distance_distribution is not equal to 'empirical' then the distribution of distances is assumed to be normal.)

If geolocations are not input then power and sample size calculations are based on the scalar input parameters.

If buffer zones have been specified in the 'CRTsp' object then separate calculations are made for the core area and for the full site.

The output is an object of class 'CRTsp' containing any input trial data frame and values for:

The required numbers of clusters to achieve the specified power.
The design effect based on the input ICC.
Calculations of the power ignoring any bias caused by loss to follow-up etc.
Calculations of delta, the expected spillover bias.
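
The design effect in the output can be read against the standard formula for clusters of equal size, DE = 1 + (m - 1) * ICC, where m is the mean number of units per cluster. The sketch below is an assumption about how the summary figure arises, not the package's own code (which may also account for variation in cluster size); the variable names are illustrative.

```r
# Standard design effect for clusters of (approximately) equal size.
# Illustrative only; these variable names are not part of the package API.
locations <- 3000       # total units available for randomization
c_per_arm <- 20         # clusters in each arm
ICC <- 0.10             # intra-cluster correlation

m <- locations / (2 * c_per_arm)        # mean units per cluster: 75
design_effect <- 1 + (m - 1) * ICC      # 1 + 74 * 0.1 = 8.4
print(design_effect)
```

With the inputs of examplePower1 this gives 8.4, which agrees with the design effect shown in its summary.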

Examples

{# Power calculations for a binary outcome without input geolocations
examplePower1 <- CRTpower(locations = 3000, ICC = 0.10, effect = 0.4, alpha = 0.05,
    outcome_type = 'd', desiredPower = 0.8, yC=0.35, c = 20, sd_h = 5)
summary(examplePower1)
# Power calculations for a rate outcome without input geolocations
examplePower2 <- CRTpower(locations = 2000, cv_percent = 40, effect = 0.4, denominator = 2.5,
    alpha = 0.05, outcome_type = 'e', desiredPower = 0.8, yC = 0.35, c = 20, sd_h=5)
summary(examplePower2)
# Example with input geolocations
examplePower3 <- CRTpower(trial = readdata('example_site.csv'), desiredPower = 0.8,
    effect=0.4, yC=0.35, outcome_type = 'd', ICC = 0.05, c = 20)
summary(examplePower3)
# Example with input geolocations, randomisation, and spillover
example4 <- randomizeCRT(specify_clusters(trial = readdata('example_site.csv'), c = 20))
examplePower4 <- CRTpower(trial = example4, desiredPower = 0.8,
    effect=0.4, yC=0.35, outcome_type = 'd', ICC = 0.05, contaminate_pop_pr = 0.3)
summary(examplePower4)
}
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#> Locations and Clusters
#> ----------------------                                          -            
#> Coordinate system                      No coordinates in dataset            
#>                          
#> Not aggregated. Total records: 3000. Unique locations:                        
#> Available clusters (across both arms)                           40            
#>   Per cluster mean number of points                             75            
#>   Per cluster s.d. number of points                             5            
#>                             
#> No locations to randomize          -            
#> 
#> Specification of Requirements
#> -----------------------------          -            
#> Significance level (2-sided):              0.05            
#> Type of Outcome                             dichotomous            
#> Expected outcome in control arm:            0.35            
#> Required effect size:                       0.4            
#> Intra-cluster correlation:                  0.1            
#> 
#> Power calculations
#> ------------------                              -            
#> Design effect:                                   8.4            
#> Calculations ignoring spillover                        
#> Nominal power (%)                                84.5            
#> Total clusters required (power of 80%):          38            
#> Sufficient clusters for required power?          Yes            
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#> Locations and Clusters
#> ----------------------                                          -            
#> Coordinate system                      No coordinates in dataset            
#>                          
#> Not aggregated. Total records: 2000. Unique locations:                        
#> Available clusters (across both arms)                           40            
#>   Per cluster mean number of points                             50            
#>   Per cluster s.d. number of points                             5            
#>                             
#> No locations to randomize          -            
#> 
#> Specification of Requirements
#> -----------------------------          -            
#> Significance level (2-sided):              0.05            
#> Type of Outcome                             event rate            
#> Expected outcome in control arm:            0.35            
#> Mean rate multiplier:                       2.5            
#> Required effect size:                       0.4            
#> Coefficient of variation (%):               40            
#> 
#> Power calculations
#> ------------------                              -            
#> Design effect:                                   7.8            
#> Calculations ignoring spillover                        
#> Nominal power (%)                                93.3            
#> Total clusters required (power of 80%):          28            
#> Sufficient clusters for required power?          Yes            
#> *** Assuming all clusters are the same size ***
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#> 
#> Summary of coordinates
#> ----------------------
#>         Min.   : 1st Qu.: Median : Mean   : 3rd Qu.: Max.   :
#>       x -3.20    -1.31    -0.24     0.00     1.35     5.16   
#>       y -5.08    -2.84    -0.17     0.00     2.49     6.16   
#> 
#> Total area (within  0.2 km of a location) :  27.6 sq.km
#> Total area (convex hull) :  48.2 sq.km
#> 
#> Locations and Clusters
#> ----------------------                                          -            
#> Coordinate system                      (x, y)            
#>                          
#> Not aggregated. Total records: 3172. Unique locations:          1181            
#> Available clusters (across both arms)                           40            
#>   Per cluster mean number of points                             79.3            
#>   Per cluster s.d. number of points                             0            
#> No randomization          -            
#> 
#> Specification of Requirements
#> -----------------------------          -            
#> Significance level (2-sided):              0.05            
#> Type of Outcome                             dichotomous            
#> Expected outcome in control arm:            0.35            
#> Required effect size:                       0.4            
#> Intra-cluster correlation:                  0.05            
#> 
#> Power calculations
#> ------------------                              -            
#> Design effect:                                   4.9            
#> Calculations ignoring spillover                        
#> Nominal power (%)                                98            
#> Total clusters required (power of 80%):          20            
#> Sufficient clusters for required power?          Yes            
#> 
#> Other variables in dataset
#> --------------------------          RDT_test_result            
#> *** computed distance to nearest measurements in discordant arm ***
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#> 
#> Summary of coordinates
#> ----------------------
#>                Min.   : 1st Qu.: Median : Mean   : 3rd Qu.: Max.   :
#>       x        -3.20    -1.31    -0.24     0.00     1.35     5.16   
#>       y        -5.08    -2.84    -0.17     0.00     2.49     6.16   
#> nearestDiscord -1.15    -0.33     0.00    -0.00     0.33     1.55   
#> 
#> Total area (within  0.2 km of a location) :  27.6 sq.km
#> Total area (convex hull) :  48.2 sq.km
#> 
#> Locations and Clusters
#> ----------------------                                          -            
#> Coordinate system                      (x, y)            
#>                          
#> Not aggregated. Total records: 3172. Unique locations:          1181            
#> Available clusters (across both arms)                           40            
#>   Per cluster mean number of points                             79.3            
#>   Per cluster s.d. number of points                             4.4            
#> Cluster randomization:                      Independently randomized            
#> 
#> Specification of Requirements
#> -----------------------------          -            
#> Significance level (2-sided):              0.05            
#> Type of Outcome                             dichotomous            
#> Expected outcome in control arm:            0.35            
#> Required effect size:                       0.4            
#> Intra-cluster correlation:                  0.05            
#> 
#> Power calculations
#> ------------------                              -            
#> Design effect:                                   4.9            
#> Spillover affecting 30% of data,
#>   normal model gives bias estimate:          -0.041            
#> Nominal power (%)                                93.8            
#> Total clusters required (power of 80%):          28            
#> Sufficient clusters for required power?          Yes            
#> 
#> Other variables in dataset
#> --------------------------          RDT_test_result