# Power and sample size calculations for a cluster randomized trial


`CRTpower` carries out power and sample size calculations for cluster randomized trials.

## Usage

```
CRTpower(
  trial = NULL,
  locations = NULL,
  alpha = 0.05,
  desiredPower = 0.8,
  effect = NULL,
  yC = NULL,
  outcome_type = "d",
  sigma2 = NULL,
  denominator = 1,
  N = 1,
  ICC = NULL,
  cv_percent = NULL,
  c = NULL,
  sd_h = 0,
  spillover_interval = 0,
  contaminate_pop_pr = 0,
  distance_distribution = "normal"
)
```

## Arguments

- trial
dataframe or `'CRTsp'` object: optional list of locations

- locations
numeric: total number of units available for randomization (required if `trial` is not specified)

- alpha
numeric: significance level (two-sided)

- desiredPower
numeric: desired power

- effect
numeric: required effect size

- yC
numeric: baseline (control) value of outcome

- outcome_type
character: with options - `'y'`: continuous; `'n'`: count; `'e'`: event rate; `'p'`: proportion; `'d'`: dichotomous.

- sigma2
numeric: variance of the outcome (required for `outcome_type = 'y'`)

- denominator
numeric: rate multiplier (for `outcome_type = 'n'` or `outcome_type = 'e'`)

- N
numeric: mean of the denominator for proportions (for `outcome_type = 'p'`)

- ICC
numeric: intra-cluster correlation

- cv_percent
numeric: Coefficient of variation of the outcome (expressed as a percentage)

- c
integer: number of clusters in each arm (required if `trial` is not specified)

- sd_h
numeric: standard deviation of number of units per cluster (required if `trial` is not specified)

- spillover_interval
numeric: 95% spillover interval (km)

- contaminate_pop_pr
numeric: Proportion of the locations within the 95% spillover interval.

- distance_distribution
character: algorithm for computing distribution of spillover, with options - `'empirical'`: empirical distribution; `'normal'`: normal distribution.

## Value

A list of class `'CRTsp'` comprising the input data, cluster and arm assignments, trial description and results of power calculations.

## Details

Power and sample size calculations are for an unmatched two-arm trial. For counts or event rate data the formula of Hayes & Bennett, 1999 is used. This requires as an input the between-cluster coefficient of variation (`cv_percent`). For continuous outcomes and proportions the formulae of Hemming et al, 2011 are used. These make use of the intra-cluster correlation in the outcome (`ICC`) as an input. If the coefficient of variation and not the ICC is supplied then the intra-cluster correlation is computed from the coefficient of variation using the formulae from Hayes & Moulton. If incompatible values for `ICC` and `cv_percent` are supplied then the value of the `ICC` is used.
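The conversion from a coefficient of variation to an intra-cluster correlation can be illustrated for a proportion. The sketch below uses the relation ICC = k² p / (1 - p) given by Hayes & Moulton for binary outcomes, where k is the between-cluster coefficient of variation as a fraction and p is the expected proportion. The helper name `icc_from_cv` is hypothetical and is not a function of the package; whether `CRTpower` applies exactly this expression internally is an assumption.

```r
# Hypothetical helper (not part of the package): approximate ICC for a
# proportion from the between-cluster coefficient of variation.
icc_from_cv <- function(cv_percent, p) {
  k <- cv_percent / 100      # coefficient of variation as a fraction
  k^2 * p / (1 - p)          # ICC = k^2 * p / (1 - p)
}

# A coefficient of variation of 40% with a control proportion of 0.35
icc_from_cv(cv_percent = 40, p = 0.35)   # about 0.086
```

For a fixed coefficient of variation the implied ICC grows with the proportion, so the same `cv_percent` corresponds to different `ICC` values at different values of `yC`.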

If geolocations are not input then power and sample size calculations are based on the scalar input parameters.
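A calculation from scalar inputs alone can be sketched with a normal approximation for a dichotomous outcome. The function `power_binary` below is hypothetical and is not the package's implementation. It assumes that `effect` is a proportional reduction relative to `yC`, and that the design effect allows for variation in cluster size through `sd_h`. With the inputs of the first example on this page it gives values in line with the printed summary (design effect 8.4, nominal power 84.5%).

```r
# Illustrative sketch only: normal-approximation power for an unmatched
# two-arm trial with a dichotomous outcome. All names are hypothetical.
power_binary <- function(c, m, sd_h, yC, effect, ICC, alpha = 0.05) {
  p0 <- yC
  p1 <- yC * (1 - effect)        # assumption: effect is a proportional reduction
  cv_size <- sd_h / m            # coefficient of variation of cluster size
  # Design effect allowing for unequal cluster sizes
  deff <- 1 + ((1 + cv_size^2) * m - 1) * ICC
  # Standard error of the difference in proportions, c clusters of mean size m per arm
  se <- sqrt((p0 * (1 - p0) + p1 * (1 - p1)) * deff / (c * m))
  power <- pnorm(abs(p0 - p1) / se - qnorm(1 - alpha / 2))
  list(design_effect = deff, power = power)
}

# 3000 locations in 40 clusters gives a mean cluster size of 75
res <- power_binary(c = 20, m = 3000 / 40, sd_h = 5,
                    yC = 0.35, effect = 0.4, ICC = 0.10)
round(res$design_effect, 1)   # 8.4
round(100 * res$power, 1)     # approximately 84.5
```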

The calculations do not consider any loss in power due to loss to follow-up and by default there is no adjustment for effects of spillover.
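An unadjusted calculation of this kind for an event rate outcome can be sketched with the Hayes & Bennett formula. The functions below are hypothetical illustrations, not the package's code. They assume that `effect` is a proportional reduction in the rate and that the person-time per cluster `y` is the mean cluster size multiplied by `denominator`. Under those assumptions the inputs of the second example on this page give values in line with its printed summary (nominal power 93.3%, 28 clusters required).

```r
# Illustrative sketch only: Hayes & Bennett (1999) calculations for rates,
# ignoring loss to follow-up and spillover. All names are hypothetical.
hb_variance <- function(y, lambda0, lambda1, k) {
  # Poisson term plus between-cluster term, per cluster
  (lambda0 + lambda1) / y + k^2 * (lambda0^2 + lambda1^2)
}

power_rate <- function(c, y, lambda0, effect, cv_percent, alpha = 0.05) {
  lambda1 <- lambda0 * (1 - effect)   # assumption: proportional reduction
  v <- hb_variance(y, lambda0, lambda1, cv_percent / 100)
  pnorm(sqrt((c - 1) * (lambda0 - lambda1)^2 / v) - qnorm(1 - alpha / 2))
}

clusters_rate <- function(y, lambda0, effect, cv_percent,
                          alpha = 0.05, desiredPower = 0.8) {
  lambda1 <- lambda0 * (1 - effect)
  v <- hb_variance(y, lambda0, lambda1, cv_percent / 100)
  z <- qnorm(1 - alpha / 2) + qnorm(desiredPower)
  per_arm <- ceiling(1 + z^2 * v / (lambda0 - lambda1)^2)
  2 * per_arm                          # total across both arms
}

y <- (2000 / 40) * 2.5   # mean cluster size times rate multiplier
p_rate <- power_rate(c = 20, y = y, lambda0 = 0.35, effect = 0.4, cv_percent = 40)
n_rate <- clusters_rate(y = y, lambda0 = 0.35, effect = 0.4, cv_percent = 40)
round(100 * p_rate, 1)   # approximately 93.3
n_rate                   # 28
```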

Spillover bias can be allowed for using a diffusion model of mosquito movement. If no location or arm assignment information is available then `contaminate_pop_pr` is used to parameterize the model using a normal approximation for the distribution of distance to discordant locations.

If a trial data frame or `'CRTsp'` object is input then this is used to determine the number of locations. If this input object contains cluster assignments then the numbers and sizes of clusters in the input data are used to estimate the power. If `spillover_interval > 0` and `distance_distribution = 'empirical'` then effects of spillover are incorporated into the power calculations based on the empirical distribution of distances to the nearest discordant location. (If `distance_distribution ≠ 'empirical'` then the distribution of distances is assumed to be normal.)
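The empirical distribution of distances to the nearest discordant location can be illustrated in base R. The sketch below uses simulated coordinates and arm labels, not package data, and the sign convention (negative in the control arm, positive in the intervention arm) is an assumption made for the illustration.

```r
# Illustrative sketch only, using toy data: distance from each location to
# the nearest location in the opposite arm.
set.seed(1)
toy <- data.frame(
  x = runif(50, -3, 3),                              # hypothetical coordinates (km)
  y = runif(50, -3, 3),
  arm = rep(c("control", "intervention"), each = 25)
)

d <- as.matrix(dist(toy[, c("x", "y")]))             # pairwise Euclidean distances
discordant <- outer(toy$arm, toy$arm, "!=")          # TRUE where arms differ
d[!discordant] <- Inf                                # ignore same-arm pairs
nearest <- apply(d, 1, min)

# Assumed sign convention: negative in control, positive in intervention
toy$nearestDiscord <- ifelse(toy$arm == "control", -nearest, nearest)
summary(toy$nearestDiscord)
```

The quantiles of `nearestDiscord` show how much of the population lies close to the boundary between arms, which is the quantity that determines how strongly spillover affects the trial.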

If buffer zones have been specified in the `'CRTsp'` object then separate calculations are made for the core area and for the full site.

The output is an object of class `'CRTsp'` containing any input trial data frame and values for:

- The required numbers of clusters to achieve the specified power.
- The design effect based on the input ICC.
- Calculations of the power ignoring any bias caused by loss to follow-up etc.
- Calculations of `delta`, the expected spillover bias.

## Examples

```
{
# Power calculations for a binary outcome without input geolocations
examplePower1 <- CRTpower(locations = 3000, ICC = 0.10, effect = 0.4, alpha = 0.05,
    outcome_type = 'd', desiredPower = 0.8, yC = 0.35, c = 20, sd_h = 5)
summary(examplePower1)
# Power calculations for a rate outcome without input geolocations
examplePower2 <- CRTpower(locations = 2000, cv_percent = 40, effect = 0.4, denominator = 2.5,
    alpha = 0.05, outcome_type = 'e', desiredPower = 0.8, yC = 0.35, c = 20, sd_h = 5)
summary(examplePower2)
# Example with input geolocations
examplePower3 <- CRTpower(trial = readdata('example_site.csv'), desiredPower = 0.8,
    effect = 0.4, yC = 0.35, outcome_type = 'd', ICC = 0.05, c = 20)
summary(examplePower3)
# Example with input geolocations, randomisation, and spillover
example4 <- randomizeCRT(specify_clusters(trial = readdata('example_site.csv'), c = 20))
examplePower4 <- CRTpower(trial = example4, desiredPower = 0.8,
    effect = 0.4, yC = 0.35, outcome_type = 'd', ICC = 0.05, contaminate_pop_pr = 0.3)
summary(examplePower4)
}
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#> Locations and Clusters
#> ---------------------- -
#> Coordinate system No coordinates in dataset
#>
#> Not aggregated. Total records: 3000. Unique locations:
#> Available clusters (across both arms) 40
#> Per cluster mean number of points 75
#> Per cluster s.d. number of points 5
#>
#> No locations to randomize -
#>
#> Specification of Requirements
#> ----------------------------- -
#> Significance level (2-sided): 0.05
#> Type of Outcome dichotomous
#> Expected outcome in control arm: 0.35
#> Required effect size: 0.4
#> Intra-cluster correlation: 0.1
#>
#> Power calculations
#> ------------------ -
#> Design effect: 8.4
#> Calculations ignoring spillover
#> Nominal power (%) 84.5
#> Total clusters required (power of 80%): 38
#> Sufficient clusters for required power? Yes
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#> Locations and Clusters
#> ---------------------- -
#> Coordinate system No coordinates in dataset
#>
#> Not aggregated. Total records: 2000. Unique locations:
#> Available clusters (across both arms) 40
#> Per cluster mean number of points 50
#> Per cluster s.d. number of points 5
#>
#> No locations to randomize -
#>
#> Specification of Requirements
#> ----------------------------- -
#> Significance level (2-sided): 0.05
#> Type of Outcome event rate
#> Expected outcome in control arm: 0.35
#> Mean rate multiplier: 2.5
#> Required effect size: 0.4
#> Coefficient of variation (%): 40
#>
#> Power calculations
#> ------------------ -
#> Design effect: 7.8
#> Calculations ignoring spillover
#> Nominal power (%) 93.3
#> Total clusters required (power of 80%): 28
#> Sufficient clusters for required power? Yes
#> *** Assuming all clusters are the same size ***
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#>
#> Summary of coordinates
#> ----------------------
#> Min. : 1st Qu.: Median : Mean : 3rd Qu.: Max. :
#> x -3.20 -1.31 -0.24 0.00 1.35 5.16
#> y -5.08 -2.84 -0.17 0.00 2.49 6.16
#>
#> Total area (within 0.2 km of a location) : 27.6 sq.km
#> Total area (convex hull) : 48.2 sq.km
#>
#> Locations and Clusters
#> ---------------------- -
#> Coordinate system (x, y)
#>
#> Not aggregated. Total records: 3172. Unique locations: 1181
#> Available clusters (across both arms) 40
#> Per cluster mean number of points 79.3
#> Per cluster s.d. number of points 0
#> row 8
#> No randomization -
#>
#> Specification of Requirements
#> ----------------------------- -
#> Significance level (2-sided): 0.05
#> Type of Outcome dichotomous
#> Expected outcome in control arm: 0.35
#> Required effect size: 0.4
#> Intra-cluster correlation: 0.05
#>
#> Power calculations
#> ------------------ -
#> Design effect: 4.9
#> Calculations ignoring spillover
#> Nominal power (%) 98
#> Total clusters required (power of 80%): 20
#> Sufficient clusters for required power? Yes
#>
#> Other variables in dataset
#> -------------------------- RDT_test_result
#> ===============================CLUSTER RANDOMISED TRIAL ===========================
#>
#> Summary of coordinates
#> ----------------------
#> Min. : 1st Qu.: Median : Mean : 3rd Qu.: Max. :
#> x -3.20 -1.31 -0.24 0.00 1.35 5.16
#> y -5.08 -2.84 -0.17 0.00 2.49 6.16
#> nearestDiscord -1.15 -0.33 0.00 -0.00 0.33 1.55
#>
#> Total area (within 0.2 km of a location) : 27.6 sq.km
#> Total area (convex hull) : 48.2 sq.km
#>
#> Locations and Clusters
#> ---------------------- -
#> Coordinate system (x, y)
#>
#> Not aggregated. Total records: 3172. Unique locations: 1181
#> Available clusters (across both arms) 40
#> Per cluster mean number of points 79.3
#> Per cluster s.d. number of points 4.4
#> Cluster randomization: Independently randomized
#>
#> Specification of Requirements
#> ----------------------------- -
#> Significance level (2-sided): 0.05
#> Type of Outcome dichotomous
#> Expected outcome in control arm: 0.35
#> Required effect size: 0.4
#> Intra-cluster correlation: 0.05
#>
#> Power calculations
#> ------------------ -
#> Design effect: 4.9
#> Spillover affecting 30% of data,
#> normal model gives bias estimate: -0.041
#> Nominal power (%) 93.8
#> Total clusters required (power of 80%): 28
#> Sufficient clusters for required power? Yes
#>
#> Other variables in dataset
#> -------------------------- RDT_test_result
```