---
title: "Two Binary Co-Primary Endpoints (Exact Methods)"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Two Binary Co-Primary Endpoints (Exact Methods)}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

## Overview

This vignette demonstrates exact sample size calculation and power analysis for clinical trials with two co-primary binary endpoints. The methodology is based on Homma and Yoshida (2025), which provides exact inference methods using the bivariate binomial distribution.

```{r setup, message=FALSE, warning=FALSE}
library(twoCoprimary)
library(dplyr)
library(tidyr)
library(knitr)
```

## Background

### When to Use Exact Methods

Exact methods are recommended when:

- **Small to medium sample sizes** ($N < 200$) are involved
- **Extreme probabilities** ($p < 0.10$ or $p > 0.90$) are expected
- **Strict Type I error control** is required
- **Regulatory requirements** call for exact inference

Asymptotic methods may not maintain the nominal Type I error rate in these situations.

### Advantages of Exact Methods

1. **Accurate Type I error control**: Exact tests guarantee that the actual Type I error rate does not exceed the nominal level
2. **Better small-sample performance**: No reliance on asymptotic approximations
3. **Validity for extreme probabilities**: No restrictions on $p$ values
4. **Regulatory acceptance**: Often preferred by regulatory agencies

### Disadvantages

1. **Computational intensity**: Requires enumeration of all possible outcomes
2. **Conservatism**: The discreteness of the outcome space can make the tests conservative
3. **Implementation complexity**: More complex than asymptotic methods

## Statistical Framework

### Model and Assumptions

Consider a two-arm parallel-group superiority trial comparing treatment (group 1) with control (group 2). Let $n_{1}$ and $n_{2}$ denote the sample sizes in groups 1 and 2, respectively.
For patient $i$ in group $j$ ($j = 1$: treatment, $j = 2$: control), we observe two binary outcomes.

**Endpoint $k$** ($k = 1, 2$):

$$X_{i,j,k} \in \{0, 1\}$$

where $X_{i,j,k} = 1$ if patient $i$ in group $j$ is a responder for endpoint $k$, and 0 otherwise.

**True response probabilities**:

$$p_{j,k} = \text{P}(X_{i,j,k} = 1)$$

where $0 < p_{j,k} < 1$ for each $j$ and $k$.

### Joint Distribution of Binary Outcomes

The paired binary outcomes $(X_{i,j,1}, X_{i,j,2})$ for patient $i$ in group $j$ follow a multinomial distribution with four possible outcomes.

**Per-trial probabilities**:

- $p_{j}^{(1,1)} = \phi_{j}$: Both endpoints successful
- $p_{j}^{(1,0)} = p_{j,1} - \phi_{j}$: Only endpoint 1 successful
- $p_{j}^{(0,1)} = p_{j,2} - \phi_{j}$: Only endpoint 2 successful
- $p_{j}^{(0,0)} = 1 - p_{j,1} - p_{j,2} + \phi_{j}$: Both endpoints unsuccessful

where $\phi_{j} = \text{P}(X_{i,j,1} = 1, X_{i,j,2} = 1)$.

Let $Z_{j}^{(\ell,m)}$ denote the number of times $\{(X_{i,j,1}, X_{i,j,2}) : i = 1, \ldots, n_{j}\}$ takes the value $(\ell, m)$ for $\ell, m \in \{0, 1\}$. Then:

$$(Z_{j}^{(0,0)}, Z_{j}^{(1,0)}, Z_{j}^{(0,1)}, Z_{j}^{(1,1)}) \sim \text{Multinomial}(n_{j}; p_{j}^{(0,0)}, p_{j}^{(1,0)}, p_{j}^{(0,1)}, p_{j}^{(1,1)})$$

### Number of Responders

Let $Y_{j,k} = \sum_{i=1}^{n_{j}} X_{i,j,k}$ denote the number of responders in group $j$ for endpoint $k$. Then:

- $Y_{j,1} = Z_{j}^{(1,1)} + Z_{j}^{(1,0)}$
- $Y_{j,2} = Z_{j}^{(1,1)} + Z_{j}^{(0,1)}$

### Bivariate Binomial Distribution

Following Homma and Yoshida (2025), the joint distribution of $(Y_{j,1}, Y_{j,2})$ can be expressed as a **bivariate binomial distribution**:

$$(Y_{j,1}, Y_{j,2}) \sim \text{BiBin}(n_{j}, p_{j,1}, p_{j,2}, \gamma_{j})$$

where $\gamma_{j}$ is a dependence parameter related to the correlation $\rho_{j}$ between $X_{i,j,1}$ and $X_{i,j,2}$.
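The four per-trial probabilities above are fully determined by the two marginals and the joint success probability $\phi_{j}$. As a minimal base-R sketch (illustrative only, not part of the `twoCoprimary` API), we can recover the cell probabilities for one group from $(p_{1}, p_{2}, \rho)$ by inverting the correlation definition, $\phi = \rho\sqrt{p_1(1-p_1)p_2(1-p_2)} + p_1 p_2$:

```r
# Base-R sketch: multinomial cell probabilities for one group, given the
# marginal response probabilities (p1, p2) and the correlation rho.
cell_probs <- function(p1, p2, rho) {
  # phi = P(X1 = 1, X2 = 1), recovered from the correlation definition
  phi <- rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2)) + p1 * p2
  c(both    = phi,
    only1   = p1 - phi,
    only2   = p2 - phi,
    neither = 1 - p1 - p2 + phi)
}

pr <- cell_probs(p1 = 0.54, p2 = 0.54, rho = 0.5)
# All four cells are nonnegative and sum to 1 exactly when rho lies inside
# the admissible correlation bounds discussed later in this vignette
stopifnot(all(pr >= 0), abs(sum(pr) - 1) < 1e-12)
round(pr, 4)
```

Checking that all four cells are nonnegative is a quick way to verify that a planned $(p_{1}, p_{2}, \rho)$ combination is feasible before running a sample size calculation.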
**Probability mass function** (Equation 3 in Homma and Yoshida, 2025):

$$\text{P}(Y_{j,1} = y_{j,1}, Y_{j,2} = y_{j,2} \mid n_{j}, p_{j,1}, p_{j,2}, \gamma_{j}) = f(y_{j,1} \mid n_{j}, p_{j,1}) \times g(y_{j,2} \mid y_{j,1}, n_{j}, p_{j,1}, p_{j,2}, \gamma_{j})$$

For further details, see Homma and Yoshida (2025).

### Correlation Structure

The **correlation** $\rho_{j}$ between $X_{i,j,1}$ and $X_{i,j,2}$ is:

$$\rho_{j} = \text{Cor}(X_{i,j,1}, X_{i,j,2}) = \frac{\phi_{j} - p_{j,1} p_{j,2}}{\sqrt{p_{j,1}(1 - p_{j,1}) p_{j,2}(1 - p_{j,2})}}$$

The dependence parameter $\gamma_{j}$ is related to $\rho_{j}$ through (Equation 4 in Homma and Yoshida, 2025):

$$\gamma_{j} = \gamma(\rho_{j}, p_{j,1}, p_{j,2}) = \rho_{j} \sqrt{\frac{p_{j,2}(1 - p_{j,2})}{p_{j,1}(1 - p_{j,1})}} \left(1 - \rho_{j} \sqrt{\frac{p_{j,2}(1 - p_{j,2})}{p_{j,1}(1 - p_{j,1})}}\right)^{-1}$$

**Important property**: The correlation between $Y_{j,1}$ and $Y_{j,2}$ equals $\rho_{j}$, the same as the correlation between $X_{i,j,1}$ and $X_{i,j,2}$.

**Marginal distributions**:

$$Y_{j,k} \sim \text{Bin}(n_{j}, p_{j,k})$$

**Correlation bounds**: Because $0 < p_{j,k} < 1$, the correlation $\rho_{j}$ is bounded:

$$\rho_{j} \in [L(p_{j,1}, p_{j,2}), U(p_{j,1}, p_{j,2})] \subseteq [-1, 1]$$

where:

$$L(p_{j,1}, p_{j,2}) = \max\left\{-\sqrt{\frac{p_{j,1} p_{j,2}}{(1 - p_{j,1})(1 - p_{j,2})}}, -\sqrt{\frac{(1 - p_{j,1})(1 - p_{j,2})}{p_{j,1} p_{j,2}}}\right\}$$

$$U(p_{j,1}, p_{j,2}) = \min\left\{\sqrt{\frac{p_{j,1}(1 - p_{j,2})}{p_{j,2}(1 - p_{j,1})}}, \sqrt{\frac{p_{j,2}(1 - p_{j,1})}{p_{j,1}(1 - p_{j,2})}}\right\}$$

**Special cases**:

- If $p_{j,1} = p_{j,2}$, then $U(p_{j,1}, p_{j,2}) = 1$
- If $p_{j,1} + p_{j,2} = 1$, then $L(p_{j,1}, p_{j,2}) = -1$

## Hypothesis Testing

### Superiority Hypotheses

Since higher values of both endpoints indicate treatment benefit, we test:

**For endpoint 1**:

$$\text{H}_{0}^{(1)}: p_{1,1} \leq p_{2,1} \text{ vs. 
} \text{H}_{1}^{(1)}: p_{1,1} > p_{2,1}$$

**For endpoint 2**:

$$\text{H}_{0}^{(2)}: p_{1,2} \leq p_{2,2} \text{ vs. } \text{H}_{1}^{(2)}: p_{1,2} > p_{2,2}$$

### Co-Primary Endpoints (Intersection-Union Test)

The trial succeeds only if superiority is demonstrated for **both** endpoints simultaneously:

**Null hypothesis**: $\text{H}_{0} = \text{H}_{0}^{(1)} \cup \text{H}_{0}^{(2)}$ (at least one null is true)

**Alternative hypothesis**: $\text{H}_{1} = \text{H}_{1}^{(1)} \cap \text{H}_{1}^{(2)}$ (both alternatives are true)

**Decision rule**: Reject $\text{H}_{0}$ at level $\alpha$ if and only if **both** $\text{H}_{0}^{(1)}$ and $\text{H}_{0}^{(2)}$ are rejected at level $\alpha$, without multiplicity adjustment.

## Statistical Tests

Homma and Yoshida (2025) consider five exact test methods.

### Method 1: One-sided Pearson Chi-squared Test (Chisq)

For endpoint $k$, the test statistic is:

$$Z(y_{1,k}, y_{2,k}) = \frac{\hat{p}_{1,k} - \hat{p}_{2,k}}{\sqrt{\hat{p}_{k}(1 - \hat{p}_{k})\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)}}$$

where:

- $\hat{p}_{j,k} = y_{j,k} / n_{j}$ is the sample proportion
- $\hat{p}_{k} = \frac{n_{1} \hat{p}_{1,k} + n_{2} \hat{p}_{2,k}}{n_{1} + n_{2}}$ is the pooled proportion

Reject $\text{H}_{0}^{(k)}$ if $Z(y_{1,k}, y_{2,k}) > z_{1-\alpha}$, where $z_{1-\alpha}$ is the $(1-\alpha)$-quantile of the standard normal distribution.

### Method 2: Fisher's Exact Test (Fisher)

**Conditional test**: Conditions on the total number of successes $y_{k} = y_{1,k} + y_{2,k}$. Under $\text{H}_{0}^{(k)}$, $Y_{1,k}$ follows a hypergeometric distribution given $Y_{1,k} + Y_{2,k} = y_{k}$.

**One-sided p-value**:

$$p_{k}^{\text{Fisher}} = \sum_{y=y_{1,k}}^{\min(n_{1}, y_{k})} \frac{\binom{n_{1}}{y} \binom{n_{2}}{y_{k} - y}}{\binom{n_{1} + n_{2}}{y_{k}}}$$

Reject $\text{H}_{0}^{(k)}$ if $p_{k}^{\text{Fisher}} < \alpha$.
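The one-sided Fisher p-value above is simply an upper tail of the hypergeometric distribution, so it can be computed directly in base R. The following sketch (illustrative, not the package implementation) agrees with `fisher.test(..., alternative = "greater")`:

```r
# One-sided Fisher p-value for one endpoint:
# P(Y1 >= y1 | Y1 + Y2 = y1 + y2) under the hypergeometric null
fisher_p_one_sided <- function(y1, n1, y2, n2) {
  yk <- y1 + y2  # conditioning total number of successes
  phyper(y1 - 1, m = n1, n = n2, k = yk, lower.tail = FALSE)
}

p_manual  <- fisher_p_one_sided(y1 = 30, n1 = 50, y2 = 18, n2 = 50)
p_builtin <- fisher.test(matrix(c(30, 20, 18, 32), nrow = 2, byrow = TRUE),
                         alternative = "greater")$p.value
stopifnot(abs(p_manual - p_builtin) < 1e-10)
```

Note the `y1 - 1` in `phyper(..., lower.tail = FALSE)`: R's upper tail is strict ($P(X > q)$), so subtracting one from the observed count includes the observed table in the p-value, as the summation in the formula requires.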
### Method 3: Fisher's Mid-P Test (Fisher-midP)

Reduces the conservatism of Fisher's exact test by subtracting half the probability of the observed outcome:

$$p_{k}^{\text{mid-p}} = p_{k}^{\text{Fisher}} - \frac{1}{2} \times \frac{\binom{n_{1}}{y_{1,k}} \binom{n_{2}}{y_{k} - y_{1,k}}}{\binom{n_{1} + n_{2}}{y_{k}}}$$

Note: The `twoCoprimary` package implements Fisher's mid-p test, but Homma and Yoshida (2025) did not investigate it.

### Method 4: Z-pooled Exact Unconditional Test (Z-pool)

**Unconditional test**: Maximizes the p-value over all possible values of the nuisance parameter (the common success probability $p_{k}$ under $\text{H}_{0}$). Uses the $Z$-test statistic and takes the maximum $p$-value across all possible values of $p_{k}$.

### Method 5: Boschloo's Exact Unconditional Test (Boschloo)

Similar to Z-pooled, but based on Fisher's exact $p$-values: Fisher's exact $p$-value is maximized over the nuisance parameter space. It is the **most powerful** of the exact unconditional tests, but also computationally intensive.
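Methods 3 and 4 can be sketched in a few lines of base R (illustrative only; the package's own routines are optimized versions of these ideas). For the Z-pooled test, the maximization over the nuisance parameter is approximated here by a grid search, a common practical shortcut:

```r
# Fisher's mid-p: the full Fisher tail minus half the observed table's
# point probability
fisher_midp <- function(y1, n1, y2, n2) {
  yk <- y1 + y2
  phyper(y1 - 1, n1, n2, yk, lower.tail = FALSE) -
    0.5 * dhyper(y1, n1, n2, yk)
}

# Z-pooled exact unconditional p-value, maximizing over a grid of the
# nuisance common success probability p (grid search approximates the
# supremum over 0 < p < 1)
zpool_p <- function(y1, n1, y2, n2,
                    grid = seq(1e-4, 1 - 1e-4, length.out = 200)) {
  zstat <- function(a, b) {
    ph <- (a + b) / (n1 + n2)
    se <- sqrt(ph * (1 - ph) * (1 / n1 + 1 / n2))
    ifelse(se == 0, 0, (a / n1 - b / n2) / se)
  }
  tab <- expand.grid(a = 0:n1, b = 0:n2)      # all possible outcomes
  extreme <- zstat(tab$a, tab$b) >= zstat(y1, y2)  # at least as extreme
  max(vapply(grid, function(p) {
    sum(dbinom(tab$a[extreme], n1, p) * dbinom(tab$b[extreme], n2, p))
  }, numeric(1)))
}

fisher_midp(30, 50, 18, 50)  # smaller than the ordinary Fisher p-value
zpool_p(30, 50, 18, 50)
```

Boschloo's test follows the same unconditional scheme as `zpool_p`, with the Fisher p-value taking the place of the $Z$ statistic as the ordering criterion.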
## Exact Power Calculation

### Power Formula

The exact power for test method $A$ is (Equation 9 in Homma and Yoshida, 2025):

$$\text{power}_{A}(\boldsymbol{\theta}) = \text{P}\left[\bigcap_{k=1}^{2} \{p_{A}(y_{1,k}, y_{2,k}) < \alpha\} \mid \text{H}_{1}\right]$$

$$= \sum_{(a_{1,1}, a_{2,1}) \in \mathcal{A}_{1}} \sum_{(a_{1,2}, a_{2,2}) \in \mathcal{A}_{2}} f(a_{1,1} \mid n_{1}, p_{1,1}) \times f(a_{2,1} \mid n_{2}, p_{2,1}) \times g(a_{1,2} \mid a_{1,1}, n_{1}, p_{1,1}, p_{1,2}, \gamma_{1}) \times g(a_{2,2} \mid a_{2,1}, n_{2}, p_{2,1}, p_{2,2}, \gamma_{2})$$

where:

- $\boldsymbol{\theta} = (p_{1,1}, p_{2,1}, p_{1,2}, p_{2,2}, n_{1}, n_{2}, \gamma_{1}, \gamma_{2})$ is the parameter vector
- $\mathcal{A}_{k} = \{(y_{1,k}, y_{2,k}) : p_{A}(y_{1,k}, y_{2,k}) < \alpha\}$ is the rejection region for endpoint $k$

### Sample Size Calculation

The required sample size $n_{2}$ to achieve target power $1 - \beta$ is (Equation 10 in Homma and Yoshida, 2025):

$$n_{2} = \arg\min_{n_{2} \in \mathbb{Z}} \{\text{power}_{A}(\boldsymbol{\theta}) \geq 1 - \beta\}$$

This has no closed-form solution because of:

1. The discreteness of binary outcomes
2. The non-monotonic "sawtooth" shape of the power curve

**Algorithm**: A sequential search starting from the asymptotic normal approximation (AN method) as the initial value.

## Replicating Homma and Yoshida (2025) Table 4

Table 4 of Homma and Yoshida (2025) reports sample sizes for various correlations using the Chisq, Fisher, Z-pool, and Boschloo tests. Note that the following sample code computes only the scenario with $\alpha = 0.025$. The notation used in the function is: `p11` = $p_{1,1}$, `p12` = $p_{1,2}$, `p21` = $p_{2,1}$, `p22` = $p_{2,2}$, where the first subscript denotes the group (1 = treatment, 2 = control) and the second subscript denotes the endpoint (1 or 2).
```{r table4_homma}
# Recreate Homma and Yoshida (2025) Table 4
param_grid_bin_exact_ss <- tibble(
  p11 = 0.54, p12 = 0.54,
  p21 = 0.25, p22 = 0.25
)

result_bin_exact_ss <- do.call(
  bind_rows,
  lapply(c("Chisq", "Fisher", "Z-pool", "Boschloo"), function(test) {
    do.call(
      bind_rows,
      lapply(1:2, function(r) {
        design_table(
          param_grid = param_grid_bin_exact_ss,
          rho_values = c(0, 0.3, 0.5, 0.8),
          r = r,
          alpha = 0.025,
          beta = 0.1,
          endpoint_type = "binary",
          Test = test
        ) %>%
          mutate(alpha = 0.025, r = r, Test = test)
      })
    )
  })
) %>%
  pivot_longer(
    cols = starts_with("rho_"),
    names_to = "rho",
    values_to = "N",
    names_transform = list(rho = readr::parse_number)
  ) %>%
  select(r, rho, Test, N) %>%
  pivot_wider(names_from = Test, values_from = N) %>%
  as.data.frame()

kable(result_bin_exact_ss,
      caption = "Table 4: Total Sample Size (N) for Two Co-Primary Binary Endpoints (α = 0.025, 1-β = 0.90)^a,b^",
      digits = 1,
      col.names = c("r", "ρ", "Chisq", "Fisher", "Z-pool", "Boschloo"))
```

^a^ Chisq denotes the one-sided Pearson chi-squared test, Fisher denotes Fisher's exact test, Z-pool denotes the Z-pooled exact unconditional test, and Boschloo denotes Boschloo's exact unconditional test.

^b^ The required sample sizes were obtained by assuming that $p_{1,1} = p_{1,2} = 0.54$ and $p_{2,1} = p_{2,2} = 0.25$.
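As an independent cross-check of the exact power formula (Equation 9), the special case $\rho_{1} = \rho_{2} = 0$ is easy to compute in base R: for binary outcomes, zero correlation implies independence, so the bivariate binomial factorizes and the co-primary power is the product of the two marginal exact powers. The sketch below (illustrative, not the package implementation) enumerates the rejection region of the one-sided chi-squared test for a single endpoint; here `p1` and `p2` are the treatment and control probabilities for that endpoint:

```r
# Exact power of the one-sided chi-squared test for one binary endpoint,
# by full enumeration of all (y1, y2) outcomes
exact_power_chisq <- function(n1, n2, p1, p2, alpha = 0.025) {
  crit <- qnorm(1 - alpha)
  tab <- expand.grid(y1 = 0:n1, y2 = 0:n2)
  ph <- (tab$y1 + tab$y2) / (n1 + n2)                # pooled proportion
  se <- sqrt(ph * (1 - ph) * (1 / n1 + 1 / n2))
  z  <- ifelse(se == 0, 0, (tab$y1 / n1 - tab$y2 / n2) / se)
  rej <- z > crit                                    # rejection region A_k
  sum(dbinom(tab$y1[rej], n1, p1) * dbinom(tab$y2[rej], n2, p2))
}

# Table 4's scenario has identical marginals for both endpoints, so under
# rho1 = rho2 = 0 the co-primary power is the square of the marginal power
pw <- exact_power_chisq(n1 = 80, n2 = 80, p1 = 0.54, p2 = 0.25)
pw_coprimary <- pw^2
```

With nonzero correlation the product formula no longer applies, and the bivariate mass function $f \times g$ from Equation 9 (as implemented in `power2BinaryExact`) is needed.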
## Practical Examples

### Example 1: Basic Exact Power Calculation

```{r example1}
# Calculate exact power using Fisher's exact test
result_fisher <- power2BinaryExact(
  n1 = 50, n2 = 50,
  p11 = 0.70, p12 = 0.65,
  p21 = 0.50, p22 = 0.45,
  rho1 = 0.5, rho2 = 0.5,
  alpha = 0.025,
  Test = "Fisher"
)

print(result_fisher)
```

**Interpretation**:

- `power1`: Power for endpoint 1 alone
- `power2`: Power for endpoint 2 alone
- `powerCoprimary`: Exact power for both co-primary endpoints

### Example 2: Sample Size Calculation

```{r example2}
# Calculate required sample size using Boschloo's test
result_ss <- ss2BinaryExact(
  p11 = 0.70, p12 = 0.65,
  p21 = 0.50, p22 = 0.45,
  rho1 = 0.5, rho2 = 0.5,
  r = 1,
  alpha = 0.025, beta = 0.2,
  Test = "Boschloo"
)

print(result_ss)
```

### Example 3: Comparison of Test Methods

```{r example3}
# Compare different exact test methods
test_methods <- c("Chisq", "Fisher", "Fisher-midP", "Z-pool", "Boschloo")

comparison <- lapply(test_methods, function(test) {
  result <- ss2BinaryExact(
    p11 = 0.50, p12 = 0.40,
    p21 = 0.20, p22 = 0.10,
    rho1 = 0.7, rho2 = 0.6,
    r = 1,
    alpha = 0.025, beta = 0.2,
    Test = test
  )
  data.frame(
    Test = test,
    n2 = result$n2,
    N = result$N
  )
})

comparison_table <- bind_rows(comparison)

kable(comparison_table,
      caption = "Sample Size Comparison Across Test Methods",
      col.names = c("Test Method", "n per group", "N total"))
```

## Impact of Correlation

### Example 4: Correlation Effect

```{r example4}
# Calculate sample size for different correlation values
rho_values <- c(0, 0.3, 0.5, 0.8)

correlation_effect <- lapply(rho_values, function(rho) {
  result <- ss2BinaryExact(
    p11 = 0.70, p12 = 0.60,
    p21 = 0.40, p22 = 0.30,
    rho1 = rho, rho2 = rho,
    r = 1,
    alpha = 0.025, beta = 0.2,
    Test = "Fisher"
  )
  data.frame(
    rho = rho,
    n2 = result$n2,
    N = result$N
  )
})

rho_table <- bind_rows(correlation_effect)

kable(rho_table,
      caption = "Impact of Correlation on Sample Size (Fisher's Test)",
      col.names = c("ρ", "n per group", "N total"))
```

**Key finding**: 
Higher positive correlation reduces the required sample size.

## Comparison: Exact vs Asymptotic

### Example 5: Exact vs AN Method

```{r example5}
# Exact method (Chisq)
exact_result <- ss2BinaryExact(
  p11 = 0.60, p12 = 0.40,
  p21 = 0.30, p22 = 0.10,
  rho1 = 0.5, rho2 = 0.5,
  r = 1,
  alpha = 0.025, beta = 0.1,
  Test = "Chisq"
)

# Asymptotic method (AN)
asymp_result <- ss2BinaryApprox(
  p11 = 0.60, p12 = 0.40,
  p21 = 0.30, p22 = 0.10,
  rho1 = 0.5, rho2 = 0.5,
  r = 1,
  alpha = 0.025, beta = 0.1,
  Test = "AN"
)

comparison_exact_asymp <- data.frame(
  Method = c("Exact (Chisq)", "Asymptotic (AN)"),
  n_per_group = c(exact_result$n2, asymp_result$n2),
  N_total = c(exact_result$N, asymp_result$N),
  Difference = c(0, asymp_result$N - exact_result$N)
)

kable(comparison_exact_asymp,
      caption = "Comparison: Exact vs Asymptotic Methods",
      col.names = c("Method", "n per group", "N total", "Difference"))
```

## Practical Recommendations

### Test Method Selection

1. **Fisher's exact test**:
   - Most widely used and accepted
   - Conservative, but guarantees Type I error control
   - Recommended for regulatory submissions

2. **Boschloo's test**:
   - Most powerful among the exact tests
   - Best choice when computational resources permit
   - Recommended for the final analysis

3. **Chi-squared test**:
   - Less conservative than Fisher's exact test
   - May be anti-conservative for small samples
   - Use with caution for $N < 200$

4. **Z-pooled and Fisher-midP**:
   - Intermediate between Fisher's exact test and the chi-squared test
   - Reduce conservatism while maintaining validity

### When to Use Each Method

**Sample size guidelines**:

1. **$N < 100$**: Always use exact methods
2. **$100 \leq N < 200$**: Exact methods preferred, especially with:
   - Extreme probabilities ($p < 0.1$ or $p > 0.9$)
   - Strict Type I error control requirements
3. 
**$N \geq 200$ and $0.1 < p < 0.9$**: Asymptotic methods acceptable

### Correlation Estimation

- Use pilot data or historical information
- Be conservative if uncertain (use $\rho = 0$)
- Consider a sensitivity analysis across the plausible range

### Allocation Ratio

- A balanced design ($r = 1$) is generally the most efficient
- Unbalanced designs may be justified by:
  - Limited control group availability
  - Ethical considerations
  - Cost constraints

## Computational Considerations

Modern computers handle all of these methods efficiently for typical clinical trial sample sizes ($N < 300$).

### Software Implementation

The `twoCoprimary` package implements all methods efficiently using:

- The bivariate binomial distribution (`dbibinom`)
- Rejection region calculation (`rr1Binary`)
- Vectorized computations for speed

## References

Homma, G., & Yoshida, T. (2025). Exact power and sample size in clinical trials with two co-primary binary endpoints. *Statistical Methods in Medical Research*, 34(1), 1-19.