Type: | Package |
Title: | Statistical Inference via Lancaster Correlation |
Version: | 0.1.3 |
Maintainer: | Bernhard Klar <bernhard.klar@kit.edu> |
Description: | Implementation of the methods described in Holzmann, Klar (2024) <doi:10.1111/sjos.12733>. Lancaster correlation is a correlation coefficient which equals the absolute value of the Pearson correlation for the bivariate normal distribution, and is equal to or slightly less than the maximum correlation coefficient for a variety of bivariate distributions. Rank and moment-based estimators and corresponding confidence intervals are implemented, as well as independence tests based on these statistics. |
Imports: | arrangements, boot, graphics, sn, stats |
License: | GPL-2 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-08-22 07:21:55 UTC; Klar |
Author: | Bernhard Klar |
Repository: | CRAN |
Date/Publication: | 2025-08-22 07:50:10 UTC |
Covariance matrix of components of Lancaster correlation coefficient
Description
Estimate of the covariance matrix of the two components of Lancaster correlation. Lancaster correlation is a bivariate measure of dependence.
Usage
Sigma.est(xx)
Arguments
xx |
a matrix or data frame with two columns. |
Details
For more details see the Appendix in Holzmann, Klar (2024).
Value
the estimated covariance matrix.
Author(s)
Hajo Holzmann, Bernhard Klar
References
Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733
See Also
Examples
Sigma <- matrix(c(1,0.1,0.1,1), ncol=2)
R <- chol(Sigma)
n <- 1000
x <- matrix(rnorm(n*2), n)
nu <- 8
y <- x / sqrt(rchisq(n, nu)/nu) #multivariate t
Sigma.est(y)
Lancaster correlation
Description
Computes the Lancaster correlation coefficient.
Usage
lcor(x, y = NULL, type = c("rank", "linear"))
Arguments
x |
a numeric vector, or a matrix or data frame with two columns. |
y |
NULL (default) or a vector with same length as x. |
type |
a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated. |
Details
Let F_X and F_Y be the distribution functions of X and Y, and define
X^* = \Phi^{-1}(F_X(X)), \quad Y^* = \Phi^{-1}(F_Y(Y)),
where \Phi^{-1} is the standard normal quantile function. Furthermore, for X and Y with finite fourth moments, let
\tilde{X} = (X - \mathbb{E}(X)) / \operatorname{sd}(X), \quad \tilde{Y} = (Y - \mathbb{E}(Y)) / \operatorname{sd}(Y).
Then
\rho_L(X,Y) = \max\{|\operatorname{Cor}_{\text{Pearson}}(X^*,Y^*)|,\; |\operatorname{Cor}_{\text{Pearson}}((X^*)^2,(Y^*)^2)|\}
and
\rho_{L,1}(X,Y) = \max\{|\operatorname{Cor}_{\text{Pearson}}(X,Y)|,\; |\operatorname{Cor}_{\text{Pearson}}((\tilde{X})^2,(\tilde{Y})^2)|\}
are called the Lancaster correlation coefficient and the linear Lancaster correlation coefficient, respectively. Two estimation methods are supported:
- Linear estimator for \rho_{L,1} (type = "linear"): Consider \rho_{L1} = \operatorname{Cor}_{\text{Pearson}}(X,Y) and \rho_{L2} = \operatorname{Cor}_{\text{Pearson}}((\tilde{X})^2,(\tilde{Y})^2). Let \hat\rho_{L1} be the sample Pearson correlation and \hat\rho_{L2} the empirical correlation of the squares of the empirically standardized observations, and set \hat\rho_{L,1} = \max\{\,|\hat\rho_{L1}|,\;|\hat\rho_{L2}|\,\}.
- Rank-based estimator for \rho_{L} (type = "rank"): Consider \rho_{R1} = \operatorname{Cor}_{\text{Pearson}}(X^*,Y^*) and \rho_{R2} = \operatorname{Cor}_{\text{Pearson}}((X^*)^2,(Y^*)^2). Let Q_i and R_i be the ranks of X_i and Y_i within X_1,...,X_n and Y_1,...,Y_n, respectively. Define
  \hat\rho_{R1} = \frac{1}{n\,s_a^2}\sum_{j=1}^n a(Q_j)\,a(R_j),
  \hat\rho_{R2} = \frac{1}{n\,s_b^2}\sum_{j=1}^n \bigl(b(Q_j)-\bar b\bigr)\,\bigl(b(R_j)-\bar b\bigr),
  where the scores are, for j=1,...,n,
  a(j) = \Phi^{-1}\!\Bigl(\frac{j}{n+1}\Bigr), \quad b(j)=a(j)^2,
  and
  \bar a=\frac{1}{n}\sum_{j=1}^n a(j), \quad \bar b=\frac{1}{n}\sum_{j=1}^n b(j), \quad s_a^2 = \frac{1}{n}\sum_{j=1}^n\bigl(a(j)-\bar a\bigr)^2, \quad s_b^2 = \frac{1}{n}\sum_{j=1}^n\bigl(b(j)-\bar b\bigr)^2.
  The scores a(j) are symmetric about zero, so \bar a = 0, which is why \hat\rho_{R1} needs no centering. Finally, the rank-based Lancaster correlation is
  \hat\rho_{L} = \max\bigl\{\,|\hat\rho_{R1}|, |\hat\rho_{R2}|\bigr\}.
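The two estimators defined above can be sketched in base R as follows. This is a minimal illustration of the formulas, not the package's implementation: the function names lancaster_rank and lancaster_linear are hypothetical, and the rank-based version assumes continuous data without ties.

```r
# Rank-based estimator: normal scores a(j) = qnorm(j / (n + 1)) and b(j) = a(j)^2.
# Hypothetical helper, assuming no ties in x or y.
lancaster_rank <- function(x, y) {
  n <- length(x)
  a <- qnorm(seq_len(n) / (n + 1))      # scores a(1), ..., a(n); mean is 0 by symmetry
  b <- a^2                              # scores b(1), ..., b(n)
  q <- rank(x)                          # ranks Q_i of x within the sample
  r <- rank(y)                          # ranks R_i of y within the sample
  s2a <- mean((a - mean(a))^2)          # s_a^2 (denominator n)
  s2b <- mean((b - mean(b))^2)          # s_b^2 (denominator n)
  rho1 <- sum(a[q] * a[r]) / (n * s2a)
  rho2 <- sum((b[q] - mean(b)) * (b[r] - mean(b))) / (n * s2b)
  max(abs(rho1), abs(rho2))
}

# Linear estimator: Pearson correlation of the raw data and of the squared
# standardized data. The scaling used in sd() does not affect the result,
# because a correlation is invariant to positive rescaling.
lancaster_linear <- function(x, y) {
  xs <- (x - mean(x)) / sd(x)
  ys <- (y - mean(y)) / sd(y)
  max(abs(cor(x, y)), abs(cor(xs^2, ys^2)))
}

# Usage on simulated bivariate t data, similar to the examples in this manual.
set.seed(1)
n <- 1000
z <- matrix(rnorm(n * 2), n)
nu <- 8
y <- z / sqrt(rchisq(n, nu) / nu)
lancaster_rank(y[, 1], y[, 2])
lancaster_linear(y[, 1], y[, 2])
```

Both functions return a value in [0, 1]; the second component (correlation of the squares) is what lets the coefficient pick up dependence that a plain Pearson or normal-scores correlation misses.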
Value
the sample Lancaster correlation.
Author(s)
Hajo Holzmann, Bernhard Klar
References
Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733
See Also
Examples
Sigma <- matrix(c(1,0.1,0.1,1), ncol=2)
R <- chol(Sigma)
n <- 1000
x <- matrix(rnorm(n*2), n)
lcor(x, type = "rank")
lcor(x, type = "linear")
x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu)
cor(y[,1], y[,2], method = "spearman")
lcor(y, type = "rank")
Confidence intervals for the Lancaster correlation coefficient
Description
Computes confidence intervals for the Lancaster correlation coefficient. Lancaster correlation is a bivariate measure of dependence.
Usage
lcor.ci(
x,
y = NULL,
conf.level = 0.95,
type = c("rank", "linear"),
con = TRUE,
R = 1000,
method = c("plugin", "boot", "pretest")
)
Arguments
x |
a numeric vector, or a matrix or data frame with two columns. |
y |
NULL (default) or a vector with same length as x. |
conf.level |
confidence level of the interval. |
type |
a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated. |
con |
logical; if TRUE (default), conservative asymptotic confidence intervals are computed. |
R |
number of bootstrap replications. |
method |
a character string indicating how the asymptotic covariance matrix is computed if type = "linear". One of "plugin" (default), "boot" or "pretest", as listed under Usage; can be abbreviated. |
Details
Computes asymptotic and bootstrap-based confidence intervals for the Lancaster correlation coefficient \rho_L and the linear Lancaster correlation coefficient \rho_{L,1}. For more details see lcor.
Asymptotic confidence intervals are derived under two cases, stated here for \rho_{L,1} (analogously for \rho_{L}; see Holzmann and Klar (2024)):
Case 1: If |\rho_{L1}|\neq|\rho_{L2}|, the 1-\alpha asymptotic interval is
\left[ \max\{\hat\rho_{L,1} - z_{1-\alpha/2}\,s/\sqrt{n}, 0\},\ \min\{\hat\rho_{L,1} + z_{1-\alpha/2}\,s/\sqrt{n}, 1\} \right],
where z_{1-\alpha/2} is the 1-\alpha/2 quantile of the standard normal distribution and s is an estimator of the corresponding standard deviation.
Case 2: If |\rho_{L1}|=|\rho_{L2}|=a>0, let \tau denote the correlation between the two components and let q_{1-\alpha/2} be the 1-\alpha/2 quantile of the asymptotic distribution of \sqrt{n}(\hat\rho_{L,1} - a). A conservative asymptotic interval is
\left[ \max\{\hat\rho_{L,1} - q_{1-\alpha/2}/\sqrt{n}, 0\},\ \min\{\hat\rho_{L,1} + z_{1-\alpha/2}\,s/\sqrt{n}, 1\} \right].
Additionally, bootstrap-based intervals can be obtained by resampling and estimating the covariance matrix of the rank or linear correlation components.
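The Case 1 interval can be written as a few lines of base R. The sketch below only illustrates the formula and its truncation to [0, 1]; the function name case1_interval and its arguments rho_hat and s are hypothetical, and s must be supplied by the caller since the estimator of the standard deviation is not reproduced here. Case 2 is not covered.

```r
# Case 1 asymptotic interval: rho_hat -/+ z_{1 - alpha/2} * s / sqrt(n),
# truncated to [0, 1]. Hypothetical helper; 's' is assumed to be given.
case1_interval <- function(rho_hat, s, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  z <- qnorm(1 - alpha / 2)             # standard normal quantile
  half <- z * s / sqrt(n)               # half-width of the interval
  c(lower = max(rho_hat - half, 0),
    upper = min(rho_hat + half, 1))
}

# Illustrative inputs only (not results from any data set).
case1_interval(rho_hat = 0.3, s = 1, n = 400)
```

The truncation matters mainly for estimates close to 0 or 1, where the untruncated normal interval would extend outside the range of the coefficient.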
Value
a vector containing the lower and upper limits of the confidence interval.
Author(s)
Hajo Holzmann, Bernhard Klar
References
Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733
See Also
Examples
n <- 1000
x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu) # multivariate t
lcor(y, type = "rank")
lcor.ci(y, type = "rank")
Lancaster correlation and its components
Description
Computes the Lancaster correlation coefficient and its components.
Usage
lcor.comp(x, y = NULL, type = c("rank", "linear"), plot = FALSE)
Arguments
x |
a numeric vector, or a matrix or data frame with two columns. |
y |
NULL (default) or a vector with same length as x. |
type |
a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated. |
plot |
logical; if TRUE, scatterplots of the transformed x and y values and of their squares are drawn. |
Details
For more details see lcor.
Value
a vector containing the two components rho1 and rho2 and the sample Lancaster correlation.
Author(s)
Hajo Holzmann, Bernhard Klar
References
Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733
See Also
Examples
Sigma <- matrix(c(1,0.1,0.1,1), ncol=2)
R <- chol(Sigma)
n <- 1000
x <- matrix(rnorm(n*2), n)
nu <- 8
y <- x / sqrt(rchisq(n, nu)/nu) #multivariate t
cor(y[,1], y[,2])
lcor.comp(y, type = "linear")
x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu) #multivariate t
cor(y[,1], y[,2], method = "spearman")
lcor.comp(y, type = "rank", plot = TRUE)
Lancaster correlation test
Description
Lancaster correlation test of bivariate independence. Lancaster correlation is a bivariate measure of dependence.
Usage
lcor.test(
x,
y = NULL,
type = c("rank", "linear"),
nperm = 999,
method = c("permutation", "asymptotic", "symmetric")
)
Arguments
x |
a numeric vector, or a matrix or data frame with two columns. |
y |
NULL (default) or a vector with same length as x. |
type |
a character string indicating which Lancaster correlation is to be computed. One of "rank" (default) or "linear"; can be abbreviated. |
nperm |
number of permutations. |
method |
a character string indicating how the p-value is computed if type = "linear". One of "permutation" (default), "asymptotic" or "symmetric"; can be abbreviated. |
Details
For more details on the testing procedure see Remark 2 in Holzmann, Klar (2024).
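A permutation test of independence based on the rank version of the coefficient can be sketched as follows. This is a generic permutation scheme under stated assumptions, not the procedure of Remark 2 or the package's code: the names rank_lancaster and perm_test are hypothetical, the p-value uses the common (1 + count) / (nperm + 1) convention, and the returned list merely mirrors the component names lcor and pval documented under Value.

```r
# Test statistic: rank-based Lancaster correlation via normal scores.
# Without ties this agrees with the score-based formulas given for lcor.
rank_lancaster <- function(x, y) {
  n <- length(x)
  a <- qnorm(rank(x) / (n + 1))         # normal scores of x
  b <- qnorm(rank(y) / (n + 1))         # normal scores of y
  max(abs(cor(a, b)), abs(cor(a^2, b^2)))
}

# Permutation test: shuffling y breaks any dependence on x but keeps both margins.
perm_test <- function(x, y, nperm = 999) {
  obs <- rank_lancaster(x, y)
  perm <- replicate(nperm, rank_lancaster(x, sample(y)))
  # Assumed convention: count the observed statistic among the permutations.
  pval <- (1 + sum(perm >= obs)) / (nperm + 1)
  list(lcor = obs, pval = pval)
}

# Usage on simulated bivariate t data with 2 degrees of freedom.
set.seed(1)
n <- 200
z <- matrix(rnorm(n * 2), n)
nu <- 2
y <- z / sqrt(rchisq(n, nu) / nu)
perm_test(y[, 1], y[, 2], nperm = 199)
```

With nperm permutations the smallest attainable p-value is 1 / (nperm + 1), so nperm should be chosen with the intended significance level in mind.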
Value
A list containing the following components:
lcor |
the value of the test statistic |
pval |
the p-value of the test |
Author(s)
Hajo Holzmann, Bernhard Klar
References
Holzmann, Klar (2024). "Lancaster correlation - a new dependence measure linked to maximum correlation". doi:10.1111/sjos.12733
See Also
lcor, lcor.comp, lcor.ci. For an ACE permutation test of independence, see acepack (https://cran.r-project.org/package=acepack).
Examples
n <- 200
x <- matrix(rnorm(n*2), n)
nu <- 2
y <- x / sqrt(rchisq(n, nu)/nu)
cor.test(y[,1], y[,2], method = "spearman")
lcor.test(y, type = "rank")