| Title: | Direction Dependence Analysis |
|---|---|
| Description: | A collection of tests to analyze the causal direction of dependence in linear models (Wiedermann, W., & von Eye, A., 2025, ISBN: 9781009381390). The package includes functions to perform Direction Dependence Analysis for variable distributions, residual distributions, and independence properties of predictors and residuals in competing causal models. In addition, the package contains functions to test the causal direction of dependence in conditional models (i.e., models with interaction terms). For more information see <https://www.ddaproject.com>. |
| Authors: | Wolfgang Wiedermann [aut, cre], Megan Hirni [aut] |
| Maintainer: | Wolfgang Wiedermann <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-05-31 20:15:46 UTC |
| Source: | https://github.com/wwiedermann/dda |
cdda.indep evaluates asymmetries of
predictor-error independence of competing conditional models
(y ~ x * m vs. x ~ y * m with m being a
continuous or categorical moderator). print returns the
output of standard linear model coefficients for causally competing
target and alternative models. plot returns graphs of
cdda.indep results. summary returns test statistics
from the cdda.indep class object.
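The pair of competing conditional models described above can be sketched with base R alone. This is a minimal illustration under stated assumptions, not the package's implementation: the simulated data and the helper `simple_slope()` are hypothetical and only show what "target" and "alternative" models with a moderator look like, and how the slope of the focal predictor varies with the moderator value.

```r
# Simulated data; the variable names x, y, z are illustrative only
set.seed(1)
n <- 300
z <- rnorm(n)
x <- rchisq(n, df = 4) - 4
y <- 0.5 * x + 0.2 * x * z + (rchisq(n, df = 3) - 3)
d <- data.frame(x, y, z)

# Target model (x -> y) and alternative model (y -> x), both moderated by z
target      <- lm(y ~ x * z, data = d)
alternative <- lm(x ~ y * z, data = d)

# Hypothetical helper: conditional slope of the focal predictor
# at chosen moderator values (b_pred + b_interaction * moderator)
simple_slope <- function(fit, pred, mod, at) {
  b <- coef(fit)
  int <- intersect(c(paste(pred, mod, sep = ":"), paste(mod, pred, sep = ":")),
                   names(b))
  b[[pred]] + b[[int]] * at
}

simple_slope(target, "x", "z", at = c(-1, 1))
simple_slope(alternative, "y", "z", at = c(-1, 1))
```

The moderator values passed as `at` play the same role as `modval` in the usage below: they pick the points at which the two competing models are compared.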
    cdda.indep(
      formula = NULL, pred = NULL, mod = NULL, modval = NULL, data = list(),
      hetero = FALSE, diff = FALSE, nlfun = NULL, hsic.method = "gamma",
      B = 200, boot.type = "perc", conf.level = 0.95,
      parallelize = FALSE, cores = 1, ...
    )

    ## S3 method for class 'cdda.indep'
    print(x, ...)

    ## S3 method for class 'cdda.indep'
    plot(x = NULL, stat = NULL, ylim = NULL, ...)

    ## S3 method for class 'cdda.indep'
    summary(
      object, nlfun = FALSE, hetero = FALSE, hsic = TRUE, hsic.diff = FALSE,
      dcor = TRUE, dcor.diff = FALSE, mi.diff = FALSE, ...
    )
| Argument | Description |
|---|---|
| formula | Symbolic formula of the model to be tested or an lm object. |
| pred | A character indicating the variable name of the predictor which serves as the outcome in the alternative model. |
| mod | A character indicating the variable name of the moderator. |
| modval | Characters or a numeric sequence specifying the moderator values used in post-hoc probing. Possible characters include |
| data | A required data frame containing the variables in the model. |
| hetero | A logical value indicating whether separate homoscedasticity tests (i.e., standard and robust Breusch-Pagan tests) should be computed. When used in |
| diff | A logical value indicating whether differences in HSIC, dCor, and MI values should be computed. Bootstrap confidence intervals are computed using |
| nlfun | Determines handling of non-linear correlation tests depending on the function used: |
| hsic.method | A character indicating the inference method for the Hilbert-Schmidt Independence Criterion. Must be one of |
| B | Number of permutation replicates for separate dCor tests, and number of resamples when |
| boot.type | A character indicating the type of bootstrap confidence intervals. Must be one of |
| conf.level | Confidence level for bootstrap confidence intervals. |
| parallelize | A logical value indicating whether bootstrapping is performed on multiple cores. Only used if |
| cores | A numeric value indicating the number of cores. Only used if |
| ... | Additional arguments to be passed to the function. |
| x | An object of class cdda.indep. |
| stat | A character indicating the CDDA statistic to be plotted. Must be one of |
| ylim | A numeric vector of length 2 indicating the y-axis limits for |
| object | An object of class cdda.indep. |
| hsic | A logical value indicating whether separate HSIC tests should be returned in summary. |
| hsic.diff | A logical value indicating whether HSIC difference statistics should be returned in summary. |
| dcor | A logical value indicating whether separate Distance Correlation (dCor) tests should be returned in summary. |
| dcor.diff | A logical value indicating whether dCor difference statistics should be returned in summary. |
| mi.diff | A logical value indicating whether Mutual Information (MI) difference statistics should be returned in summary. |
A list of class cdda.indep containing the results of
independence tests of Conditional Direction Dependence Analysis for
pre-specified moderator values.
Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.
dda.indep for an unconditional version.
    set.seed(321)
    n <- 700

    ## --- generate moderator
    z <- sort(rnorm(n))
    z1 <- z[z <= 0]
    z2 <- z[z > 0]

    ## --- x -> y when z <= 0
    x1 <- rchisq(length(z1), df = 4) - 4
    e1 <- rchisq(length(z1), df = 3) - 3
    y1 <- 0.5 * x1 + e1

    ## --- y -> x when z > 0
    y2 <- rchisq(length(z2), df = 4) - 4
    e2 <- rchisq(length(z2), df = 3) - 3
    x2 <- 0.25 * y2 + e2

    y <- c(y1, y2); x <- c(x1, x2)
    d <- data.frame(x, y, z)
    m <- lm(y ~ x * z, data = d)

    result <- cdda.indep(m, pred = "x", mod = "z", modval = c(-1, 1), data = d,
                         hetero = TRUE, diff = TRUE, parallelize = FALSE,
                         cores = 2, nlfun = 2, B = 2)
    print(result)
    plot(result, stat = "dcor.diff")
    summary(result, hetero = FALSE)

    ## Not run:
    # --- Larger bootstrap example
    result <- cdda.indep(m, pred = "x", mod = "z", modval = c(-1, 1), data = d,
                         hetero = TRUE, diff = TRUE, parallelize = FALSE,
                         cores = 2, nlfun = 2, B = 2000)
    print(result)
    plot(result, stat = "dcor.diff")
    summary(result, hetero = FALSE)
    ## End(Not run)
cdda.vardist evaluates variable distributions of
competing conditional models (y ~ x * m vs.
x ~ y * m with m being a continuous or categorical
moderator). print returns the output of standard linear
model coefficients for causally competing target and alternative
models. plot returns graphs of cdda.vardist results.
summary returns test statistics from the
cdda.vardist class object.
    cdda.vardist(
      formula, pred = NULL, mod = NULL, data = list(), modval = NULL,
      B = 200, boot.type = "bca", conf.level = 0.95
    )

    ## S3 method for class 'cdda.vardist'
    print(x, ...)

    ## S3 method for class 'cdda.vardist'
    plot(x, stat = NULL, ylim = NULL, ...)

    ## S3 method for class 'cdda.vardist'
    summary(object, skew = TRUE, coskew = FALSE, kurt = TRUE, cokurt = FALSE, ...)
| Argument | Description |
|---|---|
| formula | Symbolic formula of the model to be tested or an lm object. |
| pred | A character indicating the variable name of the predictor which serves as the outcome in the alternative model. |
| mod | A character indicating the variable name of the moderator. |
| data | A required data frame containing the variables in the model. |
| modval | Characters or a numeric sequence specifying the moderator values used in post-hoc probing. Possible characters include |
| B | Number of bootstrap samples. |
| boot.type | A character indicating the type of bootstrap confidence intervals. Must be one of |
| conf.level | Confidence level for bootstrap confidence intervals. |
| x | An object of class cdda.vardist. |
| ... | Additional arguments to be passed to the function. |
| stat | A character indicating the statistic to be plotted. Default is |
| ylim | A numeric vector of length 2 indicating the y-axis limits for |
| object | An object of class cdda.vardist. |
| skew | A logical value indicating whether skewness differences and separate D'Agostino skewness tests should be returned when using summary. |
| coskew | A logical value indicating whether co-skewness differences should be returned when using summary. |
| kurt | A logical value indicating whether excess kurtosis differences and Anscombe-Glynn kurtosis tests should be returned when using summary. |
| cokurt | A logical value indicating whether co-kurtosis differences should be returned when using summary. |
An object of class cdda.vardist containing the results
of conditional direction dependence tests of variable
distributions.
Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.
dda.vardist for an unconditional version.
    set.seed(321)
    n <- 700

    ## --- generate moderator
    z <- sort(rnorm(n))
    z1 <- z[z <= 0]
    z2 <- z[z > 0]

    ## --- x -> y when z <= 0
    x1 <- rchisq(length(z1), df = 4) - 4
    e1 <- rchisq(length(z1), df = 3) - 3
    y1 <- 0.5 * x1 + e1

    ## --- y -> x when z > 0
    y2 <- rchisq(length(z2), df = 4) - 4
    e2 <- rchisq(length(z2), df = 3) - 3
    x2 <- 0.25 * y2 + e2

    y <- c(y1, y2); x <- c(x1, x2)
    d <- data.frame(x, y, z)
    m <- lm(y ~ x * z, data = d)

    result <- cdda.vardist(m, pred = "x", mod = "z", B = 100, boot.type = "perc",
                           modval = c(-1, 1), data = d)
    print(result)
    plot(result, stat = "rtanh", ylim = c(-0.05, 0.05))
    summary(result, skew = FALSE, kurt = FALSE, coskew = TRUE)

    ## Not run:
    # --- Larger bootstrap example
    result <- cdda.vardist(m, pred = "x", mod = "z", B = 2000,
                           modval = c(-1, 1), data = d)
    print(result)
    plot(result, stat = "rtanh", ylim = c(-0.05, 0.05))
    summary(result, skew = FALSE, kurt = FALSE, coskew = TRUE)
    ## End(Not run)
dda.indep evaluates asymmetries of predictor-error
independence of causally competing models (y ~ x vs.
x ~ y). print returns DDA test statistics associated
with dda.indep objects.
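Why non-linear tests are needed for predictor-error independence can be seen in a small base R sketch. It rests on assumptions: squaring the predictor is just one illustrative non-linear transformation and is not claimed to be what `nlfun = 2` computes internally. Least squares forces the linear correlation between predictor and residuals to zero in both candidate models, so only higher-order dependence can distinguish them.

```r
set.seed(123)
n <- 500
x <- rchisq(n, df = 4) - 4
e <- rchisq(n, df = 3) - 3
y <- 0.5 * x + e

res_target <- resid(lm(y ~ x))   # errors of the target model y ~ x
res_alt    <- resid(lm(x ~ y))   # errors of the alternative model x ~ y

# Linear correlations are zero by construction of least squares
cor(x, res_target)
cor(y, res_alt)

# Illustrative non-linear correlation tests: squared predictor vs. residuals
cor.test(x^2, res_target)
cor.test(y^2, res_alt)
```

Evidence of dependence in one model but not in the other is the kind of asymmetry that the independence tests of dda.indep are designed to detect.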
    dda.indep(
      formula, pred = NULL, data = list(), nlfun = NULL, hetero = FALSE,
      hsic.method = "gamma", diff = FALSE, B = 200, boot.type = "perc",
      conf.level = 0.95, parallelize = FALSE, cores = 1
    )

    ## S3 method for class 'dda.indep'
    print(x, ...)
| Argument | Description |
|---|---|
| formula | Symbolic formula of the model to be tested or an lm object. |
| pred | A character indicating the variable name of the predictor which serves as the outcome in the alternative model. |
| data | An optional data frame containing the variables in the model (by default variables are taken from the environment which dda.indep is called from). |
| nlfun | Either a numeric value or a function of |
| hetero | A logical value indicating whether separate homoscedasticity tests (i.e., standard and robust Breusch-Pagan tests) should be computed. |
| hsic.method | A character indicating the inference method for the Hilbert-Schmidt Independence Criterion (HSIC). Must be one of |
| diff | A logical value indicating whether differences in HSIC, Distance Correlation (dCor), and Mutual Information (MI) values should be computed. Bootstrap confidence intervals are computed using |
| B | Number of permutation replicates for separate dCor tests, and number of resamples when |
| boot.type | A character indicating the type of bootstrap confidence intervals. Must be one of |
| conf.level | Confidence level for bootstrap confidence intervals. |
| parallelize | A logical value indicating whether bootstrapping is performed on multiple cores. Only used if |
| cores | A numeric value indicating the number of cores. Only used if |
| x | An object of class dda.indep. |
| ... | Additional arguments to be passed to the function. |
An object of class dda.indep containing the results of
independence tests of Direction Dependence Analysis.
Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.
cdda.indep for a conditional version.
    set.seed(123)
    n <- 500
    x <- rchisq(n, df = 4) - 4
    e <- rchisq(n, df = 3) - 3
    y <- 0.5 * x + e
    d <- data.frame(x, y)

    ## --- quick example (small B for speed)
    result <- dda.indep(y ~ x, pred = "x", data = d, nlfun = 2, B = 10,
                        hetero = TRUE, diff = TRUE, parallelize = FALSE, cores = 2)
    print(result)

    ## Not run:
    # --- Larger bootstrap example
    result <- dda.indep(y ~ x, pred = "x", data = d, nlfun = 2, B = 2000,
                        hetero = TRUE, diff = TRUE, parallelize = FALSE, cores = 2)
    print(result)
    ## End(Not run)
dda.resdist evaluates patterns of asymmetry of
error distributions of causally competing models (y ~ x vs.
x ~ y). print returns DDA test statistics associated
with dda.resdist objects.
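The quantities behind a comparison of error distributions can be computed by hand. The sketch below is an assumption-laden illustration rather than the package's code: `skewness()` and `excess_kurtosis()` are hypothetical moment-based helpers, and the example only shows how residual higher moments of the two competing models are obtained for comparison.

```r
# Hypothetical helpers based on central sample moments
skewness <- function(v) {
  v <- v - mean(v)
  mean(v^3) / mean(v^2)^1.5
}
excess_kurtosis <- function(v) {
  v <- v - mean(v)
  mean(v^4) / mean(v^2)^2 - 3
}

set.seed(123)
n <- 500
x <- rchisq(n, df = 4) - 4
e <- rchisq(n, df = 3) - 3
y <- 0.5 * x + e

res_target <- resid(lm(y ~ x))   # errors of y ~ x
res_alt    <- resid(lm(x ~ y))   # errors of x ~ y

# Higher moments of the residuals in both candidate models
c(target = skewness(res_target), alternative = skewness(res_alt))
c(target = excess_kurtosis(res_target), alternative = excess_kurtosis(res_alt))
```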
    dda.resdist(
      formula, pred = NULL, data = list(), B = 200, boot.type = "perc",
      prob.trans = FALSE, conf.level = 0.95
    )

    ## S3 method for class 'dda.resdist'
    print(x, ...)
| Argument | Description |
|---|---|
| formula | Symbolic formula of the target model to be tested or an lm object. |
| pred | Variable name of the predictor which serves as the outcome in the alternative model. |
| data | An optional data frame containing the variables in the model (by default variables are taken from the environment which dda.resdist is called from). |
| B | Number of bootstrap samples. |
| boot.type | A character indicating the type of bootstrap confidence intervals. Must be one of |
| prob.trans | A logical value indicating whether a probability integral transformation should be performed prior to computation of skewness and kurtosis tests. |
| conf.level | Confidence level for bootstrap confidence intervals. |
| x | An object of class dda.resdist. |
| ... | Additional arguments to be passed to the method. |
An object of class dda.resdist containing the results
of direction dependence tests of error distributions.
Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.
    set.seed(123)
    n <- 500
    x <- rchisq(n, df = 4) - 4
    e <- rchisq(n, df = 3) - 3
    y <- 0.5 * x + e
    d <- data.frame(x, y)

    result <- dda.resdist(y ~ x, pred = "x", data = d, B = 50,
                          conf.level = 0.90, prob.trans = TRUE)
    print(result)

    ## Not run:
    # --- Larger bootstrap example
    result <- dda.resdist(y ~ x, pred = "x", data = d, B = 2000, conf.level = 0.90)
    print(result)
    ## End(Not run)
dda.vardist evaluates patterns of asymmetry of
variable distributions for causally competing models (y ~ x
vs. x ~ y). print returns DDA test statistics
associated with dda.vardist objects.
    dda.vardist(
      formula, pred = NULL, data = list(), B = 200, boot.type = "perc",
      conf.level = 0.95
    )

    ## S3 method for class 'dda.vardist'
    print(x, ...)
| Argument | Description |
|---|---|
| formula | Symbolic formula of the model to be tested or an lm object. |
| pred | Variable name of the predictor which serves as the outcome in the alternative model. |
| data | An optional data frame containing the variables in the model (by default variables are taken from the environment which dda.vardist is called from). |
| B | Number of bootstrap samples. |
| boot.type | A character indicating the type of bootstrap confidence intervals. Must be one of |
| conf.level | Confidence level for bootstrap confidence intervals. |
| x | An object of class dda.vardist. |
| ... | Additional arguments to be passed to the function. |
An object of class dda.vardist containing the results
of direction dependence tests of variable distributions.
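How `B`, `boot.type = "perc"`, and `conf.level` interact can be illustrated with a hand-rolled percentile bootstrap. This is a sketch under assumptions: the statistic `skew_diff()` (difference in absolute skewness of the two variables) is a hypothetical stand-in chosen for illustration and is not claimed to be the exact statistic computed by dda.vardist.

```r
set.seed(123)
n <- 500
x <- rchisq(n, df = 4) - 4
y <- 0.5 * x + (rchisq(n, df = 3) - 3)

# Hypothetical helpers for the illustration
skewness <- function(v) {
  v <- v - mean(v)
  mean(v^3) / mean(v^2)^1.5
}
skew_diff <- function(x, y) abs(skewness(x)) - abs(skewness(y))

B <- 500            # number of bootstrap samples
conf.level <- 0.95  # confidence level

boot_stats <- replicate(B, {
  idx <- sample.int(n, replace = TRUE)   # resample (x, y) pairs with replacement
  skew_diff(x[idx], y[idx])
})

# Percentile interval: empirical quantiles of the bootstrap distribution
alpha <- 1 - conf.level
ci <- quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2))
ci
```

An interval that excludes zero would indicate that the two variables differ in skewness at the chosen confidence level; larger `B` gives more stable interval endpoints at higher computational cost.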
Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.
cdda.vardist for a conditional version.
    set.seed(123)
    n <- 500
    x <- rchisq(n, df = 4) - 4
    e <- rchisq(n, df = 3) - 3
    y <- 0.5 * x + e
    d <- data.frame(x, y)

    result <- dda.vardist(y ~ x, pred = "x", data = d, boot.type = "perc", B = 100)
    print(result)

    ## Not run:
    # --- Larger bootstrap example
    result <- dda.vardist(y ~ x, pred = "x", data = d, B = 2000)
    print(result)
    ## End(Not run)
hsic computes the empirical Hilbert-Schmidt Independence Criterion between two variables X and Y using the biased V-statistic.
All kernel and centering computations are performed in C++.
    hsic(
      x, y, kernel_x = "gaussian", kernel_y = kernel_x,
      bandwidth_x = NULL, bandwidth_y = NULL, degree = 2L, coef0 = 1
    )
| Argument | Description |
|---|---|
| x | A numeric vector of length n or matrix (n x p). |
| y | A numeric vector of length n or matrix (n x q). |
| kernel_x | Kernel for X. One of |
| kernel_y | Kernel for Y. Defaults to kernel_x. |
| bandwidth_x | Bandwidth for the X kernel. The median heuristic ( |
| bandwidth_y | Bandwidth for the Y kernel. Same options as bandwidth_x. |
| degree | Integer degree for the polynomial kernel. Default 2L. |
| coef0 | Constant term for the polynomial kernel. Default 1. |
The median heuristic sets the Gaussian kernel bandwidth to the median of all strictly positive squared pairwise distances, matching the convention in the dHSIC package.
A single non-negative numeric value: the raw HSIC estimate. A value of zero (with a characteristic kernel) implies independence.
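For orientation, the textbook biased V-statistic from Gretton et al. (2008), (1/n^2) tr(KHLH) with centering matrix H, can be written in a few lines of plain R. This is a sketch, not the package's C++ implementation: the helper names `gaussian_gram()` and `hsic_v()` are hypothetical, and the Gaussian kernel form exp(-d^2 / (2 * bandwidth)), with the bandwidth treated as a variance, is an assumption.

```r
# Hypothetical helper: Gaussian Gram matrix for a numeric vector.
# Assumed kernel form: exp(-d^2 / (2 * bandwidth)), bandwidth as a variance.
gaussian_gram <- function(v, bandwidth) {
  d2 <- as.matrix(dist(v))^2          # squared pairwise Euclidean distances
  exp(-d2 / (2 * bandwidth))
}

# Biased V-statistic: (1 / n^2) * trace(K H L H)
hsic_v <- function(x, y, bw_x = 1, bw_y = 1) {
  n <- length(x)
  K <- gaussian_gram(x, bw_x)
  L <- gaussian_gram(y, bw_y)
  H <- diag(n) - matrix(1 / n, n, n)  # centering matrix
  sum(diag(K %*% H %*% L %*% H)) / n^2
}

set.seed(12)
x <- rnorm(100)
hsic_v(x, rnorm(100))                  # independent draws
hsic_v(x, x + rnorm(100, sd = 0.5))    # dependent draws
```

The matrix products make this O(n^3), which is fine for illustration but is one reason a compiled implementation is preferable for repeated resampling.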
Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Scholkopf, B., & Smola, A. J. (2008). A kernel statistical test of independence. Advances in Neural Information Processing Systems, 20.
Peters, J., Pfister, N., & Mooij, J. M. (2022). dHSIC: Independence Testing via Hilbert Schmidt Independence Criterion. R package version 2.1. https://CRAN.R-project.org/package=dHSIC
    set.seed(12)
    x <- rnorm(100)
    hsic(x, rnorm(100))
    hsic(x, x + rnorm(100, sd = 0.5))
    hsic(x, rnorm(100), bandwidth_x = 1, bandwidth_y = 1)
hsic.test tests whether X and Y are independent using the Hilbert-Schmidt Independence Criterion. The test statistic, labelled "HSIC", is consistent across all four inference methods.
Available null-distribution methods:
- "gamma": Fits a Gamma distribution to the first two null moments of HSIC analytically (Peters, Pfister & Mooij, 2022). No resampling required.
- "permutation": Permutes the row index of X to simulate the null distribution entirely in C++.
- "eigenvalue": Derives the null distribution from the eigenvalue spectrum of the centered kernel matrices (Zhang, 2011). More accurate in small samples; uses B Monte Carlo draws. Reports the test statistic with a p-value from the spectral null.
- "bootstrap": Independently resamples rows of the kernel matrices with replacement in C++ (Peters, Pfister & Mooij, 2022).
    hsic.test(
      x, y, method = c("gamma", "permutation", "eigenvalue", "bootstrap"),
      kernel_x = "gaussian", kernel_y = kernel_x,
      bandwidth_x = NULL, bandwidth_y = NULL,
      degree = 2L, coef0 = 1, B = 1000L
    )
| Argument | Description |
|---|---|
| x | A numeric vector of length n or matrix (n x p). |
| y | A numeric vector of length n or matrix (n x q). |
| method | Inference method for the null distribution. One of "gamma", "permutation", "eigenvalue", or "bootstrap". |
| kernel_x | Kernel for X. One of |
| kernel_y | Kernel for Y. Defaults to kernel_x. |
| bandwidth_x | Bandwidth for the X kernel. |
| bandwidth_y | Bandwidth for the Y kernel. Same options as bandwidth_x. |
| degree | Integer degree for the polynomial kernel. Default 2L. |
| coef0 | Constant term for the polynomial kernel. Default 1. |
| B | Number of permutation or bootstrap replicates, or Monte Carlo draws for method = "eigenvalue". |
All four methods report the same test statistic. Permutation and bootstrap p-values use the Laplace correction.
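A generic permutation p-value with an add-one correction can be sketched as follows. The form (1 + #{null >= observed}) / (B + 1) is the commonly used version of this correction and is assumed here; the helper `perm_pvalue()` and the stand-in statistic `abs_cor()` are hypothetical and are not the package's HSIC routine.

```r
# Hypothetical helper: permutation p-value with an add-one correction,
# so the p-value can never be exactly zero
perm_pvalue <- function(x, y, stat, B = 499) {
  observed <- stat(x, y)
  null_stats <- replicate(B, stat(sample(x), y))  # permute x to break dependence
  (1 + sum(null_stats >= observed)) / (B + 1)
}

# Stand-in dependence statistic for the illustration
abs_cor <- function(a, b) abs(cor(a, b))

set.seed(7)
n <- 80
x <- rnorm(n)
perm_pvalue(x, rnorm(n), abs_cor)       # independent draws
perm_pvalue(x, x + rnorm(n), abs_cor)   # dependent draws
```

With this correction the smallest attainable p-value is 1 / (B + 1), which is why values such as B = 499 or B = 999 are convenient choices.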
An object of class "htest" with components:
- statistic: The test statistic, labelled "HSIC".
- p.value: P-value for the test of independence.
- bandwidths: Resolved bandwidths c(x, y).
Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Scholkopf, B., & Smola, A. J. (2008). A kernel statistical test of independence. Advances in Neural Information Processing Systems, 20.
Peters, J., Pfister, N., & Mooij, J. M. (2022). dHSIC: Independence Testing via Hilbert Schmidt Independence Criterion. R package version 2.1. https://CRAN.R-project.org/package=dHSIC
Zhang, K., Peters, J., Janzing, D., & Scholkopf, B. (2011). Kernel-based conditional independence test and application in causal discovery. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI 2011) (pp. 804-813).
    set.seed(7)
    n <- 80
    x <- rnorm(n)
    hsic.test(x, rnorm(n), method = "gamma")
    hsic.test(x, rnorm(n), method = "permutation", B = 499)
    hsic.test(x, x + rnorm(n), method = "permutation", B = 499)
Returns the median of all strictly positive squared pairwise Euclidean distances, interpreted as the variance parameter for the Gaussian kernel. This is the same convention used in the dHSIC source code (Peters et al., 2022).
    median_bandwidth(x)
| Argument | Description |
|---|---|
| x | A numeric vector or matrix of observations (n rows). |
A single strictly positive numeric: the bandwidth, ready to pass as bandwidth_x or bandwidth_y. Returns 1 when all pairwise distances are zero.
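The rule described above is simple enough to reproduce in base R. The function name `median_sq_dist()` is hypothetical; the sketch follows only what the text states (median of strictly positive squared pairwise distances, fallback of 1) and is not the package's source.

```r
# Hypothetical re-implementation of the median heuristic described above
median_sq_dist <- function(x) {
  d2 <- as.vector(dist(x))^2       # squared pairwise Euclidean distances
  d2 <- d2[d2 > 0]                 # keep strictly positive values only
  if (length(d2) == 0) return(1)   # fallback when all observations coincide
  median(d2)
}

median_sq_dist(c(0, 1, 3))   # squared distances 1, 9, 4 -> median 4
median_sq_dist(rep(2, 5))    # all distances zero -> 1
```

Dropping zero distances keeps ties (duplicated observations) from pulling the bandwidth toward zero, which would make the Gaussian kernel degenerate.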
Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Scholkopf, B., & Smola, A. J. (2008). A kernel statistical test of independence. Advances in Neural Information Processing Systems, 20.
Peters, J., Pfister, N., & Mooij, J. M. (2022). dHSIC: Independence Testing via Hilbert Schmidt Independence Criterion. R package version 2.1.
    x <- rnorm(80)
    median_bandwidth(x)