Package 'dda'

Title: Direction Dependence Analysis
Description: A collection of tests to analyze the causal direction of dependence in linear models (Wiedermann, W., & von Eye, A., 2025, ISBN: 9781009381390). The package includes functions to perform Direction Dependence Analysis for variable distributions, residual distributions, and independence properties of predictors and residuals in competing causal models. In addition, the package contains functions to test the causal direction of dependence in conditional models (i.e., models with interaction terms). For more information see <https://www.ddaproject.com>.
Authors: Wolfgang Wiedermann [aut, cre], Megan Hirni [aut]
Maintainer: Wolfgang Wiedermann <[email protected]>
License: MIT + file LICENSE
Version: 0.1.1
Built: 2026-05-31 20:15:46 UTC
Source: https://github.com/wwiedermann/dda


dda: Direction Dependence Analysis

Description

A collection of tests to analyze the causal direction of dependence in linear models (Wiedermann, W., & von Eye, A., 2025, ISBN: 9781009381390). The package includes functions to perform Direction Dependence Analysis for variable distributions, residual distributions, and independence properties of predictors and residuals in competing causal models. In addition, the package contains functions to test the causal direction of dependence in conditional models (i.e., models with interaction terms). For more information see https://www.ddaproject.com.

Author(s)

Maintainer: Wolfgang Wiedermann [email protected]

Authors: Megan Hirni


Conditional Direction Dependence Analysis: Independence Properties

Description

cdda.indep evaluates asymmetries of predictor-error independence of competing conditional models (y ~ x * m vs. x ~ y * m with m being a continuous or categorical moderator). print returns the output of standard linear model coefficients for causally competing target and alternative models. plot returns graphs of cdda.indep results. summary returns test statistics from the cdda.indep class object.

Usage

cdda.indep(
  formula = NULL,
  pred = NULL,
  mod = NULL,
  modval = NULL,
  data = list(),
  hetero = FALSE,
  diff = FALSE,
  nlfun = NULL,
  hsic.method = "gamma",
  B = 200,
  boot.type = "perc",
  conf.level = 0.95,
  parallelize = FALSE,
  cores = 1,
  ...
)

## S3 method for class 'cdda.indep'
print(x, ...)

## S3 method for class 'cdda.indep'
plot(x = NULL, stat = NULL, ylim = NULL, ...)

## S3 method for class 'cdda.indep'
summary(
  object,
  nlfun = FALSE,
  hetero = FALSE,
  hsic = TRUE,
  hsic.diff = FALSE,
  dcor = TRUE,
  dcor.diff = FALSE,
  mi.diff = FALSE,
  ...
)

Arguments

formula

Symbolic formula of the model to be tested or an lm object.

pred

A character indicating the variable name of the predictor which serves as the outcome in the alternative model.

mod

A character indicating the variable name of the moderator.

modval

Characters or a numeric sequence specifying the moderator values used in post-hoc probing. Possible characters include c("mean", "median", "JN"). modval = "mean" tests the interaction effect at the moderator values M - 1SD, M, and M + 1SD; modval = "median" uses Q1, Md, and Q3. The Johnson-Neyman approach is applied when modval = "JN" with conditional effects being evaluated at the boundary values of the significance regions. When a numeric sequence is specified, the pick-a-point approach is used for the selected numeric values.

data

A required data frame containing the variables in the model.

hetero

A logical value indicating whether separate homoscedasticity tests (i.e., standard and robust Breusch-Pagan tests) should be computed. When used in summary, a logical value indicating whether homoscedasticity test results should be returned in the output; default is FALSE.

diff

A logical value indicating whether differences in HSIC, dCor, and MI values should be computed. Bootstrap confidence intervals are computed using B bootstrap samples.

nlfun

Determines handling of non-linear correlation tests depending on the function used:

  • For cdda.indep: Either a numeric value or a function of .Primitive type. When numeric, the value is used in a power transformation.

  • For summary: A logical value indicating whether non-linear correlation tests should be returned in the output. Default is FALSE.

hsic.method

A character indicating the inference method for the Hilbert-Schmidt Independence Criterion. Must be one of c("gamma", "eigenvalue", "bootstrap", "permutation"). hsic.method = "gamma" is the default.

B

Number of permutation replicates for separate dCor tests, and number of resamples when hsic.method is "bootstrap" or "permutation", or when diff = TRUE.

boot.type

A character indicating the type of bootstrap confidence intervals. Must be one of c("perc", "bca"). boot.type = "perc" is the default.

conf.level

Confidence level for bootstrap confidence intervals.

parallelize

A logical value indicating whether bootstrapping is performed on multiple cores. Only used if diff = TRUE.

cores

A numeric value indicating the number of cores. Only used if parallelize = TRUE.

...

Additional arguments to be passed to the function.

x

An object of class cdda.indep when using print or plot.

stat

A character indicating the CDDA statistic to be plotted. Must be one of c("hsic.diff", "dcor.diff", "mi.diff").

ylim

A numeric vector of length 2 indicating the y-axis limits for plot. If NULL, limits are set automatically.

object

An object of class cdda.indep when using summary.

hsic

A logical value indicating whether separate HSIC tests should be returned in summary output. Default is TRUE.

hsic.diff

A logical value indicating whether HSIC difference statistics should be returned in summary output. Default is FALSE.

dcor

A logical value indicating whether separate Distance Correlation (dCor) tests should be returned in summary output. Default is TRUE.

dcor.diff

A logical value indicating whether dCor difference statistics should be returned in summary output. Default is FALSE.

mi.diff

A logical value indicating whether Mutual Information (MI) difference statistics should be returned in summary output. Default is FALSE.

Value

A list of class cdda.indep containing the results of independence tests of Conditional Direction Dependence Analysis for pre-specified moderator values.

References

Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.

See Also

dda.indep for an unconditional version.

Examples

set.seed(321)
n <- 700

## --- generate moderator
z <- sort(rnorm(n))
z1 <- z[z <= 0]
z2 <- z[z > 0]

## --- x -> y when z <= 0
x1 <- rchisq(length(z1), df = 4) - 4
e1 <- rchisq(length(z1), df = 3) - 3
y1 <- 0.5 * x1 + e1

## --- y -> x when z > 0
y2 <- rchisq(length(z2), df = 4) - 4
e2 <- rchisq(length(z2), df = 3) - 3
x2 <- 0.25 * y2 + e2

y <- c(y1, y2); x <- c(x1, x2)
d <- data.frame(x, y, z)
m <- lm(y ~ x * z, data = d)

result <- cdda.indep(m,
  pred = "x", mod = "z", modval = c(-1, 1), data = d,
  hetero = TRUE, diff = TRUE, parallelize = FALSE, cores = 2,
  nlfun = 2, B = 2)

print(result)
plot(result, stat = "dcor.diff")
summary(result, hetero = FALSE)

## Not run: 
# --- Larger bootstrap example
result <- cdda.indep(m,
  pred = "x", mod = "z", modval = c(-1, 1), data = d,
  hetero = TRUE, diff = TRUE, parallelize = FALSE, cores = 2,
  nlfun = 2, B = 2000)

print(result)
plot(result, stat = "dcor.diff")
summary(result, hetero = FALSE)

## End(Not run)
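
The moderator values implied by the modval argument can be illustrated in base R. The following is a minimal sketch, not the package's internal code: probe_values is a hypothetical helper, and the quantile type used for Q1 and Q3 is an assumption (R's default). The Johnson-Neyman option ("JN") is not covered here.

```r
# Hypothetical helper (not part of dda): moderator values used for probing.
probe_values <- function(z, modval = c("mean", "median")) {
  if (is.numeric(modval)) return(modval)   # pick-a-point: use the values as given
  modval <- match.arg(modval)
  switch(modval,
    mean   = mean(z) + c(-1, 0, 1) * sd(z),             # M - 1SD, M, M + 1SD
    median = unname(quantile(z, c(0.25, 0.50, 0.75)))   # Q1, Md, Q3
  )
}

z <- c(1, 2, 3, 4, 5)
probe_values(z, "mean")      # 3 - sd(z), 3, 3 + sd(z)
probe_values(z, "median")    # 2, 3, 4
probe_values(z, c(-1, 1))    # -1, 1
```

Conditional effects are then evaluated once per returned moderator value.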

Conditional Direction Dependence Analysis: Variable Distributions

Description

cdda.vardist evaluates variable distributions of competing conditional models (y ~ x * m vs. x ~ y * m with m being a continuous or categorical moderator). print returns the output of standard linear model coefficients for causally competing target and alternative models. plot returns graphs of cdda.vardist results. summary returns test statistics from the cdda.vardist class object.

Usage

cdda.vardist(
  formula,
  pred = NULL,
  mod = NULL,
  data = list(),
  modval = NULL,
  B = 200,
  boot.type = "bca",
  conf.level = 0.95
)

## S3 method for class 'cdda.vardist'
print(x, ...)

## S3 method for class 'cdda.vardist'
plot(x, stat = NULL, ylim = NULL, ...)

## S3 method for class 'cdda.vardist'
summary(object, skew = TRUE, coskew = FALSE, kurt = TRUE, cokurt = FALSE, ...)

Arguments

formula

Symbolic formula of the model to be tested or an lm object.

pred

A character indicating the variable name of the predictor which serves as the outcome in the alternative model.

mod

A character indicating the variable name of the moderator.

data

A required data frame containing the variables in the model.

modval

Characters or a numeric sequence specifying the moderator values used in post-hoc probing. Possible characters include c("mean", "median", "JN"). modval = "mean" tests the interaction effect at the moderator values M - 1SD, M, and M + 1SD; modval = "median" uses Q1, Md, and Q3. The Johnson-Neyman approach is applied when modval = "JN" with conditional effects being evaluated at the boundary values of the significance regions. When a numeric sequence is specified, the pick-a-point approach is used for the selected numeric values.

B

Number of bootstrap samples.

boot.type

A character indicating the type of bootstrap confidence intervals. Must be one of c("perc", "bca"). boot.type = "bca" is the default.

conf.level

Confidence level for bootstrap confidence intervals.

x

An object of class cdda.vardist when using print or plot.

...

Additional arguments to be passed to the function.

stat

A character indicating the statistic to be plotted. Default is "rhs", with options c("coskew", "cokurt", "rhs", "rcc", "rtanh").

ylim

A numeric vector of length 2 indicating the y-axis limits for plot. If NULL, limits are set automatically.

object

An object of class cdda.vardist when using summary.

skew

A logical value indicating whether skewness differences and separate D'Agostino skewness tests should be returned when using summary. Default is TRUE.

coskew

A logical value indicating whether co-skewness differences should be returned when using summary. Default is FALSE.

kurt

A logical value indicating whether excess kurtosis differences and Anscombe-Glynn kurtosis tests should be returned when using summary. Default is TRUE.

cokurt

A logical value indicating whether co-kurtosis differences should be returned when using summary. Default is FALSE.

Value

An object of class cdda.vardist containing the results of conditional direction dependence tests of variable distributions.

References

Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.

See Also

dda.vardist for an unconditional version.

Examples

set.seed(321)
n <- 700

## --- generate moderator
z <- sort(rnorm(n))
z1 <- z[z <= 0]
z2 <- z[z > 0]

## --- x -> y when z <= 0
x1 <- rchisq(length(z1), df = 4) - 4
e1 <- rchisq(length(z1), df = 3) - 3
y1 <- 0.5 * x1 + e1

## --- y -> x when z > 0
y2 <- rchisq(length(z2), df = 4) - 4
e2 <- rchisq(length(z2), df = 3) - 3
x2 <- 0.25 * y2 + e2

y <- c(y1, y2); x <- c(x1, x2)
d <- data.frame(x, y, z)
m <- lm(y ~ x * z, data = d)

result <- cdda.vardist(m, pred = "x", mod = "z", B = 100,
                       boot.type = "perc", modval = c(-1, 1),
                       data = d)

print(result)
plot(result, stat = "rtanh", ylim = c(-0.05, 0.05))
summary(result, skew = FALSE, kurt = FALSE, coskew = TRUE)

## Not run: 
# --- Larger bootstrap example
result <- cdda.vardist(m, pred = "x", mod = "z", B = 2000,
  modval = c(-1, 1), data = d)

print(result)
plot(result, stat = "rtanh", ylim = c(-0.05, 0.05))
summary(result, skew = FALSE, kurt = FALSE, coskew = TRUE)

## End(Not run)
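
The pick-a-point approach mentioned for numeric modval values can be sketched with plain lm fits. This is an illustration under assumptions, not the package's implementation: the data-generating model with an x-by-z interaction is invented for the example, and the sketch only shows how a conditional slope of x is obtained at a chosen moderator value.

```r
set.seed(321)
n <- 700
z <- rnorm(n)
x <- rchisq(n, df = 4) - 4
y <- 0.5 * x + 0.3 * x * z + rchisq(n, df = 3) - 3   # assumed moderated effect
d <- data.frame(x, y, z)

m <- lm(y ~ x * z, data = d)
b <- coef(m)

# Conditional (simple) slope of x at the selected moderator values.
modval <- c(-1, 1)
slopes <- b[["x"]] + b[["x:z"]] * modval

# Equivalent route: re-centre the moderator at each value and read off the
# coefficient of x, which is then the slope at z = m0.
slopes_recentred <- sapply(modval, function(m0) {
  coef(lm(y ~ x * I(z - m0), data = d))[["x"]]
})

cbind(modval, slopes, slopes_recentred)
```

Both columns agree up to numerical precision, which is why probing at a moderator value can be done by re-centring the moderator before fitting.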

Direction Dependence Analysis: Independence Properties

Description

dda.indep evaluates asymmetries of predictor-error independence of causally competing models (y ~ x vs. x ~ y). print returns DDA test statistics associated with dda.indep objects.

Usage

dda.indep(
  formula,
  pred = NULL,
  data = list(),
  nlfun = NULL,
  hetero = FALSE,
  hsic.method = "gamma",
  diff = FALSE,
  B = 200,
  boot.type = "perc",
  conf.level = 0.95,
  parallelize = FALSE,
  cores = 1
)

## S3 method for class 'dda.indep'
print(x, ...)

Arguments

formula

Symbolic formula of the model to be tested or an lm object.

pred

A character indicating the variable name of the predictor which serves as the outcome in the alternative model.

data

An optional data frame containing the variables in the model (by default variables are taken from the environment which dda.indep is called from).

nlfun

Either a numeric value or a function of .Primitive type used for non-linear correlation tests. When nlfun is numeric the value is used in a power transformation.

hetero

A logical value indicating whether separate homoscedasticity tests (i.e., standard and robust Breusch-Pagan tests) should be computed.

hsic.method

A character indicating the inference method for the Hilbert-Schmidt Independence Criterion (HSIC). Must be one of c("gamma", "eigenvalue", "bootstrap", "permutation"). hsic.method = "gamma" is the default.

diff

A logical value indicating whether differences in HSIC, Distance Correlation (dCor), and Mutual Information (MI) values should be computed. Bootstrap confidence intervals are computed using B bootstrap samples.

B

Number of permutation replicates for separate dCor tests, and number of resamples when hsic.method is "bootstrap" or "permutation", or when diff = TRUE.

boot.type

A character indicating the type of bootstrap confidence intervals. Must be one of c("perc", "bca"). boot.type = "perc" is the default.

conf.level

Confidence level for bootstrap confidence intervals.

parallelize

A logical value indicating whether bootstrapping is performed on multiple cores. Only used if diff = TRUE.

cores

A numeric value indicating the number of cores. Only used if parallelize = TRUE.

x

An object of class dda.indep when using print.

...

Additional arguments to be passed to the function.

Value

An object of class dda.indep containing the results of independence tests of Direction Dependence Analysis.

References

Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.

See Also

cdda.indep for a conditional version.

Examples

set.seed(123)
n <- 500
x <- rchisq(n, df = 4) - 4
e <- rchisq(n, df = 3) - 3
y <- 0.5 * x + e
d <- data.frame(x, y)

## --- quick example (small B for speed)
result <- dda.indep(y ~ x, pred = "x", data = d,
  nlfun = 2, B = 10, hetero = TRUE, diff = TRUE,
  parallelize = FALSE, cores = 2)

print(result)

## Not run: 
# --- Larger bootstrap example
result <- dda.indep(y ~ x, pred = "x", data = d,
  nlfun = 2, B = 2000, hetero = TRUE, diff = TRUE,
  parallelize = FALSE, cores = 2)

print(result)

## End(Not run)
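
The asymmetry that dda.indep looks for can be illustrated without the package. The sketch below assumes a skewed predictor and a normal error (a simplification chosen for the example) and correlates the squared predictor with the residuals in each candidate model. Which transformed pairs the package correlates when nlfun = 2 is not stated here, so the pairing used is an assumption.

```r
set.seed(123)
n <- 5000
x <- rchisq(n, df = 4) - 4        # skewed "true" predictor
e <- rnorm(n, sd = sqrt(8))       # assumed normal error, independent of x
y <- x + e

res_target <- resid(lm(y ~ x))    # target model:      y ~ x
res_alt    <- resid(lm(x ~ y))    # alternative model: x ~ y

# Non-linear correlation after a power transformation (power 2) of the predictor.
r_target <- cor(x^2, res_target)
r_alt    <- cor(y^2, res_alt)

c(target = r_target, alternative = r_alt)
```

Under these assumptions the target-model correlation is close to zero, while the alternative model shows a clearly non-zero correlation, because its residuals still carry the skewed component of x.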

Direction Dependence Analysis: Residual Distributions

Description

dda.resdist evaluates patterns of asymmetry of error distributions of causally competing models (y ~ x vs. x ~ y). print returns DDA test statistics associated with dda.resdist objects.

Usage

dda.resdist(
  formula,
  pred = NULL,
  data = list(),
  B = 200,
  boot.type = "perc",
  prob.trans = FALSE,
  conf.level = 0.95
)

## S3 method for class 'dda.resdist'
print(x, ...)

Arguments

formula

Symbolic formula of the target model to be tested or an lm object.

pred

Variable name of the predictor which serves as the outcome in the alternative model.

data

An optional data frame containing the variables in the model (by default variables are taken from the environment which dda.resdist is called from).

B

Number of bootstrap samples.

boot.type

A character indicating the type of bootstrap confidence intervals. Must be one of c("perc", "bca"). boot.type = "perc" is the default.

prob.trans

A logical value indicating whether a probability integral transformation should be performed prior to computation of skewness and kurtosis tests.

conf.level

Confidence level for bootstrap confidence intervals.

x

An object of class dda.resdist when using print.

...

Additional arguments to be passed to the method.

Value

An object of class dda.resdist containing the results of direction dependence tests of error distributions.

References

Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.

Examples

set.seed(123)
n <- 500
x <- rchisq(n, df = 4) - 4
e <- rchisq(n, df = 3) - 3
y <- 0.5 * x + e
d <- data.frame(x, y)

result <- dda.resdist(y ~ x, pred = "x", data = d,
            B = 50, conf.level = 0.90, prob.trans = TRUE)

print(result)

## Not run: 
# --- Larger bootstrap example
result <- dda.resdist(y ~ x, pred = "x", data = d,
                      B = 2000, conf.level = 0.90)

print(result)

## End(Not run)
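
The roles of B, boot.type = "perc", and conf.level can be sketched with a hand-rolled percentile bootstrap for a difference in residual skewness. Everything here is an assumption made for illustration: skew and skew_diff are hypothetical helpers, the error is taken to be normal, and the statistic need not match the one dda.resdist reports.

```r
# Moment-based sample skewness (hypothetical helper).
skew <- function(v) { v <- v - mean(v); mean(v^3) / mean(v^2)^1.5 }

# Absolute residual skewness: alternative model minus target model.
skew_diff <- function(d) {
  abs(skew(resid(lm(x ~ y, data = d)))) - abs(skew(resid(lm(y ~ x, data = d))))
}

set.seed(123)
n <- 2000
x <- rchisq(n, df = 4) - 4
y <- x + rnorm(n, sd = sqrt(8))   # assumed normal error
d <- data.frame(x, y)

B <- 200
conf.level <- 0.90
boot <- replicate(B, skew_diff(d[sample.int(n, replace = TRUE), ]))

alpha <- 1 - conf.level
ci <- quantile(boot, c(alpha / 2, 1 - alpha / 2))   # percentile interval

c(estimate = skew_diff(d), lower = ci[[1]], upper = ci[[2]])
```

A positive difference means the residuals of y ~ x are closer to symmetric than those of x ~ y, which is the pattern expected in this setup when x is the cause.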

Direction Dependence Analysis: Variable Distributions

Description

dda.vardist evaluates patterns of asymmetry of variable distributions for causally competing models (y ~ x vs. x ~ y). print returns DDA test statistics associated with dda.vardist objects.

Usage

dda.vardist(
  formula,
  pred = NULL,
  data = list(),
  B = 200,
  boot.type = "perc",
  conf.level = 0.95
)

## S3 method for class 'dda.vardist'
print(x, ...)

Arguments

formula

Symbolic formula of the model to be tested or an lm object.

pred

Variable name of the predictor which serves as the outcome in the alternative model.

data

An optional data frame containing the variables in the model (by default variables are taken from the environment which dda.vardist is called from).

B

Number of bootstrap samples.

boot.type

A character indicating the type of bootstrap confidence intervals. Must be one of c("perc", "bca"). boot.type = "perc" is the default.

conf.level

Confidence level for bootstrap confidence intervals.

x

An object of class dda.vardist when using print.

...

Additional arguments to be passed to the function.

Value

An object of class dda.vardist containing the results of direction dependence tests of variable distributions.

References

Wiedermann, W., & von Eye, A. (2025). Direction Dependence Analysis: Foundations and Statistical Methods. Cambridge, UK: Cambridge University Press.

See Also

cdda.vardist for a conditional version.

Examples

set.seed(123)
n <- 500
x <- rchisq(n, df = 4) - 4
e <- rchisq(n, df = 3) - 3
y <- 0.5 * x + e
d <- data.frame(x, y)

result <- dda.vardist(y ~ x, pred = "x", data = d,
                      boot.type = "perc", B = 100)

print(result)

## Not run: 
# --- Larger bootstrap example
result <- dda.vardist(y ~ x, pred = "x", data = d, B = 2000)

print(result)

## End(Not run)
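
The distributional asymmetry behind dda.vardist can be checked by hand in a simplified setting. The sketch assumes a normal error that is independent of x; in that case the skewness of the outcome equals the cubed correlation times the skewness of the predictor, so the outcome is less skewed than the predictor. The skew helper is hypothetical and is not the package's estimator.

```r
# Moment-based sample skewness (hypothetical helper).
skew <- function(v) { v <- v - mean(v); mean(v^3) / mean(v^2)^1.5 }

set.seed(123)
n <- 5000
x <- rchisq(n, df = 4) - 4
y <- 0.5 * x + rnorm(n)           # assumed normal error, independent of x

rho <- cor(x, y)
c(skew_x = skew(x), skew_y = skew(y), rho3_skew_x = rho^3 * skew(x))
```

The last two entries should be close to each other, and both are smaller in magnitude than skew_x.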

Compute the Empirical HSIC

Description

hsic computes the empirical Hilbert-Schmidt Independence Criterion between two variables X and Y using the biased V-statistic $(1/n^2)\,\mathrm{tr}(\tilde{K}_X \tilde{K}_Y)$. All kernel and centering computations are performed in C++.

Usage

hsic(
  x,
  y,
  kernel_x = "gaussian",
  kernel_y = kernel_x,
  bandwidth_x = NULL,
  bandwidth_y = NULL,
  degree = 2L,
  coef0 = 1
)

Arguments

x

A numeric vector of length n or matrix (n x p).

y

A numeric vector of length n or matrix (n x q).

kernel_x

Kernel for X. One of c("gaussian", "laplace", "linear", "polynomial"). Default is "gaussian".

kernel_y

Kernel for Y. Defaults to kernel_x.

bandwidth_x

Bandwidth for the X kernel. The median heuristic (median_bandwidth) is the default and is applied when bandwidth_x = NULL or bandwidth_x = "median". Alternatively, a strictly positive numeric value is used directly as $\sigma^2$ for the Gaussian kernel $K(x,y) = \exp(-\|x-y\|^2 / (2\sigma^2))$.

bandwidth_y

Bandwidth for the Y kernel. Same options as bandwidth_x; the median heuristic is the default.

degree

Integer degree for the polynomial kernel. Default 2.

coef0

Constant term for the polynomial kernel. Default 1.

Details

The Gaussian kernel uses $K(x,y) = \exp(-\|x-y\|^2 / (2\sigma^2))$, where bandwidth = $\sigma^2$. The median heuristic sets $\sigma^2$ to the median of all strictly positive squared pairwise distances, matching the convention in the dHSIC package.

Value

A single non-negative numeric value: the raw HSIC estimate $(1/n^2)\,\mathrm{tr}(\tilde{K}_X \tilde{K}_Y)$. With characteristic kernels, a population HSIC of zero implies that X and Y are independent.

References

Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., & Smola, A. J. (2008). A kernel statistical test of independence. Advances in Neural Information Processing Systems, 20.

Peters, J., Pfister, N., & Mooij, J. M. (2022). dHSIC: Independence Testing via Hilbert Schmidt Independence Criterion. R package version 2.1. https://CRAN.R-project.org/package=dHSIC

See Also

hsic.test, median_bandwidth

Examples

set.seed(12)
x <- rnorm(100)
hsic(x, rnorm(100))
hsic(x, x + rnorm(100, sd = 0.5))
hsic(x, rnorm(100), bandwidth_x = 1, bandwidth_y = 1)

HSIC Independence Test

Description

hsic.test tests whether X and Y are independent using the Hilbert-Schmidt Independence Criterion. The test statistic is $n \times \widehat{\mathrm{HSIC}}$, labelled "HSIC", and is the same for all four inference methods.

Available null-distribution methods:

"gamma"

Fits a Gamma distribution to the first two null moments of HSIC analytically (Peters, Pfister & Mooij, 2022). No resampling required.

"permutation"

Permutes the row index of X to simulate the null distribution entirely in C++.

"eigenvalue"

Derives the null distribution from the eigenvalue spectrum of the centered kernel matrices (Zhang et al., 2011). More accurate in small samples; uses B Monte Carlo draws. Reports $n \times \widehat{\mathrm{HSIC}}$ with the p-value taken from the spectral null.

"bootstrap"

Independently resamples rows of the kernel matrices with replacement in C++ (Peters, Pfister & Mooij, 2022).

Usage

hsic.test(
  x,
  y,
  method = c("gamma", "permutation", "eigenvalue", "bootstrap"),
  kernel_x = "gaussian",
  kernel_y = kernel_x,
  bandwidth_x = NULL,
  bandwidth_y = NULL,
  degree = 2L,
  coef0 = 1,
  B = 1000L
)

Arguments

x

A numeric vector of length n or matrix (n x p).

y

A numeric vector of length n or matrix (n x q).

method

Inference method for the null distribution. One of c("gamma", "permutation", "eigenvalue", "bootstrap"). Default is "gamma".

kernel_x

Kernel for X. One of c("gaussian", "laplace", "linear", "polynomial"). Default is "gaussian".

kernel_y

Kernel for Y. Defaults to kernel_x.

bandwidth_x

Bandwidth ($\sigma^2$) for the X kernel. The median heuristic is the default and is applied when bandwidth_x = NULL. A strictly positive numeric value is used directly.

bandwidth_y

Bandwidth for the Y kernel. Same options as bandwidth_x; the median heuristic is the default.

degree

Integer degree for the polynomial kernel. Default 2.

coef0

Constant term for the polynomial kernel. Default 1.

B

Number of permutation or bootstrap replicates, or Monte Carlo draws for method = "eigenvalue". Ignored for method = "gamma". Default is 1000.

Details

All four methods report $n \times \widehat{\mathrm{HSIC}}$ as the test statistic. Permutation and bootstrap p-values use the Laplace correction $(\#\{T_b \geq T_{\mathrm{obs}}\} + 1) / (B + 1)$.

Value

An object of class "htest" with components:

statistic

The test statistic $n \times \widehat{\mathrm{HSIC}}$, labelled "HSIC".

p.value

P-value for the test of independence.

bandwidths

Resolved bandwidths c(x, y).

References

Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., & Smola, A. J. (2008). A kernel statistical test of independence. Advances in Neural Information Processing Systems, 20.

Peters, J., Pfister, N., & Mooij, J. M. (2022). dHSIC: Independence Testing via Hilbert Schmidt Independence Criterion. R package version 2.1. https://CRAN.R-project.org/package=dHSIC

Zhang, K., Peters, J., Janzing, D., & Schölkopf, B. (2011). Kernel-based conditional independence test and application in causal discovery. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI 2011) (pp. 804-813).

See Also

hsic, median_bandwidth

Examples

set.seed(7)
n <- 80
x <- rnorm(n)
hsic.test(x, rnorm(n), method = "gamma")
hsic.test(x, rnorm(n), method = "permutation", B = 499)
hsic.test(x, x + rnorm(n), method = "permutation", B = 499)

Median Heuristic Bandwidth

Description

Returns the median of all strictly positive squared pairwise Euclidean distances, interpreted as the variance parameter $\sigma^2$ for the Gaussian kernel $K(x,y) = \exp(-\|x-y\|^2 / (2\sigma^2))$. This is the same convention used in the dHSIC source code (Peters et al., 2022).

Usage

median_bandwidth(x)

Arguments

x

A numeric vector or matrix of observations (n rows).

Value

A single strictly positive numeric: the bandwidth $\sigma^2$, ready to pass as bandwidth_x or bandwidth_y. Returns 1 when all pairwise distances are zero.

References

Gretton, A., Fukumizu, K., Teo, C. H., Song, L., Schölkopf, B., & Smola, A. J. (2008). A kernel statistical test of independence. Advances in Neural Information Processing Systems, 20.

Peters, J., Pfister, N., & Mooij, J. M. (2022). dHSIC: Independence Testing via Hilbert Schmidt Independence Criterion. R package version 2.1. https://CRAN.R-project.org/package=dHSIC

See Also

hsic.test

Examples

x <- rnorm(80)
median_bandwidth(x)
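
The rule stated in the Description and Value sections can be written out in a few lines of base R. median_bandwidth_sketch is a hypothetical re-implementation for illustration; it follows the documented behaviour but is not the package's code.

```r
# Hypothetical helper: median of strictly positive squared pairwise distances.
median_bandwidth_sketch <- function(x) {
  d2 <- as.numeric(dist(x))^2       # squared Euclidean distances, each pair once
  d2 <- d2[d2 > 0]                  # keep strictly positive distances only
  if (length(d2) == 0) return(1)    # all observations identical
  median(d2)
}

median_bandwidth_sketch(c(0, 1, 3))   # squared distances 1, 9, 4 -> 4
median_bandwidth_sketch(rep(2, 5))    # all distances zero -> 1
```

Counting each pair once or twice gives the same median, so working from the lower triangle returned by dist is sufficient.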