Multivariate Distributions

Multivariate probability distributions and correlation utilities.

Distribution Objects

File containing the copulAX implementation of the multivariate normal distribution.

class copulax._src.multivariate.mvt_normal.MvtNormal(name='Mvt-Normal', *, mu=None, sigma=None)[source]

The multivariate normal / Gaussian distribution is a generalization of the univariate normal distribution to d > 1 dimensions.

https://en.wikipedia.org/wiki/Multivariate_normal_distribution

\[f(x|\mu, \Sigma) = \frac{1}{(2\pi)^{d/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)\]

where \(\mu\) is the mean vector and \(\Sigma\) the variance-covariance matrix of the data distribution.

Parameters:

mu (Array)
sigma (Array)

mu: Array = None

sigma: Array = None

example_params(dim=3, *args, **kwargs)[source]

Example parameters for the multivariate normal distribution.

This is a two parameter family, defined by the mean / location vector mu and the variance-covariance matrix sigma.

Parameters: dim (int) – Number of dimensions of the multivariate normal distribution. Default is 3.
Return type: dict

support(params=None)[source]

Return the support of the distribution: (-inf, inf) per dimension.

Return type: Array
Parameters: params (dict)

logpdf(x, params=None)[source]

Log-probability density function of the multivariate normal.

Parameters:

x (Union[Array, ndarray, bool, number, bool, int, float, complex]) – Input data of shape (n, d).
params (dict) – Distribution parameters with keys ‘mu’ and ‘sigma’.

Return type:

Array

Returns:

Array of log-density values with shape (n, 1).
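The density formula above can be evaluated stably through a Cholesky factor of sigma. The following is a minimal NumPy sketch of that calculation, not the copulAX implementation; the helper name mvn_logpdf is hypothetical, and the (n, 1) output shape simply mirrors the shape documented above.

```python
import numpy as np

def mvn_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma) at each row of x (hypothetical helper)."""
    x = np.atleast_2d(x)
    d = mu.shape[0]
    chol = np.linalg.cholesky(sigma)             # sigma = chol @ chol.T
    # Solve chol @ z = (x - mu).T; the squared norm of z is the Mahalanobis term.
    z = np.linalg.solve(chol, (x - mu).T)        # shape (d, n)
    maha = np.sum(z**2, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    out = -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)
    return out[:, None]                          # shape (n, 1)

mu = np.zeros(2)
sigma = np.eye(2)
vals = mvn_logpdf(np.array([[0.0, 0.0], [1.0, 1.0]]), mu, sigma)
```

With an identity covariance the first value reduces to -log(2*pi), which makes a convenient sanity check.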

rvs(size, params=None, key=None)[source]

Generate random samples from the multivariate normal.

Parameters:

size (int) – Number of samples to draw.
params (dict) – Distribution parameters.
key – JAX random key.

Return type:

Array

Returns:

Array of shape (size, d).
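One common way to draw such samples is to colour standard normal draws with a Cholesky factor of sigma. The sketch below uses NumPy's random generator rather than a JAX key, and the helper name mvn_rvs is hypothetical; whether copulAX samples this way internally is an assumption.

```python
import numpy as np

def mvn_rvs(size, mu, sigma, rng):
    """Draw `size` samples from N(mu, sigma) (hypothetical helper)."""
    d = mu.shape[0]
    chol = np.linalg.cholesky(sigma)             # sigma = chol @ chol.T
    z = rng.standard_normal((size, d))           # independent N(0, 1) draws
    return mu + z @ chol.T                       # rows have covariance sigma

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
samples = mvn_rvs(50_000, mu, sigma, rng)        # shape (50000, 2)
```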

stats(params=None)[source]

Compute distribution statistics (mean, median, mode, cov, skewness).

Return type: dict
Parameters: params (dict)

fit(x, sigma_method='pearson', *args, name=None, **kwargs)[source]

Fit the multivariate normal to data via closed-form MLE: \(\hat\mu = \operatorname{mean}(x)\), averaged over the n observations (the rows of x), and \(\hat\Sigma\) via copulax.multivariate.cov() using the estimator chosen by sigma_method.

Note

If you intend to jit wrap this function, ensure that sigma_method is a static argument.

Parameters:

x (Union[Array, ndarray, bool, number, bool, int, float, complex]) – Input data of shape (n, d).
sigma_method (str) – Covariance estimator name forwarded to copulax.multivariate.cov() (default 'pearson').
name (str) – Optional custom name for the fitted instance.

Returns:

A fitted MvtNormal instance.

Return type:

MvtNormal
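The closed-form estimates for the default 'pearson' case can be sketched in plain NumPy. This is an illustration only: fit_mvn is a hypothetical helper that returns a plain dict with the 'mu' and 'sigma' keys rather than a fitted instance, and it uses the ddof=1 sample covariance (the strict maximum-likelihood estimate would divide by n instead).

```python
import numpy as np

def fit_mvn(x):
    """Closed-form mean and sample covariance of data with shape (n, d)."""
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean(axis=0)                      # average over observations, shape (d,)
    sigma_hat = np.cov(x, rowvar=False)          # ddof=1 sample covariance, shape (d, d)
    return {"mu": mu_hat, "sigma": sigma_hat}

data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]])
params = fit_mvn(data)
```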

File containing the copulAX implementation of the multivariate student-t distribution.

class copulax._src.multivariate.mvt_student_t.MvtStudentT(name='Mvt-Student-T', *, nu=None, mu=None, sigma=None)[source]

The multivariate student-t distribution is a generalization of the univariate student-t distribution to d > 1 dimensions.

https://en.wikipedia.org/wiki/Multivariate_t-distribution

\(\mu\) is the mean vector and \(\Sigma\) the shape matrix, which in this parameterization is not the variance-covariance matrix of the data distribution (for \(\nu > 2\) the covariance is \(\frac{\nu}{\nu - 2}\Sigma\)). \(\nu\) is the degrees of freedom parameter.

Parameters:

nu (Array)
mu (Array)
sigma (Array)

nu: Array = None

mu: Array = None

sigma: Array = None

example_params(dim=3, *args, **kwargs)[source]

Example parameters for the multivariate student-t distribution.

This is a three parameter family, defined by the degrees of freedom scalar nu, the mean / location vector mu and the shape matrix sigma.

Parameters: dim (int) – Number of dimensions of the multivariate student-t distribution. Default is 3.
Return type: dict

support(params=None)[source]

Return the support: (-inf, inf) per dimension.

Return type: Array
Parameters: params (dict)

rvs(size, params=None, key=None)[source]

Generate random samples via the normal-variance mixture.

Sampling uses an inverse-gamma mixing variable W and the base class normal-variance mixture sampler.

Parameters:

size (int) – Number of samples to draw.
params (dict) – Distribution parameters.
key (Union[Array, ndarray, bool, number, bool, int, float, complex]) – JAX random key.

Return type:

Array

Returns:

Array of shape (size, d).
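The normal-variance mixture construction can be sketched as follows, assuming the standard representation X = mu + sqrt(W) A Z with W ~ InverseGamma(nu/2, nu/2) and A a Cholesky factor of sigma. The helper name mvt_rvs is hypothetical and NumPy's generator stands in for a JAX key.

```python
import numpy as np

def mvt_rvs(size, nu, mu, sigma, rng):
    """Multivariate student-t draws via an inverse-gamma variance mixture."""
    d = mu.shape[0]
    chol = np.linalg.cholesky(sigma)
    # W ~ InverseGamma(nu/2, nu/2): reciprocal of a Gamma(shape=nu/2, rate=nu/2) draw.
    w = 1.0 / rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(size, 1))
    z = rng.standard_normal((size, d))
    return mu + np.sqrt(w) * (z @ chol.T)

rng = np.random.default_rng(0)
nu = 8.0
mu = np.zeros(2)
sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
samples = mvt_rvs(200_000, nu, mu, sigma, rng)
# For nu > 2 the sample covariance approaches nu / (nu - 2) * sigma, not sigma itself.
sample_cov = np.cov(samples, rowvar=False)
```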

stats(params=None)[source]

Compute distribution statistics (mean, median, mode, cov, skewness).

Return type: dict
Parameters: params (dict)

File containing the copulAX implementation of the multivariate skewed-T distribution.

class copulax._src.multivariate.mvt_skewed_t.MvtSkewedT(name='Mvt-Skewed-T', *, nu=None, mu=None, gamma=None, sigma=None)[source]

The multivariate skewed-T distribution is a generalization of the univariate skewed-T distribution to d > 1 dimensions, which is itself a generalization of the student-t distribution that allows for skewness. It can also be expressed as a limiting case of the multivariate generalized hyperbolic (GH) distribution as psi -> 0 with lamb = -0.5*chi.

We use the four-parameter McNeil et al. (2005) specification of the distribution.

Parameters:

nu (Array)
mu (Array)
gamma (Array)
sigma (Array)

nu: Array = None

mu: Array = None

gamma: Array = None

sigma: Array = None

example_params(dim=3, *args, **kwargs)[source]

Example parameters for the multivariate skewed-t distribution.

Parameters: dim (int) – Number of dimensions. Default is 3.

support(params=None)[source]

Return the support: (-inf, inf) per dimension.

Return type: Array
Parameters: params (dict)

rvs(size, params=None, key=None)[source]

Generate random samples via the normal-variance mixture.

Parameters:

size (int) – Number of samples to draw.
params (dict) – Distribution parameters.
key (Union[Array, ndarray, bool, number, bool, int, float, complex]) – JAX random key.

Return type:

Array

Returns:

Array of shape (size, d).
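For the skewed-t case the mixing variable also shifts the location. A minimal sketch, assuming the normal mean-variance mixture form X = mu + W gamma + sqrt(W) A Z with W ~ InverseGamma(nu/2, nu/2), is given below; skewed_t_rvs is a hypothetical name and the code is not taken from copulAX.

```python
import numpy as np

def skewed_t_rvs(size, nu, mu, gamma, sigma, rng):
    """Skewed-t draws via a normal mean-variance mixture (illustrative)."""
    d = mu.shape[0]
    chol = np.linalg.cholesky(sigma)
    w = 1.0 / rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(size, 1))
    z = rng.standard_normal((size, d))
    # W enters both the location (through gamma) and the scale (through sqrt(W)).
    return mu + w * gamma + np.sqrt(w) * (z @ chol.T)

rng = np.random.default_rng(0)
nu = 10.0
mu = np.array([0.0, 1.0])
gamma = np.array([0.5, -0.3])
sigma = np.eye(2)
samples = skewed_t_rvs(200_000, nu, mu, gamma, sigma, rng)
expected_mean = mu + nu / (nu - 2.0) * gamma     # E[W] = nu / (nu - 2) for nu > 2
```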

stats(params=None)[source]

Compute distribution statistics using inverse-gamma mixing moments.

Return type: dict
Parameters: params (dict)
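The role of the inverse-gamma mixing moments can be illustrated with the mean and covariance. The sketch below assumes W ~ InverseGamma(nu/2, nu/2), so that E[W] = nu / (nu - 2) and Var(W) = 2 nu^2 / ((nu - 2)^2 (nu - 4)); the function name and the restriction to nu > 4 are choices made for this example only.

```python
import numpy as np

def skewed_t_mean_cov(nu, mu, gamma, sigma):
    """Mean and covariance of the skewed-t from its mixing moments (nu > 4)."""
    if nu <= 4.0:
        raise ValueError("the covariance expression used here requires nu > 4")
    e_w = nu / (nu - 2.0)
    var_w = 2.0 * nu**2 / ((nu - 2.0) ** 2 * (nu - 4.0))
    mean = mu + e_w * gamma
    cov = e_w * sigma + var_w * np.outer(gamma, gamma)
    return mean, cov

mean, cov = skewed_t_mean_cov(
    6.0, np.zeros(2), np.array([1.0, 0.0]), np.eye(2)
)
```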

fit(x, method='em', cov_method='pearson', lr=0.1, maxiter=100, name=None)[source]

Fit the multivariate skewed-t distribution to data.

Note

If you intend to jit wrap this function, ensure that method and cov_method are static arguments.

Parameters:

x (Union[Array, ndarray, bool, number, bool, int, float, complex]) – Input data of shape (n, d).
method (str) – Fitting method. One of: 'em' — ECME algorithm (McNeil et al. 2005, Section 3.2.4); updates (mu, gamma, Sigma) in closed form via E-step sufficient statistics and nu via gradient descent; generally more robust and faster-converging than LDMLE (default); 'ldmle' — low-dimensional MLE via projected ADAM gradient descent, optimising (nu, gamma) while deriving (mu, Sigma) analytically from sample moments.
cov_method (str) – Covariance estimator used for initialisation (both methods) and throughout the LDMLE path. Forwarded to copulax.multivariate.cov().
lr (float) – Learning rate. Default 0.1 is tuned for EM; LDMLE may require a lower rate.
maxiter (int) – Maximum number of iterations.
name (str) – Optional custom name for the fitted instance.

Returns:

A fitted MvtSkewedT instance.

Return type:

MvtSkewedT

Raises:

ValueError – If method is not one of the accepted strings listed above.

File containing the copulAX implementation of the multivariate generalized hyperbolic (GH) distribution.

class copulax._src.multivariate.mvt_gh.MvtGH(name='Mvt-GH', *, lamb=None, chi=None, psi=None, mu=None, gamma=None, sigma=None)[source]

The multivariate generalized hyperbolic (GH) distribution is a generalization of the univariate GH distribution to d > 1 dimensions. This is a flexible, continuous 6-parameter family of distributions that can model a variety of data behaviors, including heavy tails and skewness. It contains a number of popular distributions as special cases, including the multivariate normal, multivariate student-t and multivariate skewed-T distributions.

We adopt the parameterization used by McNeil et al. (2005).

Parameters:

lamb (Array)
chi (Array)
psi (Array)
mu (Array)
gamma (Array)
sigma (Array)

lamb: Array = None

chi: Array = None

psi: Array = None

mu: Array = None

gamma: Array = None

sigma: Array = None

example_params(dim=3, *args, **kwargs)[source]

Example parameters for the multivariate GH distribution.

This is a six parameter family, defined by the scalar parameters lamb, chi, psi, the location vector mu, the skewness vector gamma and the shape matrix sigma.

Parameters: dim (int) – Number of dimensions of the multivariate GH distribution. Default is 3.
Return type: dict

support(params=None)[source]

Return the support: (-inf, inf) per dimension.

Return type: Array
Parameters: params (dict)

rvs(size, params=None, key=None)[source]

Generate random samples via the GIG normal-variance mixture.

Parameters:

size (int) – Number of samples to draw.
params (dict) – Distribution parameters.
key (Union[Array, ndarray, bool, number, bool, int, float, complex]) – JAX random key.

Return type:

Array

Returns:

Array of shape (size, d).

stats(params=None)[source]

Compute distribution statistics using GIG mixing moments.

Return type: dict
Parameters: params (dict)
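For reference, the mixing moments involved can be written out explicitly. This is a sketch of the standard normal mean-variance mixture identities in the McNeil et al. (2005) parameterization, assuming \(\chi > 0\) and \(\psi > 0\); it is not taken from the copulAX source. With \(W \sim \text{GIG}(\lambda, \chi, \psi)\) and \(K_\lambda\) the modified Bessel function of the second kind:

\[E[W^\alpha] = \left(\frac{\chi}{\psi}\right)^{\alpha/2} \frac{K_{\lambda + \alpha}(\sqrt{\chi\psi})}{K_\lambda(\sqrt{\chi\psi})}\]

\[E[X] = \mu + E[W]\,\gamma, \qquad \operatorname{cov}(X) = E[W]\,\Sigma + \operatorname{var}(W)\,\gamma\gamma^T\]

where \(\operatorname{var}(W) = E[W^2] - E[W]^2\).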

fit(x, method='em', cov_method='pearson', lr=0.1, maxiter=100, name=None)[source]

Fit the multivariate GH distribution to data.

Note

If you intend to jit wrap this function, ensure that method and cov_method are static arguments.

Parameters:

x (Union[Array, ndarray, bool, number, bool, int, float, complex]) – Input data of shape (n, d).
method (str) – Fitting method. One of: 'em' — ECME algorithm (McNeil et al. 2005, Section 3.2.4); updates (mu, gamma, Sigma) in closed form via E-step sufficient statistics and (lamb, chi, psi) via gradient descent; generally more robust and faster-converging than LDMLE (default); 'ldmle' — low-dimensional MLE via projected ADAM gradient descent, optimising (lamb, chi, psi, gamma) while deriving (mu, Sigma) analytically from sample moments.
cov_method (str) – Covariance estimator used for initialisation (both methods) and throughout the LDMLE path. Forwarded to copulax.multivariate.cov().
lr (float) – Learning rate. Default 0.1 is tuned for EM; LDMLE may require a lower rate.
maxiter (int) – Maximum number of iterations.
name (str) – Optional custom name for the fitted instance.

Returns:

A fitted MvtGH instance.

Return type:

MvtGH

Raises:

ValueError – If method is not one of the accepted strings listed above.

Correlation and Covariance

copulax._src.multivariate._shape.corr(x, method='pearson', **kwargs)[source]

Compute the correlation matrix of the input data.

Returns a symmetric, positive semi-definite matrix with unit diagonal and entries in [-1, 1].

Four base estimators are available, each optionally combined with one of two eigenvalue-denoising techniques:

Base estimators:

'pearson' — standard linear (Pearson) correlation.
'spearman' — Spearman rank correlation (Pearson applied to ranks).
'kendall' — Kendall’s tau, a concordance-based rank correlation. More robust to outliers than Pearson/Spearman.
'pp_kendall' — pseudo-Pearson Kendall: converts Kendall’s tau to Pearson via the elliptical identity \(\rho = \sin(\pi \tau / 2)\). Useful when variances/covariances are undefined or infinite (e.g. heavy-tailed elliptical distributions).
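The pseudo-Pearson conversion can be sketched with a direct pairwise Kendall's tau. Both helper names below are hypothetical, the tau estimate is the simple O(n^2) tau-a that assumes no ties, and the code is an illustration of the identity rather than the library's estimator.

```python
import numpy as np

def kendall_tau(a, b):
    """Kendall's tau-a for two 1-d samples without ties (pairwise version)."""
    n = a.shape[0]
    sa = np.sign(a[:, None] - a[None, :])
    sb = np.sign(b[:, None] - b[None, :])
    # Every unordered pair appears twice in the full matrix, hence n * (n - 1).
    return np.sum(sa * sb) / (n * (n - 1))

def pp_kendall(tau):
    """Map Kendall's tau to a Pearson-type correlation via sin(pi * tau / 2)."""
    return np.sin(np.pi * tau / 2.0)

rng = np.random.default_rng(1)
rho = 0.6
z = rng.standard_normal((2000, 2))
x = z[:, 0]
y = rho * z[:, 0] + np.sqrt(1.0 - rho**2) * z[:, 1]   # corr(x, y) = rho
rho_hat = pp_kendall(kendall_tau(x, y))               # estimate of rho
```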

Denoising variants (prefix + base estimator name):

'rm_*' — Rousseeuw-Molenberghs (1993) denoising. Clamps non-positive eigenvalues to delta (default 1e-5), then rescales to restore unit diagonal. Guarantees positive semi-definiteness. Use when the raw estimator may produce a non-PSD matrix (e.g. Kendall/Spearman on small samples).
'laloux_*' — Laloux et al. (1999) random-matrix-theory denoising. Eigenvalues inside the Marchenko-Pastur noise bulk are replaced by their mean; signal eigenvalues above the bulk upper bound \((1 + \sqrt{d/n})^2\) are preserved. Use when n/d is moderate and you want to separate signal from sampling noise.
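The Laloux-style step can be sketched as an eigenvalue replacement followed by a rescale to unit diagonal. This is a simplified illustration under stated assumptions: laloux_denoise is a hypothetical name, the final rescale is a choice made for this sketch, and the delta eigenvalue floor is omitted.

```python
import numpy as np

def laloux_denoise(corr, n):
    """Replace eigenvalues inside the Marchenko-Pastur bulk by their mean."""
    d = corr.shape[0]
    vals, vecs = np.linalg.eigh(corr)
    lam_max = (1.0 + np.sqrt(d / n)) ** 2        # upper edge of the noise bulk
    noise = vals <= lam_max
    if noise.any():
        vals = np.where(noise, vals[noise].mean(), vals)
    cleaned = (vecs * vals) @ vecs.T             # rebuild from adjusted spectrum
    scale = 1.0 / np.sqrt(np.diag(cleaned))
    return cleaned * np.outer(scale, scale)      # restore unit diagonal

rng = np.random.default_rng(2)
n, d = 500, 10
factor = rng.standard_normal((n, 1))
x = 0.7 * factor + rng.standard_normal((n, d))   # one common factor plus noise
raw = np.corrcoef(x, rowvar=False)
clean = laloux_denoise(raw, n)
```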

Both denoising methods accept a delta keyword argument (default 1e-5) controlling the eigenvalue floor.
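The effect of the eigenvalue floor is easiest to see on a matrix that is not positive semi-definite. The sketch below follows the clamp-then-rescale description given for the 'rm_*' variants; rm_denoise is a hypothetical helper and details may differ from the library's implementation.

```python
import numpy as np

def rm_denoise(corr, delta=1e-5):
    """Clamp non-positive eigenvalues to delta, then restore unit diagonal."""
    vals, vecs = np.linalg.eigh(corr)
    vals = np.where(vals <= 0.0, delta, vals)
    fixed = (vecs * vals) @ vecs.T
    scale = 1.0 / np.sqrt(np.diag(fixed))
    return fixed * np.outer(scale, scale)

# A symmetric matrix with unit diagonal and one negative eigenvalue.
raw = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])
fixed = rm_denoise(raw)
```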

Parameters:

x (Union[Array, ndarray, bool, number, bool, int, float, complex]) – Input data of shape (n, d) where n is the number of observations and d is the number of variables.
method (str) – Correlation method. One of 'pearson', 'spearman', 'kendall', 'pp_kendall', 'rm_pearson', 'rm_spearman', 'rm_kendall', 'rm_pp_kendall', 'laloux_pearson', 'laloux_spearman', 'laloux_kendall', 'laloux_pp_kendall'.
**kwargs – Passed to the underlying method (e.g. delta for denoised variants).

Returns:

Correlation matrix of shape (d, d).

Return type:

Array

Raises:

ValueError – If method is not a recognised method name.

Note

If you intend to jit wrap this function, ensure that method is a static argument.

copulax._src.multivariate._shape.cov(x, method='pearson', **kwargs)[source]

Compute the covariance matrix of the input data.

Constructs the covariance matrix as \(\Sigma = D \, R \, D\) where \(R\) is the correlation matrix from corr() and \(D = \text{diag}(\hat\sigma)\) is the diagonal matrix of sample standard deviations (ddof=1).

When method='pearson' this is equivalent to the standard sample covariance matrix (i.e. numpy.cov(x, rowvar=False)). For non-Pearson methods the result is a pseudo-covariance: sample variances combined with an alternative correlation estimator.
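The Pearson equivalence can be checked numerically with NumPy alone. The snippet below rebuilds the covariance from the Pearson correlation and the ddof=1 standard deviations and compares it with numpy.cov; it does not call copulAX.

```python
import numpy as np

rng = np.random.default_rng(3)
mix = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 2.0]])
x = rng.standard_normal((200, 3)) @ mix          # correlated data, shape (n, d)

r = np.corrcoef(x, rowvar=False)                 # Pearson correlation R
s = x.std(axis=0, ddof=1)                        # sample standard deviations
sigma = np.diag(s) @ r @ np.diag(s)              # Sigma = D R D

matches = np.allclose(sigma, np.cov(x, rowvar=False))
```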

Parameters:

x (Union[Array, ndarray, bool, number, bool, int, float, complex]) – Input data of shape (n, d) where n is the number of observations and d is the number of variables.
method (str) – Correlation method passed to corr(). See corr() for available options.
**kwargs – Passed to corr() (e.g. delta for denoised variants).

Returns:

Covariance matrix of shape (d, d).

Return type:

Array

Raises:

ValueError – If method is not a recognised method name.

Note

If you intend to jit wrap this function, ensure that method is a static argument.

copulax._src.multivariate._shape.random_correlation(size, key=None)[source]

Generate a random positive-definite correlation matrix.

Produces a symmetric matrix with unit diagonal, entries in [-1, 1], and strictly positive eigenvalues. Useful for testing, simulation, and initialisation of multivariate models.

Uses the factors method: \(C = W W^\top + D\) where \(W \sim \text{Uniform}(-1, 1)^{d \times d}\) and \(D\) is diagonal with entries in [0, 1]. The PSD matrix \(C\) is then rescaled to a correlation matrix via \(R_{ij} = C_{ij} / \sqrt{C_{ii} C_{jj}}\).
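The construction described above can be sketched in a few lines of NumPy. The function name is hypothetical, NumPy's generator replaces the JAX key, and drawing the diagonal entries uniformly on [0, 1] is an assumption, since the text only states their range.

```python
import numpy as np

def random_correlation_sketch(d, rng):
    """Random correlation matrix from C = W W^T + D, rescaled to unit diagonal."""
    w = rng.uniform(-1.0, 1.0, size=(d, d))
    diag = np.diag(rng.uniform(0.0, 1.0, size=d))   # assumed uniform on [0, 1]
    c = w @ w.T + diag
    scale = 1.0 / np.sqrt(np.diag(c))
    return c * np.outer(scale, scale)               # R_ij = C_ij / sqrt(C_ii C_jj)

rng = np.random.default_rng(4)
r = random_correlation_sketch(4, rng)
```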

Parameters:

size (int) – Dimension d of the (d, d) output matrix.
key (Array) – JAX PRNG key. If None, a key is generated automatically.

Returns:

Random correlation matrix of shape (size, size).

Return type:

Array

Note

If you intend to jit wrap this function, ensure that size is a static argument.

copulax._src.multivariate._shape.random_covariance(vars, key=None)[source]

Generate a random positive-definite covariance matrix with prescribed variances.

Constructs \(\Sigma = D \, R \, D\) where \(R\) is a random correlation matrix from random_correlation() and \(D = \text{diag}(\sqrt{\text{vars}})\). The diagonal of the output equals the input vars.

Parameters:

vars (Array) – Variances of each variable. A 1-d array of length d; the output shape will be (d, d).
key (Array) – JAX PRNG key. If None, a key is generated automatically.

Returns:

Random covariance matrix of shape (d, d).

Return type:

Array