Title: | Robust Estimation of Copulas by Maximum Mean Discrepancy |
---|---|
Description: | Provides functions for the robust estimation of parametric families of copulas using minimization of the Maximum Mean Discrepancy, following the article Alquier, Chérief-Abdellatif, Derumigny and Fermanian (2022) <doi:10.1080/01621459.2021.2024836>. |
Authors: | Alexis Derumigny [aut, cre] |
Maintainer: | Alexis Derumigny <[email protected]> |
License: | GPL-3 |
Version: | 0.2.1 |
Built: | 2025-02-09 05:30:45 UTC |
Source: | https://github.com/alexisderumigny/mmdcopula |
Confidence intervals for the estimated parameter of a bivariate parametric copula using MMD estimation
BiCopConfIntMMD( x1, x2, family, nResampling = 100, subsamplingSize = length(x1), corrSubSampling = TRUE, level = 0.95, ... )
x1 |
vector of observations of the first coordinate. |
x2 |
vector of observations of the second coordinate. |
family |
parametric family of copulas. Supported families are:
|
nResampling |
number of resampling times. |
subsamplingSize |
size of the subsample.
By default it is |
corrSubSampling |
this parameter is only used for subsampling-based confidence intervals.
If |
level |
the nominal confidence level. |
... |
other parameters to be given to |
a list with the confidence intervals CI.Tau for Kendall's tau and CI.Par for the corresponding parameter.
Alquier, P., Chérief-Abdellatif, B.-E., Derumigny, A., and Fermanian, J.D. (2022). Estimation of copulas via Maximum Mean Discrepancy. Journal of the American Statistical Association, doi:10.1080/01621459.2021.2024836.
Kojadinovic, I., and Stemikovskaya, K. (2019). Subsampling (weighted smooth) empirical copula processes. Journal of Multivariate Analysis, 173, 704-723, doi:10.1016/j.jmva.2019.05.007.
data = VineCopula::BiCopSim(N = 50, family = 1, par = 0.3)
result = BiCopConfIntMMD(x1 = data[,1], x2 = data[,2], family = 1,
  nResampling = 2, subsamplingSize = 10, niter = 10)

data_ = VineCopula::BiCopSim(N = 1000, family = 1, par = 0.3)
result_ = BiCopConfIntMMD(x1 = data_[,1], x2 = data_[,2], family = 1)
result_$CI.Tau
result_$CI.Par
Estimation of Marshall-Olkin copulas
BiCopEst.MO( u1, u2, method, par.start = 0.5, kernel = "gaussian.Phi", gamma = 0.95, alpha = 1, niter = 100, C_eta = 1, ndrawings = 10, naveraging = 1 )
u1 |
vector of observations of the first coordinate, in |
u2 |
vector of observations of the second coordinate, in |
method |
a character giving the name of the estimation method, among:
|
par.start |
starting parameter of the gradient descent.
(only used for |
kernel |
the kernel used in the MMD distance
(only used for
Each of these names can receive the suffix |
gamma |
parameter |
alpha |
parameter |
niter |
the stochastic gradient algorithm is composed of two phases:
a first "burn-in" phase and a second "averaging" phase.
If |
C_eta |
a multiplicative constant controlling the size of the gradient descent step.
The step size is then computed as |
ndrawings |
number of replicas of the stochastic estimate of the gradient
drawn at each step. The gradient is computed using the average of these replicas.
(only used for |
naveraging |
number of full runs of the stochastic gradient algorithm
that are averaged at the end to give the final estimated parameter.
(only used for |
the estimated parameter (alpha) of the Marshall-Olkin copula.
Alquier, P., Chérief-Abdellatif, B.-E., Derumigny, A., and Fermanian, J.D. (2022). Estimation of copulas via Maximum Mean Discrepancy. Journal of the American Statistical Association, doi:10.1080/01621459.2021.2024836.
BiCopSim.MO for the simulation of Marshall-Olkin copulas.
BiCopEstMMD for the estimation of other parametric copula families by MMD.
U <- BiCopSim.MO(n = 1000, alpha = 0.2)
estimatedPar <- BiCopEst.MO(u1 = U[,1], u2 = U[,2], method = "MMD", niter = 1, ndrawings = 1)
estimatedPar <- BiCopEst.MO(u1 = U[,1], u2 = U[,2], method = "MMD")
This function computes the MMD estimator of a bivariate copula family.
The computation is done through a stochastic gradient algorithm,
whose gradient estimates are computed by the function BiCopGradMMD().
The main arguments are the two vectors of observations, and the copula family.
The bivariate copula families are indexed in the same way as
in VineCopula::BiCop() (which computes the MLE estimator).
BiCopEstMMD( u1, u2, family, tau = NULL, par = NULL, par2 = NULL, kernel = "gaussian", gamma = "default", alpha = 1, niter = 100, C_eta = 1, epsilon = 1e-04, method = "QMCV", quasiRNG = "sobol", ndrawings = 10 )
u1 |
vector of observations of the first coordinate, in |
u2 |
vector of observations of the second coordinate, in |
family |
the chosen family of copulas
(see the documentation of the class |
tau |
the copula family can be parametrized by the parameter |
par |
if different from |
par2 |
initial value for the second parameter, if any. (Works only for Student copula). |
kernel |
the kernel used in the MMD distance:
it can be a function taking in parameter
Each of these names can receive the suffix |
gamma |
parameter |
alpha |
parameter |
niter |
the stochastic gradient algorithm is composed of two phases:
a first "burn-in" phase and a second "averaging" phase.
If |
C_eta |
a multiplicative constant controlling the size of the gradient descent step.
The step size is then computed as |
epsilon |
the differential of |
method |
the method of computing the stochastic gradient:
|
quasiRNG |
a function giving the quasi-random points in |
ndrawings |
number of replicas of the stochastic estimate of the gradient drawn at each step. The gradient is computed using the average of these replicas. |
an object of class VineCopula::BiCop() containing the estimated copula.
Alquier, P., Chérief-Abdellatif, B.-E., Derumigny, A., and Fermanian, J.D. (2022). Estimation of copulas via Maximum Mean Discrepancy. Journal of the American Statistical Association, doi:10.1080/01621459.2021.2024836.
VineCopula::BiCopEst() for other methods of estimation such as Maximum Likelihood Estimation or Inversion of Kendall's tau.
BiCopGradMMD() for the computation of the stochastic gradient.
BiCopEst.MO for the estimation of Marshall-Olkin copulas by MMD.
# Estimation of a bivariate Gaussian copula with correlation 0.5.
dataSampled = VineCopula::BiCopSim(N = 500, family = 1, par = 0.5)
estimator = BiCopEstMMD(u1 = dataSampled[,1], u2 = dataSampled[,2], family = 1, niter = 10)
estimator$par

# Estimation of a bivariate Student copula with correlation 0.5 and 5 degrees of freedom
dataSampled = VineCopula::BiCopSim(N = 1000, family = 2, par = 0.5, par2 = 5)
estimator = BiCopEstMMD(u1 = dataSampled[,1], u2 = dataSampled[,2], family = 2)
estimator$par
estimator$par2

# Comparison with maximum likelihood estimation with and without outliers
dataSampled = VineCopula::BiCopSim(N = 500, family = 1, par = 0.5)
estimatorMMD = BiCopEstMMD(u1 = dataSampled[,1], u2 = dataSampled[,2], family = 1)
estimatorMMD$par
estimatorMLE = VineCopula::BiCopEst(u1 = dataSampled[,1], u2 = dataSampled[,2],
  family = 1, method = "mle")
estimatorMLE$par

# Same comparison after replacing the first 10 observations by outliers
dataSampled[1:10,1] = 0.999
dataSampled[1:10,2] = 0.001
estimatorMMD = BiCopEstMMD(u1 = dataSampled[,1], u2 = dataSampled[,2], family = 1)
estimatorMMD$par
estimatorMLE = VineCopula::BiCopEst(u1 = dataSampled[,1], u2 = dataSampled[,2],
  family = 1, method = "mle")
estimatorMLE$par

# Estimation of a bivariate Gaussian copula with real data
data("daxreturns", package = "VineCopula")
estimator = BiCopEstMMD(u1 = daxreturns[,1], u2 = daxreturns[,2], family = 1)
estimator$par
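The two-phase scheme described for niter (a "burn-in" phase followed by an "averaging" phase) can be illustrated on a toy problem. The sketch below is written in base R and is not the package's code: the noisy gradient of a quadratic criterion, the step size C_eta / sqrt(i) and the name sgd_two_phase are all assumptions made for this illustration.

```r
# Sketch only: two-phase stochastic gradient descent on a toy criterion.
# noisy_grad() is a hypothetical stand-in for a stochastic gradient such as
# the one returned by BiCopGradMMD(); here it is the gradient of
# (theta - 0.3)^2 plus Gaussian noise.
noisy_grad <- function(theta) 2 * (theta - 0.3) + rnorm(1, sd = 0.1)

sgd_two_phase <- function(theta0, niter, C_eta = 1) {
  theta <- theta0
  # Phase 1 ("burn-in"): niter plain gradient steps, iterates are discarded
  for (i in seq_len(niter)) {
    theta <- theta - C_eta / sqrt(i) * noisy_grad(theta)
  }
  # Phase 2 ("averaging"): niter further steps, iterates are averaged
  running_sum <- 0
  for (i in seq_len(niter)) {
    theta <- theta - C_eta / sqrt(niter + i) * noisy_grad(theta)
    running_sum <- running_sum + theta
  }
  running_sum / niter
}

set.seed(1)
theta_hat <- sgd_two_phase(theta0 = 0.5, niter = 100)
theta_hat   # should be close to the minimizer 0.3
```

Averaging the iterates of the second phase smooths out the noise of individual gradient steps, which is why the returned value is usually more stable than the last iterate alone.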
This function computes a stochastic estimate of the gradient of the MMD criterion
for parametric estimation of a bivariate copula family.
The main arguments are the two vectors of observations, and the copula family.
The family is parametrized as in VineCopula::BiCop(),
using Kendall's tau instead of the first parameter.
This function is used by BiCopEstMMD() to perform parameter estimation
via MMD minimization.
BiCopGradMMD( u1, u2, family, tau, par = NULL, par2 = 0, kernel = "gaussian.Phi", gamma = 0.95, alpha = 1, epsilon = 1e-04, method = "QMCV", quasiRNG = "sobol", ndrawings = 10 )
u1 |
vector of observations of the first coordinate, in |
u2 |
vector of observations of the second coordinate, in |
family |
the chosen family of copulas
(see the documentation of the class |
tau |
the copula family can be parametrized by the parameter |
par |
if different from |
par2 |
value for the second parameter, if any. (Works only for Student copula). |
kernel |
the kernel used in the MMD distance:
it can be a function taking in parameter
Each of these names can receive the suffix |
gamma |
parameter |
alpha |
parameter |
epsilon |
the differential of |
method |
the method of computing the stochastic gradient:
|
quasiRNG |
a function giving the quasi-random points in |
ndrawings |
number of replicas of the stochastic estimate of the gradient drawn at each step. The gradient is computed using the average of these replicas. |
the value of the gradient.
Alquier, P., Chérief-Abdellatif, B.-E., Derumigny, A., and Fermanian, J.D. (2022). Estimation of copulas via Maximum Mean Discrepancy. Journal of the American Statistical Association, doi:10.1080/01621459.2021.2024836.
BiCopEstMMD() for the estimation of parametric bivariate copulas by
stochastic gradient descent on the MMD criterion.
# Simulation from a bivariate Gaussian copula with correlation 0.5.
dataSampled = VineCopula::BiCopSim(N = 500, family = 1, par = 0.5)

# Computation of the gradient of the MMD criterion at different points
# Gradient is small at the true parameter
BiCopGradMMD(dataSampled[,1], dataSampled[,2], family = 1, par = 0.5)
# Gradient is negative when below the parameter
BiCopGradMMD(dataSampled[,1], dataSampled[,2], family = 1, par = 0.1)
# and positive when above
BiCopGradMMD(dataSampled[,1], dataSampled[,2], family = 1, par = 0.8)
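To give an idea of the criterion whose gradient is estimated here, the following base R sketch computes a plain V-statistic estimate of the squared MMD between two bivariate samples. It is an illustration only: the kernel form k(x, y) = exp(-||x - y||^2 / gamma^2) and the names gauss_gram and mmd2_sketch are assumptions of this sketch, and the package may compute the criterion differently (for instance with quasi-random draws from the parametric copula).

```r
# Sketch only: squared MMD between two samples with a Gaussian kernel.
# gauss_gram() and mmd2_sketch() are hypothetical helper names.
gauss_gram <- function(A, B, gamma) {
  # squared Euclidean distances between every row of A and every row of B
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * A %*% t(B)
  exp(-pmax(d2, 0) / gamma^2)
}

mmd2_sketch <- function(X, Y, gamma = 0.95) {
  # E k(X, X') - 2 E k(X, Y) + E k(Y, Y'), estimated by sample means
  mean(gauss_gram(X, X, gamma)) -
    2 * mean(gauss_gram(X, Y, gamma)) +
    mean(gauss_gram(Y, Y, gamma))
}

set.seed(1)
X <- matrix(runif(1000), ncol = 2)   # sample from the independence copula
Y <- matrix(runif(1000), ncol = 2)   # second sample from the same copula
Z <- cbind(Y[, 1], Y[, 1])           # comonotone sample: a different copula

mmd2_sketch(X, Y)   # small: both samples come from the same copula
mmd2_sketch(X, Z)   # larger: the two copulas differ
```

Because the kernel is bounded, a few extreme observations can only shift this criterion by a limited amount, which is the intuition behind the robustness of MMD-based estimation.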
Convert between parameter and Kendall's tau for Marshall-Olkin copulas
BiCopPar2Tau.MO(par)
BiCopTau2Par.MO(tau)
par |
the parameter of the Marshall-Olkin copula |
tau |
the Kendall's tau of the Marshall-Olkin copula |
Either the Kendall's tau or the parameter of the Marshall-Olkin copula.
Nelsen, R. B. (2007). An introduction to copulas. Springer Science & Business Media. (Example 5.5)
BiCopPar2Tau.MO(par = 0.5)
BiCopTau2Par.MO(tau = 1/3)
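For a Marshall-Olkin copula with parameters (alpha, beta), Kendall's tau equals alpha * beta / (alpha - alpha * beta + beta) (Nelsen, Example 5.5). Assuming the package works with the symmetric one-parameter case alpha = beta, this reduces to tau = alpha / (2 - alpha), which agrees with the pair (par = 0.5, tau = 1/3) used in the example. The helpers below are hypothetical base R functions written for illustration, not the package's own code.

```r
# Sketch only, assuming the symmetric Marshall-Olkin copula (alpha = beta).
# par2tau_mo() and tau2par_mo() are hypothetical names.
par2tau_mo <- function(par) par / (2 - par)        # tau = alpha / (2 - alpha)
tau2par_mo <- function(tau) 2 * tau / (1 + tau)    # inverse of the map above

par2tau_mo(0.5)               # 1/3
tau2par_mo(1/3)               # 0.5
tau2par_mo(par2tau_mo(0.8))   # round trip, gives back 0.8
```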
This function uses the numerical integration procedure
cubature::hcubature()
to numerically integrate the distance between
the distribution functions or between the densities of two bivariate copulas.
BiCopParamDistLp( family, par, par_p, par2 = par, par2_p = par_p, family_p = family, p, type, maxEval = 0, truncVal = 0 )
family |
family of the first copula. |
par |
first parameter of the first copula. |
par_p |
first parameter of the second copula. |
par2 |
second parameter of the first copula (only useful for two-parameter families of copulas). |
par2_p |
second parameter of the second copula (only useful for two-parameter families of copulas). |
family_p |
family of the second copula. |
p |
determines the |
type |
type of the functions considered.
Can be |
maxEval |
maximum number of evaluations of the function to be integrated.
If 0, then no maximum limit is given. (Only used if |
truncVal |
the distance is computed using the supremum or the integral
of the function on |
If p < Inf, it returns a list of four items:
distance: the value of the distance.
integral: the value of the integral, which is the p-th power of the distance.
error: the estimated relative error of the integral.
returnCode: the integer return code of the C routine called by cubature::hcubature(). This should be 0 if there is no error.
If p = Inf, it returns a list of two items:
distance: the maximum difference between the two copulas (respectively, between the two copula densities).
u_max: the point at which this difference is attained.
# Distance between the cdfs of a Gaussian copula with correlation 0.5
# and a Gaussian copula with correlation 0.2
BiCopParamDistLp(family = 1, par = 0.5, par_p = 0.2, p = 2, type = "cdf", maxEval = 10)
BiCopParamDistLp(family = 1, par = 0.5, par_p = 0.2, p = Inf, type = "cdf")

# Distance between the densities of a Student copula
# with correlation 0.5 and 5 degrees of freedom
# and a Student copula with the same correlation but 20 degrees of freedom
BiCopParamDistLp(family = 2, par = 0.5, par_p = 0.5, par2 = 5, par2_p = 20,
  p = 2, type = "pdf", maxEval = 10)

# Distance between the densities of a Gaussian copula with correlation 0.5
# and of a Student copula with correlation 0.5 and 15 degrees of freedom
BiCopParamDistLp(family = 1, par = 0.5, par_p = 0.5, par2_p = 15, family_p = 2,
  p = 2, type = "pdf", maxEval = 10)
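The relation between the returned distance and integral (the integral is the p-th power of the distance) can be checked on a case with a closed form. The base R sketch below approximates the distance between two copula cdfs on a midpoint grid over the whole unit square; it is only an illustration with hypothetical names (lp_dist_sketch, C_indep, C_comon) and does not reproduce the adaptive integration performed by cubature::hcubature().

```r
# Sketch only: L_p distance between two copula cdfs on a midpoint grid of [0,1]^2.
lp_dist_sketch <- function(C1, C2, p, ngrid = 200) {
  mid <- (seq_len(ngrid) - 0.5) / ngrid                 # cell midpoints
  absdiff <- abs(outer(mid, mid, C1) - outer(mid, mid, C2))
  if (is.infinite(p)) return(max(absdiff))              # sup distance on the grid
  # every cell has area 1 / ngrid^2, so the integral is a plain mean
  mean(absdiff^p)^(1 / p)
}

C_indep <- function(u, v) u * v         # independence copula
C_comon <- function(u, v) pmin(u, v)    # comonotone copula

d2   <- lp_dist_sketch(C_indep, C_comon, p = 2)
dInf <- lp_dist_sketch(C_indep, C_comon, p = Inf)
d2     # the exact L2 distance is sqrt(1/90)
dInf   # the exact sup distance is 1/4, attained at u = v = 1/2
```

For this pair of copulas the integral of (min(u, v) - u v)^2 over the unit square equals 1/90, so the grid value can be compared directly with sqrt(1/90).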
This function simulates independent realizations from the Marshall-Olkin copula.
BiCopSim.MO(n, alpha)
n |
number of samples |
alpha |
parameter of the Marshall-Olkin copula |
a matrix containing the samples
BiCopEst.MO for the estimation of Marshall-Olkin copulas.
# Simulation from a Marshall-Olkin copula with parameter alpha = 0.5
BiCopSim.MO(n = 100, alpha = 0.5)
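One classical way to draw from a Marshall-Olkin copula is the exponential shock construction: two individual shocks and one common shock determine two lifetimes, and the survival transforms of these lifetimes have the Marshall-Olkin copula. The base R sketch below follows this construction for the symmetric one-parameter case with 0 < alpha < 1. It is an assumption-laden illustration (the name sim_mo_sketch is hypothetical) and may differ from what BiCopSim.MO does internally.

```r
# Sketch only: symmetric Marshall-Olkin copula via exponential shocks.
# This is NOT the package's implementation.
sim_mo_sketch <- function(n, alpha) {
  stopifnot(alpha > 0, alpha < 1)
  e1  <- rexp(n, rate = 1 - alpha)   # shock hitting the first component only
  e2  <- rexp(n, rate = 1 - alpha)   # shock hitting the second component only
  e12 <- rexp(n, rate = alpha)       # common shock hitting both components
  x <- pmin(e1, e12)                 # lifetime of component 1, Exp(1) distributed
  y <- pmin(e2, e12)                 # lifetime of component 2, Exp(1) distributed
  cbind(exp(-x), exp(-y))            # survival transforms give uniform margins
}

set.seed(42)
U_sketch <- sim_mo_sketch(n = 2000, alpha = 0.5)
tau_hat <- cor(U_sketch[, 1], U_sketch[, 2], method = "kendall")
tau_hat   # should be near alpha / (2 - alpha) = 1/3
```

The common shock creates the singular component of the copula: with positive probability both coordinates are exactly equal, which is visible as a mass of points on the diagonal in a scatter plot of the sample.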