[R] General copula model with heterogeneous marginals

Justin Balthrop justin.balthrop at rice.edu
Mon Nov 30 16:19:59 CET 2015


I would like to model the sum of several random variables that have
arbitrary gamma marginal distributions and an empirical dependence
structure estimated from data. I observe all of the individual pieces
but want to model their sum. This differs from many copula questions,
which observe a single outcome of a multivariate process and try to fit
a marginal and covariance structure to it.
It has been years since I coded in R, but this is what I have so far:

library(copula)
library(scatterplot3d)
library(psych)
set.seed(1)
# 7-dimensional Gaussian copula; dispstr="un" (unstructured) takes
# 7*6/2 = 21 pairwise correlations
myCop <- normalCopula(
   param=c(.1,.1,.1,.1,.1,.2,.2,.2,.2,.2,.2,.2,.4,.4,.4,.4,.4,.5,.5,.5,.5),
   dim=7, dispstr="un")

# Attach a gamma marginal to each of the 7 components
myMvd <- mvdc(copula=myCop, margins=rep("gamma",7),
   paramMargins=list(list(shape=3, scale=4),
      list(shape=2, scale=5),
      list(shape=2, scale=5),
      list(shape=2, scale=5),
      list(shape=2, scale=5),
      list(shape=3, scale=5),
      list(shape=3, scale=5)))

simulation<- rMvdc(20000,myMvd)

colnames(simulation) <- paste0("P", 1:7)

# Row-wise sum of the seven simulated components
total <- rowSums(simulation)

As you can see, I have hard-coded 7 gamma distributions with a
placeholder covariance matrix input. The problem is that I want to
generalize this to on the order of 150 different marginals with
potentially differing distributions and parameters.
Ultimately I will have the following input:
•	matrix of 150 marginal distributions with family and parameters
•	150x150 covariance matrix
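
A minimal sketch of how these two inputs might be fed to the copula package without typing the parameters by hand. The names margin_spec, Sigma and R are hypothetical, the marginal families and the covariance matrix are made up, and only 5 variables are used to keep it short. It assumes the covariance matrix is positive definite and that its implied correlations are acceptable as the Gaussian copula parameters; strictly, those parameters are correlations of the underlying normal scores, so this is an approximation.

```r
library(copula)

set.seed(1)

# Hypothetical marginal specification: one entry per variable, giving the
# distribution family (the "xxx" in qxxx/pxxx/dxxx) and its named parameters.
margin_spec <- list(
  list(family = "gamma", params = list(shape = 3, scale = 4)),
  list(family = "gamma", params = list(shape = 2, scale = 5)),
  list(family = "lnorm", params = list(meanlog = 2, sdlog = 0.5)),
  list(family = "norm",  params = list(mean = 10, sd = 2)),
  list(family = "gamma", params = list(shape = 3, scale = 5))
)
d <- length(margin_spec)

# Made-up positive-definite covariance matrix standing in for the real one.
A <- matrix(rnorm(d * d), d, d)
Sigma <- crossprod(A) + diag(d)

# The Gaussian copula is parameterised by correlations, so rescale first.
R <- cov2cor(Sigma)

# P2p() pulls out the d*(d-1)/2 off-diagonal entries in the order that
# dispstr = "un" expects.
cop <- normalCopula(param = P2p(R), dim = d, dispstr = "un")

mvd <- mvdc(
  copula       = cop,
  margins      = vapply(margin_spec, function(m) m$family, character(1)),
  paramMargins = lapply(margin_spec, function(m) m$params)
)

simulation <- rMvdc(20000, mvd)
colnames(simulation) <- paste0("P", seq_len(d))
```

With 150 variables the same code would apply unchanged, with margin_spec holding 150 entries and Sigma being the 150x150 matrix.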
And what I need to produce is the following:
An empirical CDF/PDF of the sum of realizations from 5-10 of the
underlying marginal distributions. To be clearer, assume each
marginal distribution is a person's response to a treatment, and I
need to calculate the cumulative treatment effect for a sub-group of
the population of 150. I will have a vector of 0s and 1s identifying
which members of the population are grouped together for a trial, then
a separate vector for the next group. Each group vector will have
length 150, with between 5 and 10 1s and the rest 0s. I need a
different empirical CDF for each vector.
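
A rough sketch of that last step, assuming the simulated draws are already available as an n x 150 matrix. Here simulation is filled with independent gamma draws purely as a stand-in so the example runs on its own, and make_group is a hypothetical helper that fabricates indicator vectors. Multiplying the draws by a matrix of 0/1 group vectors gives one column of sums per group, and ecdf() then yields a separate empirical CDF for each.

```r
set.seed(1)

n <- 5000   # number of simulated draws
d <- 150    # population size

# Stand-in for the copula simulation output: independent gamma draws,
# used only so that this example is self-contained.
simulation <- matrix(rgamma(n * d, shape = 2, scale = 5), nrow = n, ncol = d)

# Hypothetical helper: a 0/1 indicator vector of length d with `size` ones.
make_group <- function(d, size) {
  g <- integer(d)
  g[sample.int(d, size)] <- 1L
  g
}

# Three example groups with 5, 7 and 10 members; one column per group.
groups <- cbind(make_group(d, 5), make_group(d, 7), make_group(d, 10))

# Each column of `totals` holds the simulated sums for one group.
totals <- simulation %*% groups

# One empirical CDF and one kernel density estimate per group.
group_cdf <- lapply(seq_len(ncol(totals)), function(j) ecdf(totals[, j]))
group_pdf <- lapply(seq_len(ncol(totals)), function(j) density(totals[, j]))

# Example queries for the first group.
group_cdf[[1]](50)                          # estimated P(sum <= 50)
quantile(totals[, 1], c(0.05, 0.5, 0.95))   # selected quantiles of the sum
```

Because the sums are taken over the same simulated rows, the dependence between members of a group carries through to each group's total, provided simulation comes from the fitted copula model rather than the independent stand-in used here.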
Any help?


