[R] Simulate skewed data if 2.5, 25th 50th and 75 centile are known

Bert Gunter bgunter.4567 at gmail.com
Wed Aug 5 23:21:09 CEST 2015


Hint: See below.

On Wednesday, August 5, 2015, John Sorkin <jsorkin at grecc.umaryland.edu>
wrote:

> Colleagues,
> I need to simulate skewed data so I can run a sample size calculation.
>
> I know the 2.5th, 25th, 50th, and 75th centiles of the data (32, 43, 48,
> 250).
>
> data <- matrix(c(75,250,50,48,25,43,2.5,32),nrow=4,ncol=2,byrow=TRUE)
> dimnames(data) <- list(NULL,c("x","y"))
> data
>
> Is there a way I can use these values to generate simulations of the
> original data? Of course if the data were normally distributed this would
> be a piece of cake,


Oh -- how? (A normal distribution is defined by 2 parameters; you appear
to have 4 centiles.) If you can answer this question, you can probably answer
the same question for skewed data. See also flexible distribution families
such as the Johnson and Pearson systems. You should also probably move this
to stackexchange, as it is definitely a statistical matter rather than an R
one. Once you decide what to do, R will have a package to do it.
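
To make the normal case concrete, here is a minimal sketch of one possible
answer, not a recommendation. It assumes that "matching" 4 centiles with 2
parameters means a least-squares fit: the normal quantile at probability p is
mu + sigma * qnorm(p), so regressing the known centiles on qnorm(p) gives
estimates of mu and sigma. The variable names, seed, and sample size are
arbitrary choices for illustration.

```r
# Known centiles from the original post (probabilities and values).
p <- c(0.025, 0.25, 0.50, 0.75)
q <- c(32, 43, 48, 250)

# Normal quantile at probability p is mu + sigma * qnorm(p), so a linear
# regression of the centiles on qnorm(p) estimates mu (intercept) and
# sigma (slope) by least squares.
z <- qnorm(p)
fit <- lm(q ~ z)
mu_hat <- unname(coef(fit)[1])
sigma_hat <- unname(coef(fit)[2])

# Compare the centiles implied by the fitted normal with the known ones.
fitted_q <- qnorm(p, mean = mu_hat, sd = sigma_hat)
print(round(cbind(p, observed = q, fitted = fitted_q), 1))

# Simulate from the fitted normal.
set.seed(1)
sim_normal <- rnorm(1000, mean = mu_hat, sd = sigma_hat)
```

With centiles as skewed as these, the observed and fitted columns should
disagree noticeably, which is the point: 2 parameters cannot reproduce 4
arbitrary centiles.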

Others may be able to offer better advice, so wait a bit before proceeding,
though.
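
As one possible reading of the "flexible distribution families" suggestion
above, the sketch below fits a 4-parameter Johnson SU quantile function to the
4 centiles with base R's optim() and then simulates by inverse-CDF sampling.
The choice of Johnson SU, the starting values, and the helper names q_su and
loss are assumptions made for illustration; the optimiser is not guaranteed
to reproduce the centiles exactly. Note also that nothing is known above the
75th centile, so the upper tail of the simulated data is determined entirely
by the assumed family.

```r
# Known centiles from the original post (probabilities and values).
p <- c(0.025, 0.25, 0.50, 0.75)
q <- c(32, 43, 48, 250)

# Johnson SU quantile function: x = xi + lambda * sinh((z - gamma) / delta),
# where z is the standard normal quantile.
# par = c(xi, log(lambda), gamma, log(delta)); lambda and delta are kept
# positive by optimising them on the log scale.
q_su <- function(prob, par) {
  par[1] + exp(par[2]) * sinh((qnorm(prob) - par[3]) / exp(par[4]))
}

# Sum of squared differences between model centiles and known centiles.
# A large finite penalty replaces non-finite values so optim() can continue.
loss <- function(par) {
  val <- sum((q_su(p, par) - q)^2)
  if (is.finite(val)) val else 1e300
}

# Arbitrary starting values: xi = 48 (the median), lambda = 10,
# gamma = 0, delta = 1.
start <- c(48, log(10), 0, 0)
fit <- optim(start, loss, method = "Nelder-Mead",
             control = list(maxit = 5000))

# Compare fitted and known centiles before trusting the fit.
print(round(cbind(p, observed = q, fitted = q_su(p, fit$par)), 1))

# Inverse-CDF sampling: push uniform draws through the fitted quantile function.
set.seed(1)
sim_su <- q_su(runif(1000), fit$par)
```

Inspect the printed comparison and fit$convergence before using sim_su; a
different family or different starting values may suit these centiles better.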

-- Bert

> but given the skewness, I don't know how to go about generating the
> values that would be expected from a distribution having the observed
> values at the four centiles.
> Thank you,
> John
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>


