[R] Simulate data with binary outcome

Steve Frost S.Frost at uws.edu.au
Wed Jul 16 07:40:24 CEST 2008


Dear R-Users,

I wish to simulate a binary outcome data set with predictors (in the
example below: age, sex and systolic BP). Is there a way I can set the
frequency of the outcome (y) to be, say, 5% (versus the 0.1% I get when
using the seed below)?

# Example R-code based on Frank Harrell's Design help files

library(rms)   # successor to Design; attaches Hmisc and provides datadist() and lrm()
n <- 1000
set.seed(123456)
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c('female', 'male'), n, replace = TRUE))

# Specify population model for log odds that CHD = Yes
L  <- 0.4*(sex == 'male') +
      0.045*(age) +
      0.05*(sbp)

# Simulate binary y to have Prob(y = 1) = 1/[1+exp(-L)]

y <- ifelse(runif(n) < plogis(L), 1, 0)
table(y)

ddist <- datadist(sex,age,sbp)
options(datadist = 'ddist')

fit <- lrm(y ~ sex + age + sbp)

summary(fit)
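
One possible way to get a chosen outcome frequency, offered as a sketch
rather than a definitive answer: keep the slopes as they are and add an
intercept to the linear predictor, picked so that the average
Prob(y = 1) equals the target. The names b0, target and p_adj are
introduced here for illustration only, and uniroot() from base R is
used to solve for the intercept numerically.

```r
# Sketch: shift the linear predictor by an intercept b0 so that the
# average Prob(y = 1) matches a target frequency (5% assumed here).
set.seed(123456)
n   <- 1000
age <- runif(n, 60, 90)
sbp <- rnorm(n, 120, 15)
sex <- factor(sample(c('female', 'male'), n, replace = TRUE))

# Linear predictor without an intercept, as in the example above
L <- 0.4*(sex == 'male') + 0.045*age + 0.05*sbp

target <- 0.05   # desired outcome frequency (illustrative assumption)

# Solve mean(plogis(b0 + L)) = target for b0
b0 <- uniroot(function(b) mean(plogis(b + L)) - target,
              interval = c(-50, 50), tol = 1e-8)$root

p_adj <- plogis(b0 + L)   # adjusted probabilities
mean(p_adj)               # close to target by construction

# Draw the binary outcome from the adjusted probabilities
y <- rbinom(n, 1, p_adj)
table(y)
```

The observed proportion of y = 1 in any single sample will vary around
the target because of sampling error; only the expected frequency is
fixed. The slopes for sex, age and sbp are unchanged, so the odds ratios
in the population model stay the same.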


================================
Steve Frost MPH
University of Western Sydney
Building 7
Campbelltown Campus
Locked Bag 1797
PENRITH SOUTH DC 1797
Phone 61+ 2 4620 3415
Mobile 0407 291088
Fax 61+ 2 4625 4252
================================
