[R] Conditional logistic regression for "events/trials" format
Charles C. Berry
cberry at tajo.ucsd.edu
Thu May 31 19:11:50 CEST 2007
On Thu, 31 May 2007, Strickland, Matthew (CDC/CCHP/NCBDDD) (CTR) wrote:
> Dear R users,
> I have a large individual-level dataset (~700,000 records) which I am
> performing a conditional logistic regression on. Key variables include
> the dichotomous outcome, dichotomous exposure, and the stratum to which
> each person belongs.
> Using this individual-level dataset I can successfully use clogit to
> create the model I want. However reading this large .csv file into R and
> running the models takes a fair amount of time.
> Alternatively, I could choose to "collapse" the dataset so that each row
> has the number of events, number of individuals, and the exposure and
> stratum. In SAS they call this the "events/trials" format. This would
> make my dataset much smaller and presumably speed things up.
I think you have described the data for forming a 2 by 2 by K table of
counts (outcome by exposure by stratum), in which case loglin(), loglm()
(in MASS), mantelhaen.test(), and - if K is not too large -
glm(... , family=poisson) would be suitable.
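
A minimal sketch of that route, assuming a single dichotomous exposure and
no other covariates. The data are simulated, and the column names
(stratum, exposure, outcome) are made up for illustration, not taken from
the original dataset.

```r
# Simulated individual-level data; all names here are hypothetical.
set.seed(1)
n <- 2000
K <- 5
ind <- data.frame(
  stratum  = factor(sample(seq_len(K), n, replace = TRUE)),
  exposure = rbinom(n, 1, 0.4)
)
ind$outcome <- rbinom(n, 1, plogis(-1 + 0.5 * ind$exposure))

# Collapse the records into a 2 x 2 x K table of counts.
tab <- xtabs(~ exposure + outcome + stratum, data = ind)

# Mantel-Haenszel test and common odds ratio across strata.
mh <- mantelhaen.test(tab)
mh$estimate

# Exact conditional test; for 2 x 2 x K tables the reported estimate
# is the conditional maximum likelihood estimate of the common odds ratio.
mh_exact <- mantelhaen.test(tab, exact = TRUE)
mh_exact$estimate

# Log-linear (Poisson) fit to the same counts. The exposure:outcome
# coefficient is the (unconditional) log odds ratio adjusted for stratum.
# This adds parameters per stratum, hence "if K is not too large".
counts <- as.data.frame(tab)  # columns: exposure, outcome, stratum, Freq
fit <- glm(Freq ~ stratum * exposure + stratum * outcome + exposure:outcome,
           family = poisson, data = counts)
exp(coef(fit)["exposure1:outcome1"])
```

With a few large strata the three odds ratio estimates should be close to
one another. With many small strata the unconditional Poisson estimate can
be biased, and the conditional (exact) estimate is the one that corresponds
to what clogit targets.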
But you say 'models' above, suggesting that there are some other
variables. If so, you need to be a bit more specific in describing them.
> So my question is: can I use clogit (or possibly another function) to
> perform a conditional logistic regression when the data is in this
> "events/trials" format? I am using R version 2.5.0.
> Thank you very much,
> Matt Strickland
> Birth Defects Branch
> U.S. Centers for Disease Control
> R-help at stat.math.ethz.ch mailing list
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901