[R] Data manipulation problem

Bert Gunter gunter.berton at gene.com
Mon Apr 5 20:59:36 CEST 2010


You have tempted, and being weak, I yield to temptation:

"Any good ideas?"

Yes. Don't do this.

(What you probably really want to do is fit a model with age as a predictor.
This can be done statistically, e.g. by logistic regression, or graphically
using conditioning plots, e.g. via trellis graphics (the lattice package).
Doing so avoids the arbitrariness and discontinuities of binning by age range.)
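
A rough sketch of the modelling route, under loud assumptions: the original
post contains cases only, so the data frame `d`, the binary outcome `case`,
and the simulated relationship below are all hypothetical and only illustrate
keeping age continuous in a logistic regression via `glm()`.

```r
set.seed(1)

# Hypothetical data: simulated, since the post gives no non-cases
n    <- 500
age  <- round(runif(n, 0, 90), 1)                # age to one decimal
case <- rbinom(n, 1, plogis(-3 + 0.04 * age))    # assumed true relationship
d    <- data.frame(age = age, case = case)

# Age enters as a continuous covariate, so no bin boundaries are chosen
fit <- glm(case ~ age, family = binomial, data = d)
summary(fit)$coefficients

# Fitted probability of being a case at a few ages
p <- predict(fit, newdata = data.frame(age = c(20, 50, 80)),
             type = "response")
p
```

With real data the outcome and any denominators would of course have to come
from the study itself, not from a simulation.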

Bert Gunter
Genentech Nonclinical Biostatistics
 
 -----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of moleps
Sent: Monday, April 05, 2010 11:46 AM
To: r-help at r-project.org
Subject: [R] Data manipulation problem

Dear R'ers.

I've got a dataset with age and year of diagnosis. In order to
age-standardize the incidence I need to transform the data into a matrix
with age groups (in 5- or 10-year bands) along one axis and calendar year
(in 5-year periods) along the other axis. Each cell should contain the number
of cases for that age group and that period.

I.e., my data format now is
ID, age (to one decimal), year (yearly data).

What I'd like is


age     1960-1965   1966-1970   etc...
0-5             3           8   10   15
6-10            2           5    8   13
etc..
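
For reference, a table of this shape can be sketched with `cut()` and
`table()`. The data frame `d`, its column names, and the band edges below are
assumptions made for illustration (the post gives no actual data, and its
example periods are not all the same width), so treat this as one possible
layout and not as the definitive one.

```r
set.seed(42)

# Hypothetical data in the format described above: ID, age, year
d <- data.frame(
  id   = 1:200,
  age  = round(runif(200, 0, 89.9), 1),
  year = sample(1960:1989, 200, replace = TRUE)
)

# 5-year age bands [0,5), [5,10), ...; right = FALSE puts age 5.0 in [5,10)
age_breaks <- seq(0, 90, by = 5)
d$age_grp  <- cut(d$age, breaks = age_breaks, right = FALSE)

# 5-year periods [1960,1965), [1965,1970), ...; dig.lab = 4 keeps labels readable
year_breaks <- seq(1960, 1990, by = 5)
d$period    <- cut(d$year, breaks = year_breaks, right = FALSE, dig.lab = 4)

# Number of cases per age band (rows) and period (columns)
counts <- table(d$age_grp, d$period)
counts
```

Using `by = 10` in `age_breaks` gives 10-year age bands instead. Any age or
year outside the chosen breaks becomes `NA` and is dropped from the table, so
the breaks should cover the full range of the data.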


Any good ideas?

Regards,
M



