[R] Getting codebook data into R

David Winsemius dwinsemius at comcast.net
Thu Feb 9 23:50:39 CET 2012


On Feb 9, 2012, at 3:51 PM, barny wrote:

> I've been trying to get some data from the National Survey for Family Growth
> into R - however, the data is in a .dat file and the data I need doesn't
> have any spaces or commas separating fields - rather you have to look into
> the codebook and what number of digits along the line the data you need is.
> The data I want are the following, where 1,12,int means that the data I'm
> interested starts in column 1 and finishes in column 12 and is an integer.
>
>            ('caseid', 1, 12, int),
>            ('nbrnaliv', 22, 22, int),
>            ('babysex', 56, 56, int),
>            ('birthwgt_lb', 57, 58, int),
>            ('birthwgt_oz', 59, 60, int),
>            ('prglength', 275, 276, int),
>            ('outcome', 277, 277, int),
>            ('birthord', 278, 279, int),
>            ('agepreg', 284, 287, int),
>            ('finalwgt', 423, 440, float)

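That's not the form read.fwf is set up to accept. It wants a vector of
field widths (plus, optionally, col.names), not start and end columns.
You will need to loop over that input stream and apply logic like:

vec <- numeric(0)
nams <- character(0)
# for each field (first, last = its start and end columns):
getwidth <- last - first + 1
vec <- c(vec, getwidth)
nams <- c(nams, <whatever>)
# gap between this field and the next one (first.next = start of next field):
getwidblank <- first.next - last - 1
vec <- c(vec, getwidblank)
if (getwidblank > 0) nams <- c(nams, <junk-name>)

Then remove all the zeros from vec, and what is left will be your vector
of widths and your vector of col.names.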
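
The logic above might be written in R roughly as follows. This is only a
sketch: the `spec` data frame, the "skipN" junk names, and the one-line test
file are my own illustrative assumptions (the record is made up, not real
NSFG data), so substitute the path to the actual .dat file.

```r
# Field spec from the question: name, first column, last column
# (1-based, inclusive).
spec <- data.frame(
  name  = c("caseid", "nbrnaliv", "babysex", "birthwgt_lb", "birthwgt_oz",
            "prglength", "outcome", "birthord", "agepreg", "finalwgt"),
  first = c(1, 22, 56, 57, 59, 275, 277, 278, 284, 423),
  last  = c(12, 22, 56, 58, 60, 276, 277, 279, 287, 440),
  stringsAsFactors = FALSE
)

# Width of each wanted field.
field_w <- spec$last - spec$first + 1

# Width of the unwanted gap in front of each field (0 if fields are adjacent).
gap_w <- spec$first - c(0, head(spec$last, -1)) - 1

# Interleave gap, field, gap, field, ... and give the gaps junk names.
widths <- as.vector(rbind(gap_w, field_w))
nams   <- as.vector(rbind(paste0("skip", seq_along(gap_w)), spec$name))

# Remove the zero-width entries from both vectors.
keep   <- widths > 0
widths <- widths[keep]
nams   <- nams[keep]

# Hypothetical one-record file, just to exercise the code.
line <- strrep("0", 440)
substr(line, 1, 12)  <- "000000000001"  # caseid
substr(line, 57, 58) <- "08"            # birthwgt_lb
substr(line, 59, 60) <- "13"            # birthwgt_oz
tmp <- tempfile(fileext = ".dat")
writeLines(line, tmp)

dat <- read.fwf(tmp, widths = widths, col.names = nams)
dat <- dat[spec$name]                   # drop the junk columns
```

read.fwf also treats negative widths as columns to skip, so another option
is to negate the gap widths and pass only the real names as col.names, which
avoids the junk columns altogether.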

>
> How can I do this using R? I've written a python programme which basically
> does it but it'd be nicer if I could skip the Python bit and just do it
> using R. Cheers for any help.

David Winsemius, MD
West Hartford, CT


