[R] Loading large .pxt and .asc datasets causes issues.

Anthony Damico ajdamico at gmail.com
Wed Feb 24 04:02:47 CET 2016


hi eiko, LaF is incompatible with survey data; that road is a dead end.
the code below will painlessly load brfss into R.  review the link douglas
sent for analysis examples, and change `years.to.download <- 1984:2014` to
`years.to.download <- 2006` if you just want a single year of microdata.  glhf


# install.packages(
#     c( "MonetDB.R" , "MonetDBLite" , "survey" , "SAScii" , "descr" , "downloader" , "digest" ) ,
#     repos = c( "http://dev.monetdb.org/Assets/R/" , "http://cran.rstudio.com/" )
# )

# setInternet2( FALSE )                        # # only windows users need this line
# options( encoding = "windows-1252" )        # # only macintosh and *nix users need this line
library(downloader)
# setwd( "C:/My Directory/BRFSS/" )
years.to.download <- 1984:2014
source_url(
    "https://raw.githubusercontent.com/ajdamico/asdfree/master/Behavioral%20Risk%20Factor%20Surveillance%20System/download%20all%20microdata.R" ,
    prompt = FALSE ,
    echo = TRUE
)
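
on the `import.asc` and `read.asc` errors quoted below: both functions read ESRI ASCII raster grids, not survey microdata, so they are the wrong tool for this file.  the token in the error message looks like several numbers run together with no separator, which is what a fixed-width text file would produce.  a minimal sketch of reading fixed-width text with base R follows.  the file contents, widths, and column names are made up for illustration and are not the brfss layout; the real column positions come from the cdc documentation (the SAScii package in the install line above is meant for building them from a SAS input script).

```r
# toy fixed-width file: fields are defined by character position, not by a separator
# (hypothetical layout: 2-character state, 4-character year, 7-character weight)
tf <- tempfile( fileext = ".asc" )
writeLines( c( "012006  45.50" , "022006 120.25" ) , tf )

# read.fwf() cuts each line at the given widths
x <-
    read.fwf(
        tf ,
        widths = c( 2 , 4 , 7 ) ,
        col.names = c( "state" , "year" , "weight" ) ,
        colClasses = c( "character" , "integer" , "numeric" )
    )

print( x )
```

reading `state` as character keeps the leading zero.  `read.fwf` is slow on files of this size, so treat this as a way to check the layout on a few rows, not as a replacement for the download script.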


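on the missing variables in the .xpt import: `sasxport.get` lowercases column names by default (its `lowernames` argument), so looking up the uppercase names from the codebook can come up empty even when the columns are there.  whether that explains the 302 columns is only a guess.  a quick case-insensitive check, using a toy data frame and hypothetical variable names in place of the real import:

```r
# toy stand-in for the imported data frame; the column names are hypothetical
dat <- data.frame( vara = 1:2 , varb = c( 10.5 , 20.25 ) )

# names as they might appear in a codebook (also hypothetical)
wanted <- c( "VARA" , "VARB" , "VARC" )

# compare after lowercasing both sides so case differences do not hide a match
found <- tolower( wanted ) %in% tolower( names( dat ) )

print( setNames( found , wanted ) )
```

anything still FALSE after this check is genuinely absent from the imported file.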



On Tue, Feb 23, 2016 at 4:39 PM, Federman, Douglas <
Douglas.Federman at utoledo.edu> wrote:

> You might want to look at Anthony Damico's work at
>
>
> http://www.asdfree.com/search/label/behavioral%20risk%20factor%20surveillance%20system%20%28brfss%29
>
> --
> Better name for the general practitioner might be multispecialist.
> ~Martin H. Fischer (1879-1962)
>
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Torvon
> Sent: Tuesday, February 23, 2016 2:13 PM
> To: r-help at r-project.org
> Subject: [R] Loading large .pxt and .asc datasets causes issues.
>
> Hi,
>
> I want to load a dataset into R. This dataset is available in two formats:
> .XPT and .ASC. The dataset is available at
> http://www.cdc.gov/brfss/annual_data/annual_2006.htm.
>
> They are about 40mb zipped, and about 500mb unzipped.
>
> I can get the .xpt data to load, using:
>
> > library(Hmisc)
> > data <- sasxport.get("CDBRFS06.XPT")
>
> The data look fine, with no error messages. However, the data contain only
> 302 columns, which is fewer than they should have (according to the
> documentation). They do not contain my variables of interest, so either the
> documentation or the data file is wrong, and I want to make sure it's not
> the data file.
>
> Hence I wanted to see if I get the same results loading the .ASC file.
> However, multiple ways to do so have failed.
>
> > library(adehabitat)
> > import.asc("CDBRFS06.asc")
>
> Results in:
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   scan() expected 'a real', got '1191.8808943.38209868648.960119'
>
> > library(SDMTools)
> > read.asc("CDBRFS06.asc")
>
> Results in:
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   scan() expected 'a real', got '1191.8808943.38209868648.960119'
> In addition: Warning messages:
> 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   number of items read is not a multiple of the number of columns
> 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   number of items read is not a multiple of the number of columns
> 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   number of items read is not a multiple of the number of columns
> 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   number of items read is not a multiple of the number of columns
> 5: In scan(file, nmax = nl * nc, skip = 6, quiet = TRUE) :
>   NAs introduced by coercion to integer range
>
> Thank you for your help.
>    Eiko
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
