[R] Building package - tab delimited example data issue

Johannes Graumann johannes_graumann at web.de
Thu Dec 6 13:03:20 CET 2007


On Thursday 06 December 2007 11:52:46 Peter Dalgaard wrote:
> Johannes Graumann wrote:
> > Hello,
> >
> > I'm trying to integrate example data in the shape of a tab delimited
> > ASCII file into my package and therefore dropped it into the data
> > subdirectory. The build works out just fine, but when I attempt to
> > install I get:
> >
> > ** building package indices ...
> > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
> >   line 1 did not have 500 elements
> > Calls: <Anonymous> ... <Anonymous> -> switch -> assign -> read.table -> scan
> > Execution halted
> > ERROR: installing package indices failed
> > ** Removing '/usr/local/lib/R/site-library/MaxQuantUtils'
> > ** Restoring previous '/usr/local/lib/R/site-library/MaxQuantUtils'
> >
> > Accordingly the check delivers:
> >
> > ...
> > * checking whether package 'MaxQuantUtils' can be installed ... ERROR
> >
> > Can anyone tell me what I'm doing wrong? Build/install without the ASCII
> > file works just fine.
> >
> > Joh
>
> If you had looked at help(data), you would have found a list of which
> file formats it supports and how they are read. Hint: TAB-delimited
> files are not among them. *Whitespace* separated files work, using
> read.table(filename, header=TRUE), but that is not a superset of
> TAB-delimited data if there are empty fields.
>
> A nice trick is to figure out how to read the data from the command line
> and drop the relevant code into a mydata.R file (assuming that the
> actual data file is mydata.txt). This gets executed when the data is
> loaded (by data(mydata) or when building the lazyload database) because
> .R files have priority over .txt.
>
> This is quite general and allows a nice way of incorporating data
> management while retaining the original data source:
> >more ISwR/data/stroke.R
>
> stroke <-  read.csv2("stroke.csv", na.strings=".")
> names(stroke) <- tolower(names(stroke))
> stroke <-  within(stroke,{
>     sex <- factor(sex,levels=0:1,labels=c("Female","Male"))
>     dgn <- factor(dgn)
>     coma <- factor(coma, levels=0:1, labels=c("No","Yes"))
>     minf <- factor(minf, levels=0:1, labels=c("No","Yes"))
>     diab <- factor(diab, levels=0:1, labels=c("No","Yes"))
>     han <- factor(han, levels=0:1, labels=c("No","Yes"))
>     died <- as.Date(died, format="%d.%m.%Y")
>     dstr <- as.Date(dstr,format="%d.%m.%Y")
>     dead <- !is.na(died) & died < as.Date("1996-01-01")
>     died[!dead] <- NA
> })
>
> >head ISwR/data/stroke.csv
>
> SEX;DIED;DSTR;AGE;DGN;COMA;DIAB;MINF;HAN
> 1;7.01.1991;2.01.1991;76;INF;0;0;1;0
> 1;.;3.01.1991;58;INF;0;0;0;0
> 1;2.06.1991;8.01.1991;74;INF;0;0;1;1
> 0;13.01.1991;11.01.1991;77;ICH;0;1;0;1
> 0;23.01.1996;13.01.1991;76;INF;0;1;0;1
> 1;13.01.1991;13.01.1991;48;ICH;1;0;0;1
> 0;1.12.1993;14.01.1991;81;INF;0;0;0;1
> 1;12.12.1991;14.01.1991;53;INF;0;0;1;1
> 0;.;15.01.1991;73;ID;0;0;0;1
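
The whitespace-versus-TAB point quoted above can be reproduced on a tiny made-up input. This is only a sketch: the three-column text below is invented for illustration and is not the MaxQuantUtils file, so the exact error message and column count will differ from the install log.

```r
# Made-up TAB-delimited text (an assumption for illustration only);
# the first data row has an empty middle field.
tab_text <- "a\tb\tc\n1\t\t3\n4\t5\t6\n"

# Default sep = "" treats any run of whitespace as a single separator,
# so the first data row appears to have two fields instead of three
# and the read fails. The error message is captured as a string.
whitespace_try <- tryCatch(
    read.table(text = tab_text, header = TRUE),
    error = function(e) conditionMessage(e)
)

# An explicit sep = "\t" keeps the empty field, which is read as NA.
tab_read <- read.table(text = tab_text, header = TRUE, sep = "\t")
```

Following the mydata.R trick quoted above, the analogous fix for a TAB-delimited mydata.txt would presumably be a one-line data/mydata.R along the lines of mydata <- read.delim("mydata.txt"), with both file names taken from the quoted example rather than from the actual package.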

Thanks for your help. Very insightful, and your version of "RTFM" was not too 
harsh either ;0)
Part of what I want to achieve with the inclusion of the file is to be able to 
showcase a read-in function for the particular data type. Is there a slick 
way - sticking to your example - to reference the 'stroke.csv' directly?
I'd like to put in the example of some function.Rd something analogous to
	# Use function to read in file:
	result <- function(<link to 'stroke.csv' in installed ISwR package>)
Without having to resort to accepting the example as "No Run".

Thanks for your help, Joh
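
One possible way to get such a path is system.file(), which returns the location of a file inside an installed package (or "" if it is not found). The sketch below runs against the DESCRIPTION file of the utils package, simply because that file is always present. The commented lines are a hypothetical analogue for the question above: they assume the raw file is shipped under inst/extdata in the package sources, which is a common convention and not something confirmed for ISwR or MaxQuantUtils.

```r
# Locate a file inside an installed package; "" means "not found".
desc_path <- system.file("DESCRIPTION", package = "utils")

# Any reader function can then be pointed at that path.
pkg_name <- read.dcf(desc_path, fields = "Package")[1, 1]

# Hypothetical analogue for an example section (not run here): assumes
# the raw file lives in inst/extdata/stroke.csv in the package sources,
# so that it is installed as extdata/stroke.csv.
# csv_path <- system.file("extdata", "stroke.csv", package = "ISwR")
# result <- read.csv2(csv_path, na.strings = ".")
```

Whether a raw file placed in data/ is still present as a plain file after installation depends on how the package's data is stored at install time, which is why the sketch assumes a separate copy under inst/.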