[R] Problem passing data into read.table()
David R. McWillliams
dmcwilli at utk.edu
Mon Apr 22 21:21:34 CEST 2002
I thought about scan(), but I think that would require knowing the number
of columns beforehand. As I noted before, if I put in the number of
rows in the second call to read.table(), say 'nrows=2000', it correctly
detects the number of columns and rows. I just can't get it to see the
calculated value as in 'nrows=row.ctr'. The intervening print() can fetch
the value of row.ctr, so I don't understand why read.table() can't get it.
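
One possible explanation, offered as a guess rather than a confirmed diagnosis: read.table() returns a data frame, so a count computed from grid.layout with `*` is itself a data frame, not a plain number. print() will display it happily, but read.table() may not treat it as a scalar when it is supplied as nrows. The sketch below coerces the count to a single integer first. The file contents, the layout values (2, 3, 1, 1) and the skip offsets are invented for illustration and do not match the real 21-line header.

```r
# Build a small tab-delimited file: 3 header lines, a column-name line,
# 6 data rows, and a 1-line footer. All contents are hypothetical.
fname <- tempfile(fileext = ".txt")
writeLines(c(
  "Example export",
  "2\t3\t1\t1",                          # line 2: layout information
  "end of header",
  "id\tvalue",                           # column names
  paste(1:6, (1:6) * 10, sep = "\t"),    # six data rows
  "footer with\ta\tdifferent\tfield count"
), fname)

# First call: pick out the single layout line (a one-row data frame).
grid.layout <- read.table(fname, header = FALSE, sep = "\t",
                          comment.char = "", skip = 1, nrows = 1)

# Flatten the data frame and reduce it to one plain integer.
row.ctr <- as.integer(prod(unlist(grid.layout)))

# Second call: read exactly row.ctr data rows, stopping before the footer.
tmp.df <- read.table(fname, header = TRUE, sep = "\t",
                     comment.char = "", skip = 3, nrows = row.ctr)
```

Checking is.numeric(row.ctr) and length(row.ctr) == 1 before the second call is a cheap way to confirm that what gets passed to nrows is a bare number and not a data frame.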
On Mon, 22 Apr 2002, Huntsinger, Reid wrote:
>Date: Mon, 22 Apr 2002 15:02:31 -0400
>From: "Huntsinger, Reid" <reid_huntsinger at merck.com>
>To: 'David R. McWillliams' <dmcwilli at utk.edu>, r-help at stat.math.ethz.ch
>Subject: RE: [R] Problem passing data into read.table()
>I think you should avoid read.table here. It tries to set various parameters
>(number of columns, etc) by looking at a part of your file. Even if you
>specify col.names, it can get confused by this looking-ahead. Better look at
>the definition of read.table in terms of scan and readLines and modify to
>Perhaps your problem isn't that nrows doesn't get the right value, but that
>read.table is using the wrong number of columns. (Actually, I can't create
>an example like yours that read.table doesn't complain about line 1 not
>having n elements, where n is the number of elements in the longest line.)
>If you really want to make read.table work, you can probably use the
>fill=TRUE option and lots of caution.
>From: David R. McWillliams [mailto:dmcwilli at utk.edu]
>Sent: Monday, April 22, 2002 1:13 PM
>To: r-help at stat.math.ethz.ch
>Subject: [R] Problem passing data into read.table()
>I am trying to read in a tab-delimited data file with a 21 row header and
>2 row footer using two calls to read.table(). Numbers of rows and columns
>are variable. The header contains information for calculating the number
>of rows of data. I can successfully pick this out and calculate the
>number of rows to read, but cannot get the second read.table() to assign
>this number to "nrows" (the number is correct; if I enter it manually,
>then everything works fine). Currently the function reads all the way to
>the end and crashes on the footer, since the number of fields is different
>from that of the data.
>I know this could easily be done with some Perl pre-processing of the
>file, but it is going to run on a Windows machine and I am trying to
>minimize the number of packages to download. Nevertheless, there is the
>general problem of why I cannot pass a calculated value into the
># function to read data with header and footer
># pick line 8 with the data layout information and calculate the number of
>grid.layout <- read.table(fname, as.is=T, header=F, sep="\t",
>comment.char="", skip=7, nrows=1)
>row.ctr <- grid.layout*grid.layout*grid.layout*grid.layout
># tells me I have the right dimensions ...
># but they do not get passed into this read.table() ...
>tmp.df <- read.table(fname, as.is=T, header=T, sep="\t", comment.char="",
>skip=20, nrows=row.ctr )
># end function
>David R. McWilliams
>dmcwilli at utk.edu
>r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
>Send "info", "help", or "[un]subscribe"
>(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
David R. McWilliams
dmcwilli at utk.edu