[R] Weird read.xls behavior

David Winsemius dwinsemius at comcast.net
Tue May 10 18:02:19 CEST 2011


On May 10, 2011, at 11:39 AM, Gabor Grothendieck wrote:

> On Tue, May 10, 2011 at 12:12 AM, Jun Shen <jun.shen.ut at gmail.com> wrote:
>> Kenneth,
>>
>> Thanks for the reply. I checked the original data. There is no space. I even
>> manually added a space to one value. After reading in with read.xls, the
>> value has two spaces. The reason I don't like it is I am going to do some
>> comparison with another dataset, which is supposed to be the same as this
>> one. Now I am getting a bunch of false negatives.
>
> It seems that the perl program underlying gdata's read.xls puts out
> lines like this:

While we are on the topic of gdata functions: I just looked at the trim
function and found that it does return a data.frame when one is offered
to it. (It was not clear from the documentation that a data.frame fell
under the classification of "character strings and other related
objects.") Using a data.frame in my workspace from an earlier question:

 > require(gdata)
 > Mat1$Time[1] <- "09:30 "
 > Mat1
   Weight     Date   Time
1    7.6 04/28/11 09:30
2    8.4 04/29/11  03:11
3    8.6 04/29/11  05:32
4    8.6 04/29/11  09:53
5    1.4 05/01/11  19:52
 > trim(Mat1)
   Weight     Date  Time
1    7.6 04/28/11 09:30       # no space on my console
2    8.4 04/29/11 03:11
3    8.6 04/29/11 05:32
4    8.6 04/29/11 09:53
5    1.4 05/01/11 19:52
 > nchar(trim(Mat1)$Time[1])
[1] 5
 > nchar(Mat1$Time[1])
[1] 6
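
For readers who do not have Mat1 in their workspace, the same check can be sketched in base R. This is only an illustration under stated assumptions: the data.frame below is a hypothetical reconstruction from the printed values above, and trimws() (base R 3.2.0 and later) stands in for gdata's trim().

```r
# Hypothetical reconstruction of Mat1 from the printed output above
Mat1 <- data.frame(
  Weight = c(7.6, 8.4, 8.6, 8.6, 1.4),
  Date   = c("04/28/11", "04/29/11", "04/29/11", "04/29/11", "05/01/11"),
  Time   = c("09:30", "03:11", "05:32", "09:53", "19:52"),
  stringsAsFactors = FALSE
)
Mat1$Time[1] <- "09:30 "   # introduce a trailing space

# Strip leading/trailing whitespace from the character columns only
Mat1_trimmed <- Mat1
is_chr <- vapply(Mat1_trimmed, is.character, logical(1))
Mat1_trimmed[is_chr] <- lapply(Mat1_trimmed[is_chr], trimws)

nchar(Mat1$Time[1])          # 6
nchar(Mat1_trimmed$Time[1])  # 5
```

Restricting the cleanup to character columns leaves the numeric Weight column untouched.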

-- 
David.
>
> |"KAI-4169-002","830","5 mg" |
> where | mark the beginning and end and are not part of the line.
> read.csv includes the space after the last double quote in the last
> field even though it's outside of the double quotes.
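
The effect described here can be checked without Excel or Perl by feeding such a line directly to read.csv. This is a minimal sketch, not the gdata code path: the input string is copied from the line quoted above, V3 is simply the default column name read.csv assigns, and the trimws() cleanup is one possible workaround rather than the fix proposed in this thread.

```r
# One line as written by xls2csv.pl, with the stray space before the newline
csv_line <- '"KAI-4169-002","830","5 mg" \n'
dat <- read.csv(text = csv_line, header = FALSE, stringsAsFactors = FALSE)

# Compare with nchar("5 mg"), which is 4
nchar(dat$V3)

# Possible workaround: strip whitespace from character columns after reading
is_chr <- vapply(dat, is.character, logical(1))
dat[is_chr] <- lapply(dat[is_chr], trimws)
dat$V3
```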
>
> As an interim fix, edit the file at this location:
>
>   system.file("perl", "xls2csv.pl", package = "gdata")
>
> removing the space before the \n in this line:
>   print OutFile "$outputLine \n"
> so it becomes this:
>   print OutFile "$outputLine\n"
>
> Now it should work.
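
The edit can also be scripted from R instead of being made by hand. The sketch below runs on a temporary stand-in file so that it is safe to try; the file contents are an assumption based on the line quoted above. To apply it for real, `script` would instead be the path returned by system.file("perl", "xls2csv.pl", package = "gdata"), which may need write permission and is worth backing up first.

```r
# Temporary stand-in for xls2csv.pl containing only the offending line
script <- tempfile(fileext = ".pl")
writeLines('print OutFile "$outputLine \\n"', script)

lines <- readLines(script)
# fixed = TRUE treats "$" and "\" literally instead of as regex syntax
patched <- sub('$outputLine \\n', '$outputLine\\n', lines, fixed = TRUE)
writeLines(patched, script)

readLines(script)
```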
>
> -- 
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com

David Winsemius, MD
West Hartford, CT


