[R] Extra rows of 'NAs' in imported dataset

Patrick Burns pburns at pburns.seanet.com
Fri Jan 23 11:08:02 CET 2009


'The R Inferno' page 87 talks about getting
extra columns from data derived from spreadsheets;
the same mechanism can produce extra rows.
It happens because the spreadsheet program
thinks for some reason that the extra cells are
used -- a cell was probably clicked on.
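
As a rough sketch of how one might find and drop such blank cells
after import: the data frame below is a made-up stand-in for the
imported data (the values and the blank column "X" are assumptions
for illustration, not the poster's data).

```r
# Hypothetical stand-in for an imported data frame: two real rows,
# two blank rows, and one blank column of the kind a spreadsheet
# export can add.
a <- data.frame(
  ID       = c(88, NA, 756, NA),
  mmt.dose = c(10, NA, 10, NA),
  X        = c(NA, NA, NA, NA)
)

# TRUE where every value in the row (or column) is NA.
blank_rows <- rowSums(!is.na(a)) == 0
blank_cols <- colSums(!is.na(a)) == 0

# Keep only rows and columns that hold at least one real value.
a_clean <- a[!blank_rows, !blank_cols, drop = FALSE]

# Separately: a comparison against NA yields NA, and an NA in a
# logical subscript returns a row of NAs. which() keeps only the
# positions that are TRUE.
n_with_na  <- nrow(a[a$mmt.dose == 10, ])
n_matching <- nrow(a[which(a$mmt.dose == 10), ])
```

If the NA rows show up only when subsetting and not in the file
itself, the second part may be the more relevant check: missing
values in the column being compared are enough to produce them.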

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of "The R Inferno" and "A Guide for the Unwilling S User")

M-J Milloy wrote:
> Hello all: I'm hoping you can help me determine the source of this problem.
>
> I've just used read.csv to bring a small (581 rows, 9 vars) dataset into R
> (2.7.0, Mac OS 10.5.5). The dataset was created in Excel 2008 from a
> data dump from an Oracle database. I've done this many times before and had
> no problems.
>
> The dataset ("a") appears to have extra rows filled with NAs. For example,
>
> > a[a$mmt.dose == 10, ]
>        ID COHORT    F st.y st.m st.d days   md mmt.dose
> NA     NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.1   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.2   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.3   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.4   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.5   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> 222    88      V   PC   NA   NA   NA   NA MOSE       10
> NA.6   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.7   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.8   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.9   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> NA.10  NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
> 474   756      V    C 2004   10    1 1553 UNKN       10
>
> I've examined the original CSV file and also exported the "a" dataset to a
> CSV and found no source for these entries.
>
> Any help would be much appreciated!
>
>
> M-J
>
>
> --
>
> PhD student,
> School of Population and Public Health,
> University of British Columbia
> Musqueam Territory, British Columbia
>
> Research Assistant,
> Urban Health Research Institute,
> BC Centre for Excellence in HIV/AIDS
> St. Paul's Hospital,
> Vancouver, Canada