[R] Problem using read.xls - Everything converted to factors

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jun 3 17:13:43 CEST 2011


On Fri, Jun 3, 2011 at 10:24 AM, Sebastian Lerch <lerch at lavabit.com> wrote:
> Hello,
>
> I would like to use the read.xls function from the gdata package to read data
> from Microsoft Excel files but I experienced a problem: For example I used
> the following code:
>
> testfile<-read.xls("/home/.../wsjecon0603.xls", #file path
>           header=F,
>           dec=",",
>           na.strings="n.a.",
>           skip=5,
>           sheet=2,
>           col.names=c("Name", "Firm","GDP1","GDP2","GDP3","GDP4","CPI5",
>                       "CPI11","UNEMP5","UNEMP11","PROF03","PROF04","STARTS03","STARTS04"),
>           nrows=54,
>           #colClasses=c(character,character,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric)
> )
> print(testfile)
>
> Although the xls file contains numeric values in all the columns except the
> ones which I named "Name" and "Firm", every column in the data frame has
> class "factor". I tried the colClasses option as shown above, and also with
> quotation marks around each class name, but this does not work and I always
> receive the following error:
>
> Error in is(object, Class) :
>  trying to get slot "className" of an object of a basic class
> ("list") with no slots
> Calls: read.xls -> read.csv -> read.table -> <Anonymous> -> is
>
> After some hours of research I figured out how I can manually change the
> classes of the columns:
>
> testfile$GDP2<-as.numeric(levels(testfile$GDP2))[testfile$GDP2]
> testfile$Name<-as.character(levels(testfile$Name))[testfile$Name] #and so on
>
> This works, but is a lot of work since I have to import many different data
> sets. So I was wondering if there is another way to let the classes be
> recognized correctly.
>
> Additionally I would like to know if there is any way to import data from
> different sheets with the same layout at once into one data frame.
>
> I use Ubuntu 11.04 with RKWard, in case this matters.
>

Assuming you are using the gdata package, read.xls has a ... argument
which it passes on to read.table, so see ?read.table .  In particular,
as.is = TRUE prevents conversion to factors.  Note that any column which
has even one non-numeric value will not be regarded as numeric.  You can
rbind the results from different sheets if they have the same layout.
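
The points above can be sketched with base R alone. This is only an
illustration under stated assumptions: the CSV text and its values are
made up (they are not the poster's data), and read.csv is called directly
as a stand-in for the call that read.xls makes after converting the
sheet. The names csv_text, d1 to d4, sheets and all_rows are hypothetical.

```r
# Made-up stand-in for the CSV text that read.xls hands to read.csv.
csv_text <- 'Name,Firm,GDP1,GDP2
"A. Smith","Firm A",1.5,2.0
"B. Jones","Firm B",n.a.,3.1
'

# stringsAsFactors = TRUE mimics the default of R versions before 4.0:
# text columns, and GDP1 because of its "n.a." entry, become factors.
d1 <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# as.is = TRUE keeps text as character. GDP1 is still not numeric,
# because one of its entries is the non-numeric string "n.a.".
d2 <- read.csv(text = csv_text, as.is = TRUE)

# Declaring the missing-value marker lets GDP1 be detected as numeric.
d3 <- read.csv(text = csv_text, as.is = TRUE, na.strings = "n.a.")

# colClasses is given as a character vector of quoted class names.
d4 <- read.csv(text = csv_text, na.strings = "n.a.",
               colClasses = c("character", "character", "numeric", "numeric"))

# Sheets with the same layout can be read in a loop and stacked.
# With gdata the loop body would be something like
#   read.xls(path, sheet = s, as.is = TRUE, na.strings = "n.a.")
# where path is your file; two text chunks stand in for two sheets here.
sheets <- list(csv_text, csv_text)
all_rows <- do.call(rbind, lapply(sheets, function(txt)
  read.csv(text = txt, as.is = TRUE, na.strings = "n.a.")))

print(sapply(d1, class))
print(sapply(d3, class))
print(all_rows)
```

Whether the same arguments resolve the problem for a particular workbook
depends on what the converted sheet actually contains, for example stray
header rows or other non-numeric markers in the numeric columns.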

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


