[R] How would I analyse data like this?

Jason Turner jasont at indigoindustrial.co.nz
Wed Mar 19 20:13:18 CET 2003


On Wed, Mar 19, 2003 at 12:40:20PM -0500, laurent.duperval at microcell.ca wrote:
> On 19 Mar, james.holtman at convergys.com wrote:
> > Have you tried:
> >       data <- read.table("data.dat", header=TRUE, sep="|", as.is=TRUE)
> > 
> 
> Yes I did. However, it takes a LOT more time because of the date/time
> string. The result looks like this:
> 
> 
> str(data)
> `data.frame':	317437 obs. of  8 variables:
>  $ phone   : num  1.52e+10 1.42e+10 1.82e+10 1.65e+10 1.65e+10 ...
>  $ state   : int  3 3 3 3 3 3 3 3 3 3 ...
>  $ code    : int  983 983 983 983 3000 983 983 983 983 5203 ...
>  $ amount  : int  1000 1000 2500 2500 2500 1000 1000 2500 2500 2500 ...
>  $ left    : int  260 0 0 25 0 1260 273 0 0 0 ...
>  $ channel : Factor w/ 5 levels "CSR","IN","IVR",..: 2 5 4 2 3 2 2 3 4 3 ...
>  $ time    : Factor w/ 312198 levels "2002-10-16 ..",..: 1 2 3 4 5 6 7 8 9 10 ...
>  $ mtd     : Factor w/ 2 levels "C","D": 1 1 1 1 1 1 1 1 1 1 ...
> 
> I think the 312198 factor level is wrong. Also, the phone column is  a string,
> not a number. I didn't see how to specify that with read.table(). (In my
> original post, I think I forgot to mention that I had over 300,000 entries in
> my file).

Check out the colClasses argument to read.table.  Something like...

library(methods) #necessary for colClasses

# One class per column, in file order:
# phone, state, code, amount, left, channel, time, mtd
data <- read.table("data.dat", header=TRUE, sep="|",
                   colClasses=c("character","integer","integer",
                                "integer","integer","character",
                                "character","character"))

You can convert the items you need to be factors after they're loaded,
like this...

data$mtd <- factor(data$mtd)
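
The time column can be handled the same way once it has been read in as
character. Here is a rough sketch of the whole round trip on a made-up
two-row sample. The sample values, the name "dat", and the timestamp format
"%Y-%m-%d %H:%M:%S" are my assumptions (the str() output above truncates the
timestamps), so adjust the format string to match the real file. It also uses
the text= argument of read.table, which needs a more recent R than the one
current at the time of this post.

```r
# Hypothetical sample in the same "|"-separated layout as data.dat.
# The values and the timestamp format are assumptions, not the real data.
sample_lines <- c(
  "phone|state|code|amount|left|channel|time|mtd",
  "15145550001|3|983|1000|260|IN|2002-10-16 08:15:00|C",
  "14165550002|3|983|2500|0|IVR|2002-10-16 08:15:07|D"
)

# Read everything as plain character/integer; no factors are built on load.
dat <- read.table(text = sample_lines, header = TRUE, sep = "|",
                  colClasses = c("character", "integer", "integer",
                                 "integer", "integer", "character",
                                 "character", "character"))

# Convert after loading: factors for the categorical columns,
# POSIXct for the timestamps (format string is assumed).
dat$channel <- factor(dat$channel)
dat$mtd     <- factor(dat$mtd)
dat$time    <- as.POSIXct(dat$time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

str(dat)
```

With phone kept as character it is no longer shown in scientific notation,
and time becomes a real date/time column rather than a factor with one level
per row.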

Hope it helps

Jason
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
jasont at indigoindustrial.co.nz


