[R] Recommendation for dealing with mixed input types in CSV

David Winsemius dwinsemius at comcast.net
Thu Oct 29 21:02:29 CET 2009


On Oct 29, 2009, at 3:26 PM, Jason Rupert wrote:

> Currently I have a CSV with mixed input types that I am trying to  
> read in and reformat without having to list off all the column  
> names.  Below is an example of the data:
>
> HouseColor, HouseSize, HouseCost
> Blue, 1600, 160e3
> Blue, 1600, 160e3
>
> Actually I have about 60 columns like this, so imagine the above  
> repeated about 30 times column-wise.
>
> Luckily the ones in scientific notation are grouped together, i.e.  
> columns 11-56.
>
> Using read.csv or as.numeric, is there a way to convert all those in  
> scientific format over to general numeric syntax?

Option 1: do it in the read step. (In my experience this is the more
difficult and error-prone method when you are starting out.)

?read.table

See the section on colClasses, and define each column as "character" or
"numeric" as appropriate.
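
A minimal sketch of the colClasses route, not a tested recipe for your
file: the three-column text below is a made-up stand-in for the real CSV,
and columns 2:3 play the part of your columns 11:56. Leaving an entry as
NA lets read.csv() guess that column's type, so only the numeric block
has to be spelled out.

```r
# Hypothetical miniature of the real file (3 columns instead of ~60).
csv_text <- "HouseColor, HouseSize, HouseCost
Blue, 1600, 160e3
Blue, 1600, 160e3"

# One entry per column, in file order. NA means "let read.csv decide".
# For the real file this would be something like rep(NA, <number of columns>)
# with classes[11:56] <- "numeric".
classes <- rep(NA, 3)
classes[2:3] <- "numeric"

DFhouses <- read.csv(text = csv_text, colClasses = classes,
                     strip.white = TRUE)

# Inspect the resulting column types.
str(DFhouses)
```

If any field in a column declared "numeric" cannot be parsed as a number,
the read stops with an error, which is one reason this route can be
frustrating at first.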

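Option 2: read everything in first and convert afterwards.

Read the file with as.is=TRUE (stringsAsFactors=FALSE has the same
effect), so the columns are not turned into factors, then convert them
in a loop:

for (i in 11:56) DFhouses[, i] <- as.numeric( DFhouses[, i] )

as.numeric() works on one vector at a time, which is why it has to be
applied column by column rather than to the whole block of columns.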
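
The same conversion can be written without an explicit loop. This is only
a sketch: the toy data frame and the name num_cols are invented for
illustration, with columns 2:3 standing in for 11:56.

```r
# Hypothetical miniature of the data, read in as character.
DFhouses <- data.frame(HouseColor = c("Blue", "Blue"),
                       HouseSize  = c("1600", "1600"),
                       HouseCost  = c("160e3", "160e3"),
                       stringsAsFactors = FALSE)

num_cols <- 2:3   # would be 11:56 for the real file

# Apply as.numeric() to each selected column; assigning back into
# DFhouses[num_cols] keeps the result a data frame.
DFhouses[num_cols] <- lapply(DFhouses[num_cols], as.numeric)

# If a column was read as a factor, convert via as.character() first;
# as.numeric() on a factor returns the internal level codes instead.
f <- factor(c("160e3", "170e3"))
converted <- as.numeric(as.character(f))
```

The as.character() step is the usual reason for reading with as.is=TRUE
in the first place: it avoids factors and so avoids that detour.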


>
> Right now I have something like the following
> input_df<-read.csv(InputFile, skip=0, header=TRUE, strip.white = TRUE)
>
> I tried:
> as.numeric(input_df[, 11:56])
> but this returns an error
> Error: (list) object cannot be coerced to type 'double'
>
> Oddly it does appear to work successfully row-wise:
> as.numeric(input_df[1, 11:56])
> as.numeric(input_df[2, 11:56])
> etc.
>
> However, trying it on multiple rows produces the same error as above:
> as.numeric(input_df[1:2, 11:56])
>
> After a bit, I became a bit frustrated that this was not working so  
> I tried just deleting the columns:
> input_df[1, 11:56]<-NULL
>
> This also failed, so are there any suggestions about how to convert  
> the values in scientific notation over to standard numeric syntax?
>
> Thank you again for all your insights and feedback.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



