[R] retaining characters in a csv file

Arunkumar Srinivasan aragorn168b at gmail.com
Wed Sep 23 01:11:02 CEST 2015


data.table's fread reads this as expected: quoted fields are kept as
character rather than coerced to numeric.

library(data.table)
sapply(fread('5724550,"000202075214",2005.02.17,2005.02.17,"F"\n'), class)
#          V1          V2          V3          V4          V5
#   "integer" "character" "character" "character" "character"
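
If swapping out read.csv is not practical, a named colClasses entry may be
enough. This is only a minimal sketch, not something tested against the httr
case: the header names (id, code, start, end, sex) are made up for
illustration, and it assumes the name of the affected column is known in
advance, which the original question does not guarantee.

```r
# Hypothetical two-line file: header names are invented for this sketch
csv_text <- paste0(
  'id,code,start,end,sex\n',
  '5724550,"000202075214",2005.02.17,2005.02.17,"F"\n'
)

# Naming a column in colClasses forces that column to character;
# columns not named are left as NA and go through the usual type conversion
dat <- read.csv(text = csv_text, colClasses = c(code = "character"))

dat$code           # leading zeros are retained
sapply(dat, class)
```

Because colClasses is matched by name here, the position of the column in the
file should not matter, but every such column would have to be listed.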

Best,
Arun.

On Wed, Sep 23, 2015 at 12:00 AM, Therneau, Terry M., Ph.D.
<therneau at mayo.edu> wrote:
> I have a csv file from an automatic process (so this will happen thousands
> of times), for which the first row is a vector of variable names and the
> second row often starts something like this:
>
> 5724550,"000202075214",2005.02.17,2005.02.17,"F", .....
>
> Notice the second variable which is
>       a character string (note the quotation marks)
>       a sequence of numeric digits
>       leading zeros are significant
>
> The read.csv function insists on turning this into a numeric.  Is there any
> simple set of options that
> will turn this behavior off?  I'm looking for a way to tell it to "obey the
> bloody quotes" -- I still want the first, third, etc columns to become
> numeric.  There can be more than one variable like this, and not always in
> the second position.
>
> This happens deep inside the httr library; there is an easy way for me to
> add more options to the read.csv call but it is not so easy to replace it
> with something else.
>
> Terry T
