[R] Read_fwf in package readr, double vs. numeric

Doran, Harold HDor@n @end|ng |rom @|r@org
Wed Apr 24 16:56:28 CEST 2019


Suppose I have the following data in a fixed-width (fwf) file 'foo.txt'. The point of this email is to ask the group how to properly read the value "1e-20" in this pseudo-data using the read_fwf function in the readr package.

11e-201043
1712201043
1912201055

First, suppose I read it this way, with "D" (double precision) as the type of the second column.

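library(readr)
myFile <- 'foo.txt'
# Columns occupy positions 1, 2-6 and 7-10
pos <- fwf_positions(c(1,2,7), c(1,6,10))
type <- c('N','D','N')
types <- paste0(type, collapse = '')
# Lower-case the layout codes to get readr's compact col_types ('n', 'c', 'd')
types <- chartr('NCD', 'ncd', types)

read_fwf(file = myFile, col_positions = pos, col_types = types)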

# A tibble: 3 x 3
     X1       X2    X3
  <dbl>    <dbl> <dbl>
1     1 1.00e-20  1043
2     1 7.12e+ 4  1043
3     1 9.12e+ 4  1055

This seemingly works well and properly captures the value. However, if I instead tell the function that *all* of my columns are numeric (replacing the 'type' line above with this one)

type <- c('N','N','N')

# A tibble: 3 x 3
     X1    X2    X3
  <dbl> <dbl> <dbl>
1     1     1  1043
2     1 71220  1043
3     1 91220  1055
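
One way to isolate the difference between the two runs, as a rough sketch: in readr's compact col_types string, 'n' stands for col_number() and 'd' for col_double(), so the two underlying parsers can be tried on the field by itself. The variable names are only illustrative, and what parse_number() returns here may depend on the installed readr version.

```r
# Parse the middle field of the first record on its own, once with each parser.
field <- '1e-20'

if (requireNamespace('readr', quietly = TRUE)) {
  as_number <- readr::parse_number(field)  # parser behind col type 'n'
  as_double <- readr::parse_double(field)  # parser behind col type 'd'
  print(c(number = as_number, double = as_double))
}
```

col_number() is meant as a lenient parser for numbers embedded in other text (currency symbols, percent signs, grouping marks), which is a different job from reading a plain double.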

The value is not read correctly: 1e-20 comes back as 1. Here is the pragmatic issue. I have a legacy program that writes out the layout of the fwf file (start and end positions) along with the column types. The layout file we receive always uses a column type of numeric (N) for any numeric column, including the one holding values such as 1e-20.

This layout file will not change, so I need to solve the problem within my read-in program. I suppose one option is to change any "N" to "D" in my R code. That seems to work, but I am not sure whether that is the "right" way to solve this issue.
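
A minimal sketch of that option, assuming the layout only ever uses the codes N, C and D: map both N and D to readr's 'd' when building the col_types string. The helper name layout_to_col_types is hypothetical and not part of readr, and the pseudo-data is written to a temporary file only to keep the example self-contained.

```r
# Hypothetical helper (not part of readr): translate the legacy layout codes
# into readr's compact col_types string, reading every numeric column as a
# double ('d') rather than a number ('n').
layout_to_col_types <- function(type) {
  # 'N' and 'D' both map to 'd'; 'C' maps to 'c'.
  paste0(chartr('NCD', 'dcd', type), collapse = '')
}

type  <- c('N', 'N', 'N')            # as delivered by the layout file
types <- layout_to_col_types(type)   # "ddd"

# Write the pseudo-data to a temporary file so the example runs on its own.
tmp <- tempfile(fileext = '.txt')
writeLines(c('11e-201043', '1712201043', '1912201055'), tmp)

if (requireNamespace('readr', quietly = TRUE)) {
  pos <- readr::fwf_positions(c(1, 2, 7), c(1, 6, 10))
  dat <- readr::read_fwf(file = tmp, col_positions = pos, col_types = types)
  print(dat)
}
```

The trade-off is that a column read as a double must contain plain numeric text; a field with currency symbols or grouping marks would presumably still need 'n'.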

Thanks
Harold


