[R] read columns of quoted numbers as factors

Peter Dalgaard pdalgd at gmail.com
Wed Oct 6 08:55:36 CEST 2010


On 10/06/2010 02:41 AM, james hirschorn wrote:
> Yes, your solution of setting quote="" would read the multi-word strings 
> incorrectly. A more complicated version of your solution should work: First 
> check which columns are identified as strings, and then apply your solution to 
> the remaining columns.

Probably more painful than that if column separators can appear in
strings. The best I can think of is to reread the columns that get
classified as numeric, this time with colClasses="numeric", and see
whether the read fails (quotes are not stripped from numeric fields, so
a quoted number should trigger an error). A general solution likely
requires changing scan() at the C level.
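
A rough sketch of that reread idea, for what it is worth. The helper
name read_quoted_as_factor is made up for illustration and is not part
of R. It assumes the input is a file path that can be read more than
once, that the file has no row-names column (so colClasses lines up
with the columns of the result), and that scan() raises an error when
it meets a quoted number in a numeric field.

```r
# Hypothetical helper (not part of base R): read a table, then turn any
# column of quoted numbers into a factor.
read_quoted_as_factor <- function(file, ...) {
  # First pass: let read.table() classify the columns as usual.
  df <- read.table(file, ..., stringsAsFactors = TRUE)

  for (j in which(sapply(df, is.numeric))) {
    # Second pass: force column j to be numeric and read everything
    # else as character. Quotes are only interpreted in character
    # fields, so a quoted number makes this read fail.
    cls <- rep("character", ncol(df))
    cls[j] <- "numeric"
    ok <- tryCatch({
      read.table(file, ..., colClasses = cls)
      TRUE
    }, error = function(e) FALSE)

    # A failed reread is taken to mean the column was quoted.
    if (!ok) df[[j]] <- factor(df[[j]])
  }
  df
}

# Usage: a factor of numbers is written quoted, an integer column is not.
tmp <- tempfile()
d <- data.frame(id = factor(c(10, 20, 30)), n = 1:3)
write.table(d, tmp, row.names = FALSE)
r <- read_quoted_as_factor(tmp, header = TRUE)
str(r)
```

This rereads the whole file once per numeric column, so it is only
meant to show the principle, not to be efficient.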

> 
> I'm a newbie at R, but it seems to me that there is a "logical inconsistency" in 
> R: write.table puts quotes around numbers when they form a column of factors, 
> but does not put quotes for a column of integers. Since read.table is the "dual" 
> of write.table it seems that it should treat quoted and unquoted columns 
> differently, analogously to write.table. However, there does not even seem to be 
> an option to make read.table behave analogously.

Yes, and far from the only such case in R. (Even more annoying to my
eyes is that factor levels get reordered alphabetically when the file
is read back, so write.table is really not an option for storing data
frames anyway.)
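
A small illustration of the level reordering, as a sketch only. The
data frame and its column names are invented for the example, and
stringsAsFactors = TRUE is passed explicitly so that strings come back
as factors.

```r
# A factor whose levels are in a deliberate, non-alphabetical order.
d <- data.frame(size = factor(c("small", "large", "medium"),
                              levels = c("small", "medium", "large")),
                n = 1:3)

tmp <- tempfile()
write.table(d, tmp, row.names = FALSE)
back <- read.table(tmp, header = TRUE, stringsAsFactors = TRUE)

levels(d$size)     # "small"  "medium" "large"
levels(back$size)  # "large"  "medium" "small"
```

The values survive the round trip, but the level order is not stored
in the file, so it has to be rebuilt by hand after reading.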

However, the quoting of factor levels on output from write.table is not
happening to distinguish numbers from character strings. Rather, it is
for potentially multi-word level names.
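
To make the multi-word point concrete, here is a minimal sketch with an
invented data frame. With the default space separator, the quotes are
what keep a two-word level name together as a single field.

```r
d <- data.frame(g = factor(c("treated group", "control")), n = 1:2)

tmp <- tempfile()
write.table(d, tmp, row.names = FALSE)
lines <- readLines(tmp)
lines
# "\"g\" \"n\""  "\"treated group\" 1"  "\"control\" 2"

# Reading back with the default quote setting keeps the level intact.
back <- read.table(tmp, header = TRUE, stringsAsFactors = TRUE)
levels(back$g)     # "control" "treated group"
```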


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


