[R] read a file of text with read.table

Frede Aakmann Tøgersen frtog at vestas.com
Thu Jun 26 10:47:23 CEST 2014


Hi

Actually I had to read the help page (?read.table) before answering Carol. Here are the relevant parts:

stringsAsFactors: logical: should character vectors be converted to
          factors?  Note that this is overridden by 'as.is' and
          'colClasses', both of which allow finer control.
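
To illustrate the "overridden" part, here is a minimal sketch. The two-column input is made up for the example and is passed through the 'text' argument instead of a file; stringsAsFactors = TRUE is set explicitly so that the result does not depend on the default of the installed R version.

```r
# Hypothetical input, supplied inline instead of from a file.
txt <- "id name
1 abc
2 def"

# stringsAsFactors = TRUE alone: the character column becomes a factor.
d1 <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
class(d1$name)

# colClasses takes precedence: every column is kept as character,
# even though stringsAsFactors = TRUE is still set.
d2 <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE,
                 colClasses = "character")
sapply(d2, class)
```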

So setting colClasses should work. Here are the instructions for the colClasses argument:

colClasses: character.  A vector of classes to be assumed for the
          columns.  Recycled as necessary, or if the character vector
          is named, unspecified values are taken to be 'NA'.

          Possible values are 'NA' (the default, when 'type.convert' is
          used), '"NULL"' (when the column is skipped), one of the
          atomic vector classes (logical, integer, numeric, complex,
          character, raw), or '"factor"', '"Date"' or '"POSIXct"'.
          Otherwise there needs to be an 'as' method (from package
          'methods') for conversion from '"character"' to the specified
          formal class.

          Note that 'colClasses' is specified per column (not per
          variable) and so includes the column of row names (if any).
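
A small sketch of two of the forms described above, a named vector and "NULL". The three-column input and its column names are invented for the example.

```r
# Hypothetical input with a header line.
txt <- "id name score
1 abc 2.5
2 def 3.5"

# Named vector: only 'name' is forced to character. The unspecified
# columns are taken to be NA, so type.convert() chooses their types.
d1 <- read.table(text = txt, header = TRUE,
                 colClasses = c(name = "character"))
sapply(d1, class)

# "NULL" skips a column entirely; here the second column is dropped.
d2 <- read.table(text = txt, header = TRUE,
                 colClasses = c("integer", "NULL", "numeric"))
names(d2)
```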

And for completeness:

   as.is: the default behavior of 'read.table' is to convert character
          variables (which are not converted to logical, numeric or
          complex) to factors.  The variable 'as.is' controls the
          conversion of columns not otherwise specified by
          'colClasses'.  Its value is either a vector of logicals
          (values are recycled if necessary), or a vector of numeric or
          character indices which specify which columns should not be
          converted to factors.

          Note: to suppress all conversions including those of numeric
          columns, set 'colClasses = "character"'.

          Note that 'as.is' is specified per column (not per variable)
          and so includes the column of row names (if any) and any
          columns to be skipped.
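
As a sketch of the character-index form of 'as.is' (the input and column names are again made up): only the named column is protected from conversion, while the other character column still becomes a factor because stringsAsFactors = TRUE is set explicitly.

```r
# Hypothetical input with two character columns.
txt <- "id name city
1 abc X
2 def Y"

# as.is = "name" keeps 'name' as character; 'city' is not listed,
# so it is converted to a factor.
d <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE,
                as.is = "name")
sapply(d, class)
```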

I think these are the only arguments that influence the conversions.







Yours sincerely


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling

Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
frtog at vestas.com
http://www.vestas.com



> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Rolf Turner
> Sent: 26 June 2014 10:39
> To: carol white
> Cc: r-help at r-project.org
> Subject: Re: [R] read a file of text with read.table
> 
> 
> On 26/06/14 19:32, carol white wrote:
> 
> > It might be a primitive question
> 
> All questions are primitive; some questions are more primitive than others.
> 
> > but I have a file of text and there
> > is no separator between characters on each line and the strings on
> > each line have the same length. The format is like the following
> > absfjdslf
> > jfdldskjff
> > jfsldfjslk
> >
> > When I read the file with read.table("myfile",colClasses =
> > "character"), instead of putting the strings in a table of number of
> > rows x length of string, read.table saves the file in a table of
> > number of rows x 1 and each element seems to be a factor. Why does
> > read.table not account for  colClasses = "character"?
> 
> (1) You might try setting stringsAsFactors=FALSE rather than
> colClasses = "character".
> 
> (2) Since your "table" has only one column you might as well use scan()
> (with what="") and save wear and tear on the system.
> 
> (3) In your example the strings do *not* have the same length; the first
> has 9 characters, the next two have 10 each.
> 
> (4) Do you want to get a data frame each column of which is a single
> character?  This was not clear from your email.  Do you know how to do
> this?  (It's easy --- when the string lengths are indeed all the same.)
> I appended a "g" to the first string and did:
> 
> ttt <- scan("temp.txt",what="")
> sss <- strsplit(ttt,"")
> rrr <- as.data.frame(do.call(rbind,sss))
> rrr
>    V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
> 1  a  b  s  f  j  d  s  l  f   g
> 2  j  f  d  l  d  s  k  j  f   f
> 3  j  f  s  l  d  f  j  s  l   k
> 
> Is this what you want?
> 
> cheers,
> 
> Rolf Turner
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


