[R] rewrite a data file use write.table(), count.fields() show different pattern, any suggestion appreciated.

Yong Wang wangyong1 at gmail.com
Tue May 22 16:30:06 CEST 2007


Thank you for the suggestion, Dr. Ripley.

However, I am a little confused. My understanding is that you
suspect the fields that should be quoted (factor or character fields)
contain tabs.

If this is the case, count.fields() should detect the tab,
read.table(sep="\t") should read with the same awareness, and if
write.table(sep="\t") writes those fields separated by tabs, as
recognized by read.table(sep="\t"), the two field counts should be
the same.
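
One way to check this reasoning is a small experiment. The sketch below
is only an illustration under stated assumptions: the data frame df and
both temporary files are made up, and the assumption is that one
character value contains an embedded tab. It writes the same data once
with quote=TRUE and once with quote=FALSE, then counts fields in each
file.

```r
# Hypothetical toy data: the second value of column b contains an embedded tab.
df <- data.frame(a = c("x", "y"),
                 b = c("plain", "has\ttab"),
                 stringsAsFactors = FALSE)

quoted   <- tempfile(fileext = ".txt")
unquoted <- tempfile(fileext = ".txt")

# Same data, same separator; only the quote argument differs.
write.table(df, file = quoted, sep = "\t", quote = TRUE,
            row.names = FALSE, col.names = FALSE)
write.table(df, file = unquoted, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# count.fields() honours quotes by default, so a tab inside a quoted
# field is not treated as a separator.
cf_quoted   <- count.fields(quoted, sep = "\t")
cf_unquoted <- count.fields(unquoted, sep = "\t")

print(cf_quoted)    # 2 fields on each line
print(cf_unquoted)  # the second line now appears to have 3 fields
```

Once the quotes are dropped, the embedded tab in the second line can no
longer be told apart from a separator, so the field count for that line
goes up even though the data frame itself did not change.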

Anyway, I will try to redo it per your suggestion.
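
A sketch of what redoing it might look like, assuming the original file
contains a quoted field with an embedded tab. The files orig and out are
temporary files made up for illustration, and col.names=FALSE is an
addition here so that no header line is written to the new file.

```r
# Hypothetical original file: the second line has a quoted field with a tab in it.
orig <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "d\t\"e\tf\"\tg"), orig)

dat <- read.table(orig, header = FALSE, sep = "\t", fill = TRUE,
                  colClasses = "character")

# Write with quote = TRUE (the default) so embedded tabs stay inside quotes.
out <- tempfile(fileext = ".txt")
write.table(dat, file = out, sep = "\t", quote = TRUE,
            row.names = FALSE, col.names = FALSE)

cf_orig <- count.fields(orig, sep = "\t")
cf_new  <- count.fields(out, sep = "\t")

# Report the line numbers whose field counts differ between the two files.
if (length(cf_orig) == length(cf_new)) {
  mismatch <- which(cf_orig != cf_new)
} else {
  mismatch <- NA  # different number of lines; a line-by-line comparison is not meaningful
}
print(mismatch)
```

An empty result means every line has the same field count before and
after the round trip; any line numbers printed point to the rows worth
inspecting by hand.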

Regards
yong


On 5/22/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> If you write out unquoted fields, how do you know they do not contain
> tabs?
>
> The default is quote=TRUE for a good reason.
>
> On Tue, 22 May 2007, Yong Wang wrote:
>
> > Dear all:
> >
> > I read in a tab-delimited dataset and then wrote it out as another
> > file, as follows. I did this simply to make sure I understand the
> > behavior of these commands.
> >
> > data <- read.table(file, header=FALSE, sep="\t", fill=TRUE,
> >                    colClasses="character")
> > write.table(data, file="newdata.txt", eol="\n", sep="\t",
> >             quote=FALSE, row.names=FALSE)
> >
> >
> > cf1 <- count.fields("newdata.txt", sep="\t")
> > table(cf1)
> >   13   17   23
> >   10  126 5445
> >
> > # is different from
> >
> > cf2 <- count.fields(file, sep="\t")
> > table(cf2)
> >   13   17   23   33
> >   10  106 5433   32
> >
> > the worst problem is the maximal value of cf1 (33) is larger than the
> > maximal value of cf2 (23) which is the right number of fields for most
> > rows in the original file.
> >
> > I need to use write.table for some important data manipulation work,
> > so any suggestions are highly appreciated.
> >
> > Best Regards
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>


