[R] File conca.

PIKAL Petr petr.pikal using precheza.cz
Wed Nov 6 08:50:56 CET 2019


Hi

see my comments in line

> -----Original Message-----
> From: Val <valkremk using gmail.com>
> Sent: Wednesday, November 6, 2019 3:24 AM
> To: PIKAL Petr <petr.pikal using precheza.cz>
> Cc: r-help using R-project.org (r-help using r-project.org) <r-help using r-project.org>
> Subject: Re: [R] File conca.
>
> Thank you Petr and Jeff for your suggestions.
>
> I made some improvements but still need some tweaking. I could not get
> the folder names added correctly to each row. Only the last folder name was
> added.
> table(Alldata$foldername) resulted in
>    week2
>     25500
>
> Please see the complete code:
>
> ####################################################
> folders=c("week1","week2")
> for(i in folders){
>   path=paste("\data\"", i , sep = "")
>   wd <-  setwd(path)
>   Flist = list.files(path,pattern = "^WT")
>   dataA =  lapply(Flist, function(x)read.csv(x, header=T))
>   setwd(wd)
>   temp = do.call("rbind", Alldata)


Shouldn't this rbind dataA, the list you just read, rather than Alldata?
temp = do.call("rbind", dataA)


This is the problematic piece:

>   temp$foldername <- i # this seems to be OK

But these two lines, in each cycle, overwrite Alldata with the current temp
and then rbind the same temp to it again, so only the last folder survives
(with its rows duplicated).

>   Alldata <- temp
>   Alldata <- rbind(Alldata, temp)
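
To make the effect of those two lines concrete, here is a small sketch that replaces the files with toy data frames. The names per_folder and wt and all values are made up for illustration; only the two assignments come from the code above.

```r
# Toy stand-ins for the per-folder data (hypothetical values, no files needed).
per_folder <- list(
  week1 = data.frame(wt = c(1, 2)),
  week2 = data.frame(wt = c(3, 4, 5))
)

for (i in names(per_folder)) {
  temp <- per_folder[[i]]
  temp$foldername <- i
  Alldata <- temp                  # discards everything accumulated so far
  Alldata <- rbind(Alldata, temp)  # appends the same rows a second time
}

# Only week2 is left, and each of its rows appears twice.
table(Alldata$foldername)
```

If the same thing happens with the real files, it would explain why table() reports only week2.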

I understand from your description that you want all data from all files in 
one Alldata object.

You could either read the files from the first folder and put them into
Alldata **before** your cycle:

Alldata <- temp

Once Alldata is initialised that way, the cycle only needs

Alldata <- rbind(Alldata, temp)

to add the data from the other folders.
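
A minimal sketch of this first option, under some assumptions: the demo folders and files are created in tempdir() just so the code runs anywhere, read_folder is a hypothetical helper, and list.files(..., full.names = TRUE) is used instead of setwd() so the working directory never changes.

```r
# Throwaway demo data: two folders with small WT*.csv files (hypothetical).
root <- file.path(tempdir(), "demo_init")
folders <- c("week1", "week2")
for (f in folders) {
  dir.create(file.path(root, f), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(id = 1:2, wt = c(10, 20)),
            file.path(root, f, "WT01.csv"), row.names = FALSE)
  write.csv(data.frame(id = 3:4, wt = c(30, 40)),
            file.path(root, f, "WT02.csv"), row.names = FALSE)
}

# Hypothetical helper: read every WT*.csv in one folder and tag the rows.
read_folder <- function(folder) {
  files <- list.files(file.path(root, folder), pattern = "^WT",
                      full.names = TRUE)
  temp <- do.call("rbind", lapply(files, read.csv, header = TRUE))
  temp$foldername <- folder
  temp
}

Alldata <- read_folder(folders[1])   # first folder, before the cycle
for (i in folders[-1]) {             # remaining folders only
  Alldata <- rbind(Alldata, read_folder(i))
}

table(Alldata$foldername)
```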

Or you could use a counter variable to check whether it is the first run.

something like

k <- 0

for(i in folders){...
k <- k+1
....
if (k==1)    Alldata <- temp else Alldata <- rbind(Alldata, temp)
...
}
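
Filled in, the counter version could look like the sketch below. The demo files under tempdir() are invented so the example is self-contained, and full.names = TRUE replaces the setwd() calls; the loop body otherwise follows your code.

```r
# Throwaway demo data (hypothetical): one 3-row and one 2-row file per folder.
root <- file.path(tempdir(), "demo_counter")
folders <- c("week1", "week2")
for (f in folders) {
  dir.create(file.path(root, f), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(id = 1:3, wt = c(5, 6, 7)),
            file.path(root, f, "WT01.csv"), row.names = FALSE)
  write.csv(data.frame(id = 4:5, wt = c(8, 9)),
            file.path(root, f, "WT02.csv"), row.names = FALSE)
}

k <- 0
for (i in folders) {
  k <- k + 1
  Flist <- list.files(file.path(root, i), pattern = "^WT", full.names = TRUE)
  dataA <- lapply(Flist, function(x) read.csv(x, header = TRUE))
  temp <- do.call("rbind", dataA)   # all files of this folder
  temp$foldername <- i              # tag every row with the folder name
  # First run creates Alldata, later runs append to it.
  if (k == 1) Alldata <- temp else Alldata <- rbind(Alldata, temp)
}

table(Alldata$foldername)
```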

Cheers
Petr

> }
> #######################################################
> Any suggestion please?
>
>
> On Tue, Nov 5, 2019 at 2:13 AM PIKAL Petr <petr.pikal using precheza.cz> wrote:
> >
> > Hi
> >
> > Help with such operations is rather tricky as only you know the exact
> > structure of your folders.
> >
> > see some hints in line
> >
> > > -----Original Message-----
> > > From: R-help <r-help-bounces using r-project.org> On Behalf Of Val
> > > Sent: Tuesday, November 5, 2019 4:33 AM
> > > To: r-help using R-project.org (r-help using r-project.org)
> > > <r-help using r-project.org>
> > > Subject: [R] File conca.
> > >
> > > Hi All,
> > >
> > > I have data files in several folders and want to combine all these
> > > files into one file. In each folder there are several files, and these
> > > files have the same structure but different names. First, in each
> > > folder I want to concatenate (rbind) all files into one file. While
> > > I am reading and concatenating (rbind) the files, I want to add the
> > > folder name as one variable in each row. I am reading the folder
> > > names from a file, and for demonstration I am using only two folders,
> > > as shown below.
> > > Data\week1             # folder name 1
> > >            WT13.csv
> > >            WT26.csv           ...
> > >            WT10.csv
> > > Data\week2            #folder name 2
> > >            WT02.csv
> > >            WT12.csv
> > >
> > > Below please find  my attempt,
> > >
> > > folders=c("week1","week2")
> > > for(i in folders){
> > >   path=paste("\data\"", i , sep = "")
> > >   setwd(path)
> >
> > you should use
> > wd <- setwd(path)
> >
> > which keeps the original directory for subsequent use
> >
> > >   Flist = list.files(path,pattern = "^WT")
> > >   dataA =  lapply(Flist, function(x)read.csv(x, header=T))
> > >   Alldata = do.call("rbind", dataA)     # combine all files
> > >   Alldata$foldername=i                  # adding the folder name
> > >
> >
> > now you can do
> >
> > setwd(wd)
> >
> > to return to original directory
> > }
> >
> > > The above works for one folder but how can I do it for more
> > > than one folder?
> >
> > You also need to decide if you want all data from all folders in one
> > object called Alldata or if you want several Alldata objects, one for
> > each folder.
> >
> > In the second case you could use a list structure for Alldata. In the first
> > case you could store the data from each folder in some temporary object
> > and use rbind directly.
> >
> > something like
> >
> > temp <- do.call("rbind", dataA)
> > temp$foldername <- i
> >
> > Alldata <- temp
> > in the first cycle
> > and
> > Alldata <- rbind(Alldata, temp)
> > in second and all others.
> >
> > Or you could initialise Alldata manually first and use only
> >
> > Alldata <- rbind(Alldata, temp)
> >
> > in your loop.
> >
> > Cheers
> > Petr
> >
> > >
> > > Thank you in advance,
> > >

