[R] Sort across multiple csv

David Winsemius dwinsemius at comcast.net
Fri May 18 20:16:26 CEST 2012


On May 18, 2012, at 12:56 PM, Matthew Ouellette wrote:

> Dear R help list,
>
> I am very new to R and I apologize in advance if this has been answered
> before.  I have done my best to google/R search what I need but no luck.
> Here is what I am attempting:
>
> I have hundreds of .csv files that I need to sort based on a single column
> of alphanumeric data.  All of the files contain matrices that have
> identical dimensions and headers, however the data table doesn't begin
> until the 74th line in each file.  Doing some searching, I have been able
> to create an object with elements consisting of each file in the folder
> containing the targets (please note this is my working directory):
>
> filenames<-list.files()
> alldata<-lapply(filenames, read.csv, skip=73, header=TRUE)
>
> At this point I believe I have created an object with N elements (where N=#
> files in the wd), each containing the matrix I am attempting to sort.  I am
> completely lost as to how I can sort each matrix

You should learn to use precise terminology to refer to R objects. You
have a list of data frames (not matrices).

You can loop over them and return a list of transformed (e.g., sorted)
data frames:

alldata <- lapply(alldata, function(x) x[order(x[["Name"]]), ])
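
As a small self-contained illustration of that idea (the toy data frames and the "Value" column are made up here; only the "Name" column comes from your description), something like this should show the effect:

```r
# Toy stand-in for the list returned by lapply(filenames, read.csv, ...)
alldata <- list(
  data.frame(Name = c("b2", "a10", "a1"), Value = c(2, 10, 1),
             stringsAsFactors = FALSE),
  data.frame(Name = c("z", "m", "c"), Value = c(26, 13, 3),
             stringsAsFactors = FALSE)
)

# Reorder the rows of each data frame by its Name column
sorted <- lapply(alldata, function(x) x[order(x[["Name"]]), ])

sorted[[1]]
sorted[[2]]
```

Note that order() on a character column sorts as text, so a value such as "a10" lands before "a2".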

> based on a single column
> (say, "Name") and then either overwrite

The above code would overwrite the `alldata` object in your R session (not
the files on disk).

> the source files or write to a new
> directory all of the sorted data.

If you didn't want it overwritten then assign it to a different name.
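
For writing the sorted data to a new directory, one possible sketch follows. The file names, the "sorted" directory under tempdir(), and the toy data are all hypothetical stand-ins, so adjust them to your own setup:

```r
# Hypothetical setup: two small sorted data frames and their file names
filenames <- c("run1.csv", "run2.csv")
sorted <- list(
  data.frame(Name = c("a", "b"), Value = 1:2),
  data.frame(Name = c("c", "d"), Value = 3:4)
)

# Write into a separate directory so the source files are left alone
outdir <- file.path(tempdir(), "sorted")
dir.create(outdir, showWarnings = FALSE)

# Pair each data frame with its original file name and write it out
invisible(Map(
  function(df, f) write.csv(df, file.path(outdir, f), row.names = FALSE),
  sorted, filenames
))

list.files(outdir)
```

Because the files were read with skip=73, the 73 lines above the table are not part of the data frames and would not appear in files written this way.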

>  I half wonder if I should be creating
> individual objects for each file that I read in, but I haven't been able to
> figure this out either.

Much better to stick with lists.
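
One way to keep the convenience of individual objects while staying with a list is to name the elements after their files. A minimal sketch, with made-up file names and data:

```r
filenames <- c("run1.csv", "run2.csv")   # hypothetical file names
alldata <- list(
  data.frame(Name = c("b", "a"), Value = 1:2),
  data.frame(Name = c("d", "c"), Value = 3:4)
)

# Label each element with the file it came from
names(alldata) <- filenames

# A single data frame can then be pulled out by name ...
alldata[["run2.csv"]]

# ... while the whole collection is still handled in one call
sapply(alldata, nrow)
```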

>  Please note that I am trying to sort these files
> individually - would a loop be more efficient?

`lapply` is really a loop.
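
To make that concrete, here is a sketch (toy data again, not your files) of the same sort written both ways; the two results should be identical:

```r
alldata <- list(
  data.frame(Name = c("b", "a"), Value = 1:2, stringsAsFactors = FALSE),
  data.frame(Name = c("d", "c"), Value = 3:4, stringsAsFactors = FALSE)
)

# lapply version
via_lapply <- lapply(alldata, function(x) x[order(x[["Name"]]), ])

# Same work as an explicit for loop, filling a pre-allocated list
via_loop <- vector("list", length(alldata))
for (i in seq_along(alldata)) {
  x <- alldata[[i]]
  via_loop[[i]] <- x[order(x[["Name"]]), ]
}

identical(via_lapply, via_loop)
```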

>
> I appreciate the help,
> BustedAvi

David Winsemius, MD
West Hartford, CT


