[R] FW: merge multiple csv files

jim holtman jholtman at gmail.com
Fri Feb 8 22:51:29 CET 2008


Don't have your data, but something like this is close:

# something like the following.  read into a list for easier processing
allFiles <- Sys.glob("sample*.csv")
results <- lapply(allFiles, function(.file){
    # extract number from file name
    num <- as.integer(sub("^.*?([[:digit:]]+).*", "\\1", .file, perl=TRUE))
    .in <- read.table(.file, skip=5, sep=",")  # comma-separated; skip the 5 descriptive rows
    .in$obs <- num
    .in
})

# combine into a single dataframe
result <- do.call(rbind, results)

# now do your processing for average
z <- split(result, result[,1])  # split by first column
do.call(rbind, lapply(z, function(.avg){
    data.frame(x=.avg[1,1], y=mean(.avg[,2]))
}))
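
The split above groups on the first column only, so equal x values from
different files are pooled into one average. If the averaging should stay
within each file, aggregate() with a two-variable formula is one way to do
it. The following is a minimal, self-contained sketch and not tested against
the real data: the toy files, the "merge_demo" directory, the column names
x and y, and the assumption that the data rows have no header line are all
made up for illustration.

```r
# Hypothetical toy data: two small files in a temporary directory, each
# with 5 descriptive lines followed by headerless x,y rows.
demo_dir <- file.path(tempdir(), "merge_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c(rep("descriptive line", 5), "4,10", "4,12", "7,3"),
           file.path(demo_dir, "sample05.csv"))
writeLines(c(rep("descriptive line", 5), "4,20", "9,1"),
           file.path(demo_dir, "sample11.csv"))

files <- Sys.glob(file.path(demo_dir, "sample*.csv"))
pieces <- lapply(files, function(f) {
    # assumes no header row after the 5 skipped lines
    d <- read.csv(f, skip = 5, header = FALSE, col.names = c("x", "y"))
    # take the file number from the base name, e.g. "sample05.csv" -> 5
    d$location <- as.integer(sub("^sample([[:digit:]]+)\\.csv$", "\\1",
                                 basename(f)))
    d
})
merged <- do.call(rbind, pieces)

# average y over repeated x values, separately for each file
averaged <- aggregate(y ~ location + x, data = merged, FUN = mean)
averaged <- averaged[order(averaged$location, averaged$x), ]
print(averaged)
```

Using basename() keeps digits in the directory path from being picked up
as the file number.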



On 2/8/08, Gator Connection <gatorconnection at hotmail.com> wrote:
>
> Dear list:
>
> I have a folder that contains more than 50 csv files labeled
> sequentially, sample01.csv to sample50.csv. In each file the first 5 rows
> describe the data collected (useful, but not needed in the merge). The
> data start at row 6 and have 2 variables, x and y.
>
> To know which file an observation comes from, I'd like to add a new
> variable, location: for example, if the data are from file sample11.csv,
> then the location for that observation is 11.
>
> Another difficulty is that a file may contain repeated observations: for
> example, sample05.csv might contain (4, 10) and (4, 12). I'd like to
> average these into (4, 11).
>
> Any suggestions are welcome.
>
> Jack


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


