[R] Looping multiple output values to dataframe

Stropharia stevenworthington at hotmail.com
Thu Feb 12 21:38:34 CET 2009


Thanks a lot, Levi. Your code was much shorter and more elegant. With a few
minor alterations, I got it to work (see below).

Does anyone know of a way to get only the csv file names in a folder (rather
than the full file paths)? Or a way to extract the file names from the full
paths once they have been collected with Sys.glob()? Thanks.

Steve

# ----------------------------------------START R-CODE-----------------------------------
# get the names of the files to process; * matches every csv file in the folder
filenames <- Sys.glob("/Users/Desktop/Test/*.csv")

# preallocate a data frame with one row per .csv file to process
# (it is overwritten by the sapply() result below)
variables <- data.frame(1:length(filenames))

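docalc <- function(thisfile){
    # read one csv file; it must contain the columns x and y
    input <- read.csv(thisfile, header=TRUE, na.strings="NA")
    # index the columns directly instead of attach(), which would add
    # another copy of the data to the search path for every file
    result.A <- input$x[2]*input$y[1]
    result.B <- input$y[2]-input$x[1]
    result.C <- input$x[3]+input$y[1]
    results <- c(result.A, result.B, result.C) # concatenate result vectors
    # set names for result vectors
    names(results) <- c("ResultA", "ResultB", "ResultC")
    return(results)
}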

# apply docalc to every file, then transpose so that each file is one row
variables <- t(sapply(filenames, docalc))

# export to csv file
write.csv(variables, file="/Users/Desktop/Test.csv")
# ----------------------------------------END R-CODE-----------------------------------
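
On the file name question above, a minimal sketch using base R only:
basename() drops the directory part of each path returned by Sys.glob(), and
list.files() returns bare names unless full.names=TRUE is set. The temporary
folder and the file names a.csv and b.csv are made up for illustration.

```r
# Hypothetical demo folder holding two empty csv files (names are made up)
demo.dir <- file.path(tempdir(), "basename_demo")
dir.create(demo.dir, showWarnings = FALSE)
file.create(file.path(demo.dir, c("a.csv", "b.csv")))

# full paths, collected the same way as in the code above
paths <- Sys.glob(file.path(demo.dir, "*.csv"))

# option 1: strip the directory part from paths that already exist
short.names <- basename(paths)

# option 2: ask for bare file names in the first place
listed.names <- list.files(demo.dir, pattern = "\\.csv$")

# optionally drop the ".csv" extension as well
bare.names <- tools::file_path_sans_ext(short.names)
```

With the result matrix from the code above, something like
rownames(variables) <- basename(filenames) should then label each row by file
name rather than by full path.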






Levi Waldron-3 wrote:
> 
> Stropharia wrote:
>> # ----------------------------------------START R-CODE-----------------------------------
>> # get names of files to process # use * to get all
>> filenames <- Sys.glob("/Users/Desktop/Test/*.csv")
>>
>> # preallocate assuming multiple values from each file
>> # creates a dataframe with the same length of rows as the number of
>> # .csv files to process
>> variables <- data.frame(1:length(filenames))
>>
>> for (i in seq_along(filenames)){
>>     input <- read.csv(filenames[i], header=TRUE, na.strings="NA")  
>>     data.frame("input")
>> 	attach(input)
>> 	
>> result.A <- x[2]*y[1]
>> result.B <- y[2]-x[1]
>> result.C <- x[3]+y[1]
>>
>> results <- c(result.A, result.B, result.C) # concatenate result vectors
>>
>> variables[i] <- results
>> } 
>>
>> # turn result vectors into a matrix, then transpose it and output as a
>> # data frame
>> variables <- as.data.frame(t(as.matrix(variables)))
>>
>> # add column and row names
>> c.names <- c("ResultA", "ResultB", "ResultC") # set names for result vectors
>> colnames(variables) <- c.names
>> rownames(variables) <- filenames
>>
>> # export to csv file
>> write.csv(variables, file="/Users/Desktop/Test.csv") 
>> # ----------------------------------------END R-CODE-----------------------------------
>>   
> I think something like this should work better:
> 
> docalc <- function(thisfile){
>     input <- read.csv(thisfile, header=TRUE, na.strings="NA")
>     attach(input)
>     result.A <- x[2]*y[1]
>     result.B <- y[2]-x[1]
>     result.C <- x[3]+y[1]
>     results <- c(result.A, result.B, result.C) # concatenate result vectors
>     names(results) <- c("ResultA", "ResultB", "ResultC")
>     return(results)
> }
> 
> variables <- sapply(filenames,docalc)
> 
> -- 
> Levi Waldron
> post-doctoral fellow
> Jurisica Lab, Ontario Cancer Institute
> Division of Signaling Biology
> IBM Life Sciences Discovery Centre
> TMDT 9-304D
> 101 College Street
> Toronto, Ontario M5G 1L7
> (416)581-7453
> 

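A note on why t() was one of the alterations: sapply() binds each file's
named result vector as a column, so the quoted call gives one column per file,
and t() flips that to one row per file. Below is a small sketch under stated
assumptions: the temporary files and their x and y values are made up for
illustration and are not data from this thread.

```r
# Two made-up csv files, each with columns x and y (values are illustrative)
demo.dir <- file.path(tempdir(), "sapply_demo")
dir.create(demo.dir, showWarnings = FALSE)
write.csv(data.frame(x = c(1, 2, 3), y = c(4, 5, 6)),
          file.path(demo.dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = c(2, 4, 6), y = c(1, 3, 5)),
          file.path(demo.dir, "b.csv"), row.names = FALSE)

filenames <- Sys.glob(file.path(demo.dir, "*.csv"))

# same calculation as in the thread, indexing the columns directly
docalc <- function(thisfile) {
    input <- read.csv(thisfile, header = TRUE, na.strings = "NA")
    c(ResultA = input$x[2] * input$y[1],
      ResultB = input$y[2] - input$x[1],
      ResultC = input$x[3] + input$y[1])
}

by.column <- sapply(filenames, docalc)   # 3 rows (results) x 2 columns (files)
by.row <- t(by.column)                   # 2 rows (files) x 3 columns (results)
rownames(by.row) <- basename(filenames)  # label rows by file name only
```

The transposed form is the one that write.csv() would export with one line
per input file.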
-- 
View this message in context: http://www.nabble.com/Looping-multiple-output-values-to-dataframe-tp21981108p21984499.html
Sent from the R help mailing list archive at Nabble.com.



