[R] reading in multiple data sets in 2 loops

William Dunlap wdunlap at tibco.com
Sun Feb 7 00:27:24 CET 2016

    I tried the following but it does not work:

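    data <- lapply(
     paste(("C:/Research3/simulation1/second_gen/pheno_1000ind_4000m_add_h70_prog_",[1:2],"_",[2:3],".csv",sep=''),
    read.csv, header=TRUE, sep=',' )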
    names(data) <- paste("d", LETTERS[1:3], sep='')

I tried that and R complained about syntax errors - unexpected commas,
mismatched parentheses, illegal square brackets, etc.

Using lapply like this is a perfectly fine way to solve the problem, but you
need to get the details right.  I find it easier to break that statement
into parts and make sure each part is working.  E.g., after a minimal
cleanup of your code the file names would be computed as
    fileNames <- paste("C:/Research3/simulation1/second_gen/pheno_1000ind_4000m_add_h70_prog_",
                       1:2, "_", 2:3, ".csv", sep='')
    print(fileNames) # do they look right?  You said you wanted 1_2, 1_3, 2_3
                     # but that will give you only 2 of them (1_2 and 2_3)
or perhaps you want all the files in that directory with a given pattern
    fileNames <- dir("C:/Research3/simulation1/second_gen",
                     pattern="^pheno_1000ind_4000m_add_h70_prog_[[:digit:]]+_[[:digit:]]+\\.csv$",
                     full.names=TRUE, ignore.case=TRUE)
    head(fileNames) # keep at it until the fileNames list looks good
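
If what you want is every pair i_j with i < j (1_2, 1_3, 2_3, ...), paste() alone will not produce it: it recycles its vector arguments in parallel, so 1:2 and 2:3 pair up as 1_2 and 2_3 only. A minimal sketch using combn() follows. The upper limit n = 3 and the names `prefix` and `pairs` are assumptions for illustration; adjust them to the real range of indices.

```r
# Assumed for illustration: indices run from 1 to n and only pairs with i < j exist.
n <- 3
prefix <- "C:/Research3/simulation1/second_gen/pheno_1000ind_4000m_add_h70_prog_"

# combn() returns a 2-row matrix; each column is one pair (i, j) with i < j.
pairs <- combn(n, 2)

# Row 1 holds the first index, row 2 the second, so paste0() lines them up pairwise.
fileNames <- paste0(prefix, pairs[1, ], "_", pairs[2, ], ".csv")
print(basename(fileNames)) # check the names before trying to read anything
```

If some pairs do not exist on disk, `fileNames[file.exists(fileNames)]` would keep only the files that are actually there.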

Then read the data from the files with
    data <- lapply(fileNames, read.csv, header=TRUE, sep=",")
If there are errors reading the files in csv format you could try
    data <- lapply(fileNames, function(fileName) {
        cat(fileName, "\n")
        read.csv(fileName, header=TRUE, sep=",")
    })
so you can see the name of the first offending file.
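
With over 1000 files it may be more convenient to keep going past a bad file and collect the failures at the end. This is a sketch of that variation using tryCatch(), not something from the original question. The temporary directory, the short file names, and the helper `readOne` are all hypothetical and only there to make the example self-contained.

```r
# Hypothetical setup: two well-formed files and one name that does not exist.
tmpDir <- file.path(tempdir(), "second_gen_demo")
dir.create(tmpDir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, y = c(0.5, 1.5)),
          file.path(tmpDir, "prog_1_2.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:2, y = c(2.5, 3.5)),
          file.path(tmpDir, "prog_1_3.csv"), row.names = FALSE)
fileNames <- file.path(tmpDir, c("prog_1_2.csv", "prog_1_3.csv", "prog_2_3.csv"))

# readOne() is a hypothetical helper: it returns the data frame, or NULL on error.
readOne <- function(fileName) {
  tryCatch(
    read.csv(fileName, header = TRUE, sep = ","),
    error = function(e) {
      message("could not read ", fileName, ": ", conditionMessage(e))
      NULL
    }
  )
}

# suppressWarnings() hides the "cannot open file" warning that accompanies the error.
data <- suppressWarnings(lapply(fileNames, readOne))

# Files whose list entry is NULL are the ones that failed.
failed <- fileNames[vapply(data, is.null, logical(1))]
print(basename(failed))
```

The failed entries stay in the list as NULL, so `data` still lines up element by element with `fileNames`.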

When you attach names you probably want to get the names from the fileNames
variable, perhaps just the digits part
    names(data) <- gsub("^.*_([[:digit:]]+_[[:digit:]]+)\\.csv$", "d_\\1",
                        fileNames)
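
One detail worth checking when the indices have more than one digit: a greedy ".*" directly in front of the first group of digits will swallow all but the last digit of that index. Requiring the underscore that precedes the index avoids this. The two file names below are made up for illustration, and `newNames` is just a placeholder variable.

```r
# Hypothetical file names; the second has multi-digit indices.
fileNames <- c(
  "C:/Research3/simulation1/second_gen/pheno_1000ind_4000m_add_h70_prog_1_2.csv",
  "C:/Research3/simulation1/second_gen/pheno_1000ind_4000m_add_h70_prog_12_30.csv"
)

# The "_" before the first group of digits stops the greedy ".*"
# from consuming the leading digits of a multi-digit index.
newNames <- sub("^.*_([[:digit:]]+_[[:digit:]]+)\\.csv$", "d_\\1", fileNames)
print(newNames) # "d_1_2" "d_12_30"
```

sub() is enough here because each file name should match the pattern exactly once.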

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Feb 5, 2016 at 9:53 PM, Reka Howard <howardr at iastate.edu> wrote:

> Hello,
> I have over 1000 csv data sets I need to read into R, so I want to read
> them in using a loop. The data sets are named as
> pheno_1000ind_4000m_add_h70_prog_1_2.csv,
> pheno_1000ind_4000m_add_h70_prog_1_3.csv, ... so I need 2 loops (for the
> last 2 numbers in the names). What I would like to do is the following:
> setwd("C:/Research3/simulation1/second_gen")
> d1<-read.csv("pheno_1000ind_4000m_add_h70_prog_1_2.csv")
> d2<-read.csv("pheno_1000ind_4000m_add_h70_prog_1_3.csv")
> d3<-read.csv("pheno_1000ind_4000m_add_h70_prog_2_3.csv")
> .
> .
> .
> I am wondering how I can accomplish this with a loop. Any suggestion is
> appreciated!
> I tried the following but it does not work:
> data <- lapply(
>  paste(("C:/Research3/simulation1/second_gen/pheno_1000ind_4000m_add_h70_prog_",[1:2],"_",[2:3],".csv",sep=''),
> read.csv, header=TRUE, sep=',' )
> names(data) <- paste("d", LETTERS[1:3], sep='')
> Thanks!
> Reka
