[R] Average of data files in a directory

Peter Perez plpd00 at gmail.com
Thu Jul 2 18:44:01 CEST 2009


Hi Henrique,

This works perfectly! Thanks so much for your help.

Best regards,
Peter
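
For the full set of about 40 files, here is a minimal sketch that is not
taken from the thread itself. It assumes every file has the same Time
column in the same row order, as described in the original post. The toy
data frames stand in for the list returned by lapply(listfiles, read.csv),
and the names pressures and meanData are only illustrative. The idea is to
collect one Pressure column per file into a matrix and average across rows:

```r
# Toy stand-ins for: loadfiles <- lapply(listfiles, read.csv)
loadfiles <- list(
  data.frame(Time = c(0, 1, 2), Pressure = c(100, 200, 300)),
  data.frame(Time = c(0, 1, 2), Pressure = c(200, 300, 400))
)

# Assumption check: every file shares the Time values of the first file
same_time <- sapply(loadfiles, function(d) {
  isTRUE(all.equal(d$Time, loadfiles[[1]]$Time))
})
stopifnot(all(same_time))

# One Pressure column per file (rows = time steps, columns = files)
pressures <- sapply(loadfiles, function(d) d$Pressure)

# Element-wise average across files, paired with the common Time column
meanData <- data.frame(Time = loadfiles[[1]]$Time,
                       Pressure = rowMeans(pressures))
print(meanData)
```

Because rowMeans() works on the whole matrix at once, the same code should
apply unchanged whether the list holds 2 data frames or 40.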


Henrique Dallazuanna wrote:
> 
> Peter,
> 
> Try this:
> 
> apply(do.call(merge, c(loadfiles, by = "Time"))[2:3], 1, mean)
> 
> On Thu, Jul 2, 2009 at 10:39 AM, Peter Perez <plpd00 at gmail.com> wrote:
> 
>>
>> Hi,
>>
>> Thanks to Henrique and Prof. Spector for their kind replies. Indeed, the
>> mean function does work, but the result I want is an average data vector,
>> not a scalar, which is the output of mean(). That is, given these two
>> data frames (each data frame has 500 rows):
>>
>> Time  Pressure
>> 0.0     100
>> 1.0     200
>> 2.0     300
>>
>> And
>> Time  Pressure
>> 0.0     200
>> 1.0     300
>> 2.0     400
>>
>> By computing the average, c = (a+b)/2, I would get:
>> Time  Pressure
>> 0.0     150
>> 1.0     250
>> 2.0     350
>>
>> For a few vectors I could simply use the formula c = (a+b)/2, but I have
>> about 40 data files for each directory and was looking for an automatic
>> way to do this. Here is the code I have so far:
>>
>> listfiles <- list.files(pattern=".pre")  # list all data files in the directory
>> loadfiles <- lapply(listfiles,read.csv)  # read the csv files
>> meanDataFrame <- lapply(loadfiles, ?????)
>>
>> Thanks again for your help...
>>
>> Best regards,
>> Peter Perez
>>
>> EMS Energy Institute
>> The Pennsylvania State University
>> University Park, PA 16802
>>
>>
>>
>> The mean function works in data.frames:
>>
>> lapply(loadfiles, mean)
>> [[1]]
>>    Time Pressure
>>  1.0000 323.3333
>>
>> [[2]]
>>    Time Pressure
>>  1.0000 323.3333
>>
>>
>> On Wed, Jul 1, 2009 at 1:50 PM, plpd00 <plpd00 at gmail.com> wrote:
>>
>> >
>> > Dear all,
>> >
>> > I know it is as simple as c <- (a + b)/2 to compute the average
>> > (element-wise) of two data vectors. However, I can't work out how to
>> > compute the average when you have many data vectors in a directory.
>> > I have done this:
>> > ------------------------------------
>> > setwd("/.../data/")
>> > listfiles <- list.files(pattern=".pre")  # list all data files in the directory
>> > loadfiles <- lapply(listfiles,read.csv)  # read the csv files
>> > meanData <- lapply(loadfiles,mean)  # intended to compute the element-wise average
>> > ------------------------------------
>> > Which does not produce the right result. Let me explain myself
>> > better... The data files have the form (500 rows):
>> > Time,Pressure
>> > 0,300
>> > 1,320
>> > 2,350
>> > And I want to compute the average time (which is trivial since all time
>> > vectors are the same) and the average pressure (from all the data files
>> > collected). I know the function "mean" is not the one to use, but I
>> > cannot find any function that allows me to compute the average of two
>> > vectors. I hope this is clear enough for you guys to understand.
>> > Thanks in advance for your kind attention.
>> >
>> > Best regards,
>> > Peter
>> >
>> > --
>> > View this message in context:
>> > http://www.nabble.com/Average-of-data-files-in-a-directory-tp24293290p24293290.html
>> > Sent from the R help mailing list archive at Nabble.com.
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>>
>>
>> --
>> Henrique Dallazuanna
>> Curitiba-Paraná-Brasil
>> 25° 25' 40" S 49° 16' 22" O
>>
>>         [[alternative HTML version deleted]]
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Average-of-data-files-in-a-directory-tp24293290p24306925.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
> 
> 	[[alternative HTML version deleted]]
> 
> 
> 
> 
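
The merge() call quoted above joins exactly two data frames, since merge()
takes one x and one y. If the files may list their Time values in a
different order, so that rows have to be matched on Time rather than by
position, one possible sketch is to fold merge() over the whole list with
Reduce(). This is an illustration under assumptions, not the method given
in the thread: the toy data frames and the renamed columns (Pressure1,
Pressure2, ...) are made up for the example.

```r
# Toy stand-ins for the data frames read from the .pre files
loadfiles <- list(
  data.frame(Time = c(0, 1, 2), Pressure = c(100, 200, 300)),
  data.frame(Time = c(2, 1, 0), Pressure = c(400, 300, 200)),  # rows reversed
  data.frame(Time = c(0, 1, 2), Pressure = c(300, 400, 500))
)

# Give each Pressure column a unique name so that merge() never has to
# disambiguate clashing column names
for (i in seq_along(loadfiles)) {
  names(loadfiles[[i]])[names(loadfiles[[i]]) == "Pressure"] <-
    paste0("Pressure", i)
}

# Merge the data frames pairwise, matching rows on Time at every step
merged <- Reduce(function(x, y) merge(x, y, by = "Time"), loadfiles)

# merge() puts the "by" column first; average all remaining columns
meanData <- data.frame(Time = merged$Time,
                       Pressure = rowMeans(merged[, -1, drop = FALSE]))
print(meanData)
```

Note that merge() keeps only the Time values present in every file by
default, so a file with missing rows would shorten the result.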

-- 
View this message in context: http://www.nabble.com/Average-of-data-files-in-a-directory-tp24293290p24310128.html
Sent from the R help mailing list archive at Nabble.com.



