[R] Excluding fixed number of rows from calculation while summarizing using ddply() function.

siddu479 onlyfordigitalstuff at gmail.com
Sun Nov 4 15:40:54 CET 2012


Hello All,

   I have a .csv file (contents shown below) for which I need to calculate
a summary statistic (the mean, for example) using only the rows marked with
asterisks (bold in the original post). That is, in this example I need to
exclude the first N and last N rows (here N=1) of each "Unique" and "StepNo"
combination.

Unique,StepNo,Data1,Data2   # The actual file has 100 columns and millions of rows.
A,1,4,5       # Exclude: first row of this Unique/StepNo combination.
*A,1,5,6*
A,1,7,8       # Exclude: last row of this Unique/StepNo combination.
A,2,9,10      # Exclude: first row of this Unique/StepNo combination.
*A,2,45,25*
A,2,10,11     # Exclude: last row of this Unique/StepNo combination.
B,2,34,12     # Exclude: first row of this Unique/StepNo combination.
*B,2,5,6*
*B,2,7,8*
B,2,6,7       # Exclude: last row of this Unique/StepNo combination.
B,3,1,2       # Exclude: first row of this Unique/StepNo combination.
*B,3,3,4*
B,3,4,5       # Exclude: last row of this Unique/StepNo combination.

My existing code to calculate the mean over all rows is:
dat <- read.csv("aboveinput.csv", header=TRUE)   # Load the input file
library("plyr")
# Calculate the mean for each Unique and StepNo combination and summarize
# the results.
result <- ddply(dat, .(Unique,StepNo), numcolwise(mean))

I need to modify the above script to exclude N rows at the start and N rows
at the end of each Unique and StepNo combination. Something like the
following (just a skeleton, not working code):
result <- ddply(dat, .(Unique,StepNo), numcolwise(mean(<first n rows excluded,
last n rows excluded in each StepNo>)))
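
One possible way to fill in that skeleton, offered as a sketch rather than a
definitive answer: trim each group inside the function passed to ddply(), then
apply numcolwise(mean) to what is left. The helper name trim_rows and the
variables n and group_cols are made up for illustration, and the sample data is
typed inline instead of being read from aboveinput.csv.

```r
library(plyr)

# Sample data from the post, typed inline so the sketch is self-contained.
dat <- read.csv(text = "Unique,StepNo,Data1,Data2
A,1,4,5
A,1,5,6
A,1,7,8
A,2,9,10
A,2,45,25
A,2,10,11
B,2,34,12
B,2,5,6
B,2,7,8
B,2,6,7
B,3,1,2
B,3,3,4
B,3,4,5", header = TRUE)

# Hypothetical helper: drop the first n and last n rows of one group.
# Returns zero rows when the group has 2 * n rows or fewer.
trim_rows <- function(d, n = 1) {
  keep <- seq_len(nrow(d))
  keep <- keep[keep > n & keep <= nrow(d) - n]
  d[keep, , drop = FALSE]
}

n <- 1                                # rows to drop at each end of a group
group_cols <- c("Unique", "StepNo")   # grouping columns, left out of the means

result <- ddply(dat, .(Unique, StepNo), function(d) {
  trimmed <- trim_rows(d, n)
  # Average only the data columns; ddply() adds the group labels itself.
  numcolwise(mean)(trimmed[setdiff(names(trimmed), group_cols)])
})
print(result)
```

For the sample above, the rows that survive trimming are exactly the ones
marked with asterisks, so the B/2 group averages 5 and 7 for Data1 and 6 and 8
for Data2. Groups with 2 * n rows or fewer have nothing left after trimming and
would need separate handling.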

Please let me know if my question is not clear.

-----
Sidda
Business Analyst Lead
Applied Materials Inc.

--
View this message in context: http://r.789695.n4.nabble.com/Excluding-fixed-number-of-rows-from-calculation-while-summarizing-using-ddply-function-tp4648406.html
Sent from the R help mailing list archive at Nabble.com.


