[R] how to calculate average of each column

arun smartpink111 at yahoo.com
Thu Apr 11 00:13:18 CEST 2013


Hi,
Try this:
set.seed(52)
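dat1 <- as.data.frame(matrix(sample(c(1:40,NA),100*60,replace=TRUE), nrow=600))
# split the rows into blocks of 60, take NA-aware column means per block,
# then stack the per-block means into one data frame
res1 <- as.data.frame(do.call(rbind,lapply(split(dat1,((seq_len(nrow(dat1))-1)%/% 60)+1),function(x) colMeans(x,na.rm=TRUE))))
res1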
#         V1       V2       V3       V4       V5       V6       V7       V8
#1  21.20339 21.10000 20.64407 20.94828 20.22034 17.91379 21.38983 20.00000
#2  21.48214 19.50000 19.41379 22.13793 18.53448 20.40000 18.94915 19.77193
#3  22.11864 20.03448 19.55932 20.61667 19.41379 21.49153 20.08333 21.44828
#4  21.70000 17.93333 22.32759 18.66667 21.61017 20.94828 19.13793 20.32203
#5  20.91667 19.37288 20.16949 18.12500 22.05172 23.01724 22.17241 19.22034
#6  19.50847 21.05085 20.70690 20.16667 22.22807 21.36207 21.63793 17.13793
#7  19.53333 22.66667 20.98305 20.96667 23.06780 20.98305 21.83051 19.91525
#8  20.36207 23.55932 20.94915 20.47458 21.25424 19.94828 19.31481 20.01695
#9  18.62069 22.03509 19.50847 18.95000 21.19298 23.01695 19.63333 20.44828
#10 21.96491 20.10000 21.61667 20.65000 17.76667 20.25000 18.28070 19.68966
#         V9      V10
#1  21.01695 18.84746
#2  17.46552 18.93333
#3  20.69492 22.60000
#4  19.05263 20.30508
#5  21.73333 22.40678
#6  21.86207 21.33333
#7  20.81034 17.25000
#8  21.53333 21.45763
#9  22.18966 19.70000
#10 22.31579 20.58929
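
The error quoted below comes from passing the result of a call, colMeans(dat1,na.rm=TRUE), where lapply() expects a function. A minimal sketch of an equivalent form that passes the function itself and forwards na.rm=TRUE through lapply's "..." argument (the names grp, res2 and x are introduced here only for illustration):

```r
set.seed(52)
dat1 <- as.data.frame(matrix(sample(c(1:40, NA), 100*60, replace = TRUE), nrow = 600))

# Block index: 1 for rows 1-60, 2 for rows 61-120, and so on.
grp <- (seq_len(nrow(dat1)) - 1) %/% 60 + 1

# Second argument is the function; na.rm = TRUE is handed on to colMeans.
res2 <- lapply(split(dat1, grp), colMeans, na.rm = TRUE)
res2 <- as.data.frame(do.call(rbind, res2))
dim(res2)   # one row per block of 60, one column per variable

# Toy check: 10 values, 4 of them NA; the mean uses only the 6 observed values.
x <- c(2, NA, 4, NA, 6, NA, 8, NA, 10, 12)
mean(x, na.rm = TRUE) == sum(x, na.rm = TRUE) / sum(!is.na(x))   # TRUE
```

With na.rm=TRUE each column mean is divided by the number of non-missing values in that block, not by 60.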
A.K.



________________________________
 From: Ye Lin <yelin at lbl.gov>
To: arun <smartpink111 at yahoo.com> 
Sent: Wednesday, April 10, 2013 6:02 PM
Subject: Re: [R] how to calculate average of each column
 

Hey A.K,


I want to exclude the missing values in the table when computing the column means, so I wrote this:

res<-lapply(split(dat1,((seq_len(nrow(dat1))-1)%/% 60)+1),colMeans(dat1,na.rm=TRUE))


Then I get this error message:

'colMeans(dat1, na.rm = TRUE)' is not a function, character or symbol


How can I tell R to omit the NAs automatically when computing the mean? For example, if 4 of the 10 values in a column are NA, it should compute the mean as the sum of the remaining 6 values divided by 6.


Thanks!






On Wed, Apr 10, 2013 at 1:13 PM, arun <smartpink111 at yahoo.com> wrote:

>Hi,
>Try this:
>set.seed(52)
>dat1<- as.data.frame(matrix(sample(1:40,100*60,replace=TRUE), nrow=600))
>
>lapply(split(dat1,((seq_len(nrow(dat1))-1)%/% 60)+1),nrow)
>#$`1`
>#[1] 60
>
>#$`2`
>#[1] 60
>
>#$`3`
>#[1] 60
>
>res<-lapply(split(dat1,((seq_len(nrow(dat1))-1)%/% 60)+1),colMeans)
> res[1:2]
>#$`1`
>#      V1       V2       V3       V4       V5       V6       V7       V8
>#20.95000 20.53333 20.55000 21.13333 20.10000 18.33333 21.13333 20.50000
>#      V9      V10
>#20.86667 18.70000
>
>#$`2`
>#      V1       V2       V3       V4       V5       V6       V7       V8
>#22.16667 19.85000 19.73333 22.26667 18.80000 19.93333 18.85000 20.46667
>#      V9      V10
>#17.81667 18.51667
>
>A.K.
>
>
>
>
>----- Original Message -----
>From: Ye Lin <yelin at lbl.gov>
>To: r-help at r-project.org
>Cc:
>Sent: Wednesday, April 10, 2013 1:46 PM
>Subject: [R] how to calculate average of each column
>
>Hey All,
>
>I have a large dataset and I want to calculate the average of each column,
>then return the result as a new dataset.
>
>Here is my question: I don't know whether there is a function that lets me
>calculate the average of every 60 records across the whole dataset and
>return a new data frame. I'm not sure whether I have to split the dataset into
>blocks of 60 first and then take the means, or whether I can do it directly.
>
>thanks for your help!
>
>
>cici
>
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>


