[R] Average 2 Columns when possible, or return available value

Phil Spector spector at stat.berkeley.edu
Sat Jun 26 01:15:30 CEST 2010


Eric -
   What you're describing is taking the mean of each row while
ignoring missing values:

> apply(DF,1,mean,na.rm=TRUE)
 [1]  22.60    NaN    NaN    NaN    NaN    NaN    NaN    NaN 102.00  19.20
[11]  19.20    NaN    NaN    NaN  11.80   7.62    NaN    NaN    NaN    NaN
[21]    NaN  75.00    NaN  18.25    NaN    NaN   8.44  18.00    NaN  12.90
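
   Note that rows where both readings are missing come back as NaN
rather than the NA requested. A minimal sketch of one way to handle that,
assuming a two-column data frame like the one in the original post (only
four of its rows are reproduced here, and the name avg is just for
illustration): rowMeans() gives the same row-wise means as the apply()
call, and is.nan() picks out the entries to convert.

```r
# Small subset of the data from the original post, for illustration only
DF <- data.frame(A = c(22.60, NA, 18.30, 8.44),
                 B = c(NA,    NA, 18.2,  NA))

# Row-wise mean ignoring missing values; rows that are all NA give NaN
avg <- rowMeans(DF, na.rm = TRUE)

# Convert NaN to NA so that rows with no readings are reported as NA
avg[is.nan(avg)] <- NA
print(avg)
```

   Rows with a single reading keep that reading, and rows with two
readings are averaged.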

   If this isn't suitable for your larger problem, please describe that
problem in greater detail.
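
   For instance, if the larger problem amounts to several pairs of
duplicate-reading columns sitting side by side (this is only an assumption;
the original post does not say), the same idea might be applied to each
pair in turn. In the sketch below the data frame wide, its column names,
and the variables grp and res are all hypothetical.

```r
# Hypothetical wide data frame: columns 1-2 and 3-4 are duplicate readings
wide <- data.frame(A1 = c(22.6, NA,  18.3),
                   A2 = c(NA,   NA,  18.2),
                   B1 = c(5.0,  7.0, NA),
                   B2 = c(7.0,  NA,  NA))

# Group label for each column: 1 1 2 2 (assumes an even number of columns)
grp <- rep(seq_len(ncol(wide) / 2), each = 2)

# For each group of columns, take the row-wise mean ignoring NAs
res <- sapply(split(seq_along(grp), grp), function(cols) {
  m <- rowMeans(wide[, cols, drop = FALSE], na.rm = TRUE)
  m[is.nan(m)] <- NA   # both readings missing -> NA
  m
})
print(res)
```

   The result has one column per pair of readings and one row per row of
the input.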

 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu


On Fri, 25 Jun 2010, emorway wrote:

>
> Forum,
>
> Using the following data:
>
> DF<-read.table(textConnection("A B
> 22.60 NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> 102.00 NA
> 19.20 NA
> 19.20 NA
> NA NA
> NA NA
> NA NA
> 11.80 NA
> 7.62 NA
> NA NA
> NA NA
> NA NA
> NA NA
> NA NA
> 75.00 NA
> NA NA
> 18.30 18.2
> NA NA
> NA NA
> 8.44 NA
> 18.00 NA
> NA NA
> 12.90 NA"),header=TRUE)
> closeAllConnections()
>
> The second column is a duplicate reading of the first column. When both
> values are available, I would like to average columns 1 and 2. If there is
> only one reading, I would like to retain it, but I haven't found a good way
> to exclude NAs using the following code:
>
> t(as.matrix(aggregate(t(as.matrix(DF)),list(rep(1:1,each=2)),mean)[,-1]))
>
> Currently, row 24 is the only row with a returned value.  I'd like the
> result to return column "A" if it is the only available value, and average
> where possible.  Of course, if both columns are NA, NA is the only possible
> result.
>
> The result I'm after would look like this (row 24 is an avg):
>
> 22.60
>    NA
>    NA
>    NA
>    NA
>    NA
>    NA
>    NA
> 102.00
> 19.20
> 19.20
>    NA
>    NA
>    NA
> 11.80
>  7.62
>    NA
>    NA
>    NA
>    NA
>    NA
> 75.00
>    NA
> 18.25
>    NA
>    NA
>  8.44
> 18.00
>    NA
> 12.90
>
> This is a small example from a much larger data frame, so if you're
> wondering what the deal is with list(), that will come into play for the
> larger problem I'm trying to solve.
>
> Respectfully,
> Eric