[R] select data

R. Michael Weylandt michael.weylandt at gmail.com
Mon May 14 18:39:44 CEST 2012


This was actually discussed about a week and a half ago with many good
solutions offered, but I think the most idiomatic would be something
like this:

apply(dataset, 1, function(x) mean(x[x>0]))
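
As a quick check, here is that call run on the two-row example from the
question quoted below. Building the data as a numeric matrix named
`dataset` is an assumption for illustration; a data frame with only
numeric columns should behave the same way, since apply() coerces its
input to a matrix.

```r
# Example data from the original question (assumed to be a numeric matrix)
dataset <- matrix(c( 3, -2, 4,
                    -1,  4, 1),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(NULL, c("V1", "V2", "V3")))

# Row-wise mean over the strictly positive entries only
row_means <- apply(dataset, 1, function(x) mean(x[x > 0]))
print(row_means)  # 3.5 2.5
```

Note that x > 0 also drops zeros (use x >= 0 if only negative values
should be excluded), and a row with no positive values gives NaN.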

The reasons I like it:

i) It uses the apply function to perform the same operation row-wise
(that's what the "1" does) on every row of your data set -- since
the function is side-effect free (as code in a good functional language
should be), it is easy to parallelize if you move to "big data"
ii) It uses an anonymous function (the "function ... " bit); functions
are first-class objects in R and can be passed as arguments to other
functions (here apply())
iii) It uses logical subscripting to pick out the values greater than
zero -- I think the subscripting behavior is the very best bit of R

Best,

Michael

On Mon, May 14, 2012 at 12:32 PM, Andrea Sica <aerdna.sica at gmail.com> wrote:
> Dear all,
>
> I am sure it won't be difficult for you!!
> I need to calculate the average among variables for the single units of my
> dataset.
> But, while doing it, I need to leave some values out.
> To explain better, suppose there are two units and three variables:
>
>      V1    V2     V3
> [1]   3     -2      4
> [2]  -1      4      1
>
> and you want to calculate the average by row, without considering those
> negative values:
>
> => mean(1row) = (3+4)/2
> => mean(2row) = (4+1)/2
>
> Could anyone please give me the commands to do that in R?
>
> Thank you so much
>


