# [R] summarize a vector

Bert Gunter gunter.berton at gene.com
Sat Aug 11 01:06:17 CEST 2012

```
Certainly ... but this is of course limited to the few C coded
functions available. Back to apply-type stuff for, say, median as a
summary statistic.

-- Bert

On Fri, Aug 10, 2012 at 3:58 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Aug 10, 2012, at 3:42 PM, Michael Weylandt wrote:
>
>> I wouldn't be surprised if one could get an *apply-free solution by
>> using diff(), cumsum() and selective indexing as well.
>
>
> What about colSums on a matrix extended with the right number of zeros?
>
>> colSums(matrix(c(v, rep(0, (-length(v)) %% 3)), nrow=3))
> [1]  6 15 24 10
>
> (My experience is that tapply is generally fairly fast anyway, much faster
> than apply.data.frame. So I do not lump all *apply solutions in the same
> efficiency category.)
>
> --
> David.
>>
>>
>> Cheers,
>> Michael
>>
>> On Aug 10, 2012, at 5:07 PM, David Winsemius <dwinsemius at comcast.net>
>> wrote:
>>
>>>
>>> On Aug 10, 2012, at 12:57 PM, Bert Gunter wrote:
>>>
>>>> ... or perhaps even simpler:
>>>>
>>>>> sz <- function(x,k)tapply(x,(seq_along(x)-1)%/%k, sum)
>>>>> sz(1:10,3)
>>>>
>>>> 0  1  2  3
>>>> 6 15 24 10
>>>>
>>>> Note that this works for k>n, where the previous solution does not.
>>>>>
>>>>> sz(1:10,15)
>>>>
>>>> 0
>>>> 55
>>>
>>>
>>> I agree that it is more elegant, but I do not get an error or an
>>> unexpected result with my method.
>>>
>>>> N=10
>>>> k=15
>>>> w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
>>>> w
>>>
>>> 1
>>> 55
>>>
>>> A different label but the same result. I'm protected from the typical 1:0
>>> problem that seq_along solves by including +1 in the second argument to
>>> ":"/seq(). Unless, of course, you set N to a negative number, but that
>>> wouldn't make much sense would it, and you get an error from rep() anyway.
>>>
>>> Best;
>>> David.
>>>
>>>>
>>>> -- Bert
>>>>
>>>> On Fri, Aug 10, 2012 at 12:37 PM, David Winsemius
>>>> <dwinsemius at comcast.net> wrote:
>>>>>
>>>>>
>>>>> On Aug 10, 2012, at 12:20 PM, Sam Steingold wrote:
>>>>>
>>>>>> I have a long numeric vector v (length N) and I want to create a shorter
>>>>>> vector of length N/k consisting of sums of k-subsequences of v:
>>>>>>
>>>>>> v <- c(1,2,3,4,5,6,7,8,9,10)
>>>>>>
>>>>>> N=10, k=3
>>>>>> ===> [6,15,24,10]
>>>>>>
>>>>>> I can, of course, iterate:
>>>>>>
>>>>>>> w <- vector(mode="numeric",length=ceiling(N/k))
>>>>>>> for (i in 1:length(w)) w[i] <- sum(v(i*k:(i+1)*k))
>>>>>>
>>>>>>
>>>>>>
>>>>>> (modulo boundary conditions)
>>>>>> but I wonder if there is a better way.
>>>>>
>>>>>
>>>>>
>>>>> Well, using v with parentheses instead of square-brackets might not be the
>>>>> right way, since v is not a function.
>>>>>
>>>>> Consider this alternate (no need to pre-allocate 'w'):
>>>>>
>>>>>> w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
>>>>>> w
>>>>>
>>>>> 1  2  3  4
>>>>> 6 15 24 10
>>>>>
>>>>> --
>>>>>
>>>>> David Winsemius, MD
>>>>> Alameda, CA, USA
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Bert Gunter
>>>> Genentech Nonclinical Biostatistics
>>>>
>>>> Internal Contact Info:
>>>> Phone: 467-7374
>>>> Website:
>>>>
