[R] summarize a vector

Michael Weylandt michael.weylandt at gmail.com
Sat Aug 11 00:42:36 CEST 2012


I wouldn't be surprised if one could also get an *apply-free solution by using diff(), cumsum() and selective indexing.
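
A minimal sketch of that idea, assuming the last chunk may be shorter than k (as in the original example). The name sum_chunks is made up for illustration and is not from the thread:

```r
# Hypothetical helper: sums of consecutive length-k chunks of x, without *apply.
sum_chunks <- function(x, k) {
  n <- length(x)
  # Index of the last element of each chunk; the final chunk is cut off at n.
  ends <- pmin(seq_len(ceiling(n / k)) * k, n)
  # Running totals at the chunk ends; successive differences are the chunk sums.
  diff(c(0, cumsum(x)[ends]))
}

sum_chunks(1:10, 3)   # 6 15 24 10
sum_chunks(1:10, 15)  # 55
```

One caveat: differencing running totals can lose a little precision on long double vectors, whereas the tapply() versions below sum each chunk on its own.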

Cheers,
Michael

On Aug 10, 2012, at 5:07 PM, David Winsemius <dwinsemius at comcast.net> wrote:

> 
> On Aug 10, 2012, at 12:57 PM, Bert Gunter wrote:
> 
>> ... or perhaps even simpler:
>> 
>>> sz <- function(x,k)tapply(x,(seq_along(x)-1)%/%k, sum)
>>> sz(1:10,3)
>> 0  1  2  3
>> 6 15 24 10
>> 
>> Note that this works for k>n, where the previous solution does not.
>>> sz(1:10,15)
>> 0
>> 55
> 
> I agree that it is more elegant, but I do not get an error or an unexpected result with my method.
> 
> > N=10
> > k=15
> > w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
> > w
> 1
> 55
> 
> A different label but the same result. I'm protected from the typical 1:0 problem (the one seq_along() solves) because I include +1 in the second argument to ":"/seq(). Unless, of course, you set N to a negative number, but that wouldn't make much sense, would it? You would get an error from rep() anyway.
> 
> Best;
> David.
> 
>> 
>> -- Bert
>> 
>> On Fri, Aug 10, 2012 at 12:37 PM, David Winsemius
>> <dwinsemius at comcast.net> wrote:
>>> 
>>> On Aug 10, 2012, at 12:20 PM, Sam Steingold wrote:
>>> 
>>>> I have a long numeric vector v (length N) and I want to create a shorter
>>>> vector of length ceiling(N/k) consisting of sums of k-subsequences of v:
>>>> 
>>>> v <- c(1,2,3,4,5,6,7,8,9,10)
>>>> 
>>>> N=10, k=3
>>>> ===> [6,15,24,10]
>>>> 
>>>> I can, of course, iterate:
>>>> 
>>>>> w <- vector(mode="numeric",length=ceiling(N/k))
>>>>> for (i in 1:length(w)) w[i] <- sum(v(i*k:(i+1)*k))
>>>> 
>>>> 
>>>> (modulo boundary conditions)
>>>> but I wonder if there is a better way.
>>> 
>>> 
>>> Well, using v with parentheses instead of square-brackets might not be the
>>> right way, since v is not a function.
>>> 
>>> Consider this alternate (no need to pre-allocate 'w'):
>>> 
>>>> w <- tapply( v ,rep(1:(N/k +1), each=k, len=N ) , sum)
>>>> w
>>> 1  2  3  4
>>> 6 15 24 10
>>> 
>>> --
>>> 
>>> David Winsemius, MD
>>> Alameda, CA, USA
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> 
>> 
>> -- 
>> 
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>> 
>> Internal Contact Info:
>> Phone: 467-7374
>> Website:
>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 
> David Winsemius, MD
> Alameda, CA, USA
> 


