[R] ave(x, y, FUN=length) produces character output when x is character

Nordlund, Dan (DSHS/RDA) NordlDJ at dshs.wa.gov
Wed Dec 24 21:06:15 CET 2014


> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Mike
> Miller
> Sent: Wednesday, December 24, 2014 11:31 AM
> To: R-Help List
> Subject: [R] ave(x, y, FUN=length) produces character output when x is
> character
> 
> R 3.0.1 on Linux 64...
> 
> I was working with someone else's code.  They were using ave() in a way
> that I guess is nonstandard:  Isn't FUN always supposed to be a variant
> of mean()?  The idea was to count for every element of a factor vector
> how many times the level of that element occurs in the factor vector.
> 
> 
> gl() makes a factor:
> 
> > gl(2,2,5)
> [1] 1 1 2 2 1
> Levels: 1 2
> 
> 
> ave() applies FUN to produce the desired count, and it works:
> 
> > ave( 1:5, gl(2,2,5), FUN=length )
> [1] 3 3 2 2 3
> 
> 
> The elements of the first vector are irrelevant because they are only
> counted, so we should get the same result if it were a character
> vector, but we don't:
> 
> > ave( as.character(1:5), gl(2,2,5), FUN=length )
> [1] "3" "3" "2" "2" "3"
> 
> The output has character type, but it is supposed to be a collection of
> vector lengths.
> 
> 
> Two questions:
> 
> (1) Is that a bug in ave()?  It certainly is unexpected.
> 
> (2) What is the best way to do this sort of thing?
> 
> The truth is that we start with a character vector and we want to
> create an integer vector that tells us for every element of the
> character vector how many times that string occurs.  Here are two
> vectors of length 6 that should give the same result:
> 
> > intvec <- c(4,5,6,5,6,6)
> > charvec <- c("A","B","C","B","C","C")
> 
> The code was used like this with integer vectors and it seemed to work:
> 
> > ave( intvec, intvec, FUN=length )
> [1] 1 2 3 2 3 3
> 
> When a character vector came along, it would fail by producing a
> character vector as output:
> 
> > ave( charvec, charvec, FUN=length )
> [1] "1" "2" "3" "2" "3" "3"
> 
> This seems more appropriate, and it might always work, but is it OK?:
> 
> > ave( rep(1, length(charvec)), as.factor(charvec), FUN=sum )
> [1] 1 2 3 2 3 3
> 
> I suspect that ave() isn't the best choice, but what is the best way
> to do this?
> 
> 
> Thanks in advance.
> 
> Mike

For your character vector example, this will get you the counts.

table(charvec)[charvec]
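
One caveat, sketched on the assumption that a plain integer vector is what you want: indexing the table by charvec returns a table object whose names repeat the strings. Wrapping it in as.vector() should drop the table class and the names. The variable names counts_tab and counts_int are only for illustration.

```r
charvec <- c("A", "B", "C", "B", "C", "C")

# Look up each element's count by name; the result is still a named table
counts_tab <- table(charvec)[charvec]

# Strip the table class and names to leave a bare integer vector
counts_int <- as.vector(counts_tab)
counts_int
```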

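On question (1): as far as I can tell this is not a bug. ave() appears to write the per-group results of FUN back into a copy of x, so the output takes the type of x. Character in gives character out. If you want to stay with ave(), a possible workaround is to pass an integer vector as x, for example seq_along(charvec). This is a minimal sketch and the name n is just for illustration.

```r
charvec <- c("A", "B", "C", "B", "C", "C")

# x is character here, so the group lengths are coerced to character
as_char <- ave(charvec, charvec, FUN = length)

# x is integer here, so the group lengths stay integer
n <- ave(seq_along(charvec), charvec, FUN = length)
n
```

The values in x are never used by length(), so any integer vector of the same length as charvec would do.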

Hope this is helpful,

Dan

Daniel J. Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services



