[R] Comparing elements for equality

Doran, Harold HDoran at air.org
Tue Jan 13 21:04:56 CET 2009


Nice. I was thinking maybe length(table(x)) == 1, but this works great.
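
For the record, here is a rough sketch of how the two checks compare. The helper names all_same_unique and all_same_table are made up for this illustration. As far as I can tell the two agree on plain vectors but can differ when the vector contains NA (table() drops NA by default) or is a factor with unused levels (table() counts every level, including those with zero observations).

```r
# Two candidate "are all elements the same?" checks (hypothetical helper names)
all_same_unique <- function(x) length(unique(x)) == 1
all_same_table  <- function(x) length(table(x)) == 1

# Plain vector: both checks agree
x <- c(10, 10)
all_same_unique(x)
all_same_table(x)

# With an NA: unique() keeps NA as a distinct value, table() drops it by default
y <- c(10, 10, NA)
all_same_unique(y)
all_same_table(y)

# Factor with an unused level: table() still reports a (zero) count for "foobar"
f <- factor(c("foo", "foo"), levels = c("foo", "foobar"))
all_same_unique(f)
all_same_table(f)
```

The factor case matters when the check is applied per id with tapply, since each piece of a factor column keeps the full set of levels. That is one reason the unique() version looks like the safer choice here.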

> -----Original Message-----
> From: Carlos J. Gil Bellosta [mailto:cgb at datanalytics.com] 
> Sent: Tuesday, January 13, 2009 2:55 PM
> To: Doran, Harold
> Cc: r-help at r-project.org
> Subject: Re: [R] Comparing elements for equality
> 
> Hello,
> 
> You could build your output dataframe along the following lines:
> 
> foo <- function(x) length( unique(x) ) == 1
> 
> results <- data.frame(
> 	freq = tapply( dat$id,   dat$id, length ),
> 	var1 = tapply( dat$var1, dat$id, foo ),
> 	var2 = tapply( dat$var2, dat$id, foo )
> )
> 
> Best regards,
> 
> Carlos J. Gil Bellosta
> http://www.datanalytics.com
> 
> 
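
A self-contained version of the suggestion above, run against the example data from the original question, might look like the sketch below. The explicit id column, the as.vector() calls, and the name all_same are additions for illustration and are not part of the code as posted.

```r
# Example data from the original question
dat <- data.frame(id   = c(1, 1, 2, 2, 2),
                  var1 = c(10, 10, 20, 20, 25),
                  var2 = c("foo", "foo", "foo", "foobar", "foo"))

# TRUE when every element of x is the same value
all_same <- function(x) length(unique(x)) == 1

# Number of rows per id; the names of the result are the id values
freq <- tapply(dat$id, dat$id, length)

# One row per id: row count plus an "all equal" flag for each variable.
# as.vector() strips the array attributes that tapply() attaches.
results <- data.frame(
  id   = as.numeric(names(freq)),
  freq = as.vector(freq),
  var1 = as.vector(tapply(dat$var1, dat$id, all_same)),
  var2 = as.vector(tapply(dat$var2, dat$id, all_same))
)

print(results)
```

With more columns to check, the same all_same function could be applied to each one in turn rather than writing a tapply() line per variable.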
> On Tue, 2009-01-13 at 14:17 -0500, Doran, Harold wrote:
> > Suppose I have a dataframe as follows:
> > 
> > dat <- data.frame(id = c(1,1,2,2,2), var1 = c(10,10,20,20,25),
> >                   var2 = c('foo', 'foo', 'foo', 'foobar', 'foo'))
> > 
> > Now, if I were to subset by id, such as:
> > 
> > > subset(dat, id==1)
> >   id var1 var2
> > 1  1   10  foo
> > 2  1   10  foo
> > 
> > I can see that the elements in var1 are exactly the same and the 
> > elements in var2 are exactly the same. However,
> > 
> > > subset(dat, id==2)
> >   id var1   var2
> > 3  2   20    foo
> > 4  2   20 foobar
> > 5  2   25    foo
> > 
> > shows that the elements are not the same for either variable in this
> > instance. So, what I am looking to create is a data frame that would
> > be like this:
> > 
> > id	freq	var1	var2
> > 1	2	TRUE	TRUE	
> > 2	3	FALSE	FALSE
> > 
> > Where freq is the number of times the ID is repeated in the dataframe.
> > A TRUE appears in the cell if all elements in the column are the same
> > for the ID and FALSE otherwise. It is insignificant which values
> > differ for my problem.
> > 
> > The way I am thinking about tackling this is to loop through the ID
> > variable and compare the values in the various columns of the dataframe.
> > The problem I am encountering is that I don't think all.equal or
> > identical are the right functions in this case.
> > 
> > So, say I was wanting to compare the elements of var1 for id ==1. I 
> > would have
> > 
> > x <- c(10,10)
> > 
> > Of course, the following works
> > 
> > > all.equal(x[1], x[2])
> > [1] TRUE
> > 
> > As would a similar call to identical. However, what if I only have a
> > vector of values (or if the column consists of names) that I want to
> > assess for equality when I am trying to automate a process over
> > thousands of cases? As in the example above, the vector may contain
> > only two values or it may contain many more. The number of values in
> > the vector differs by id.
> > 
> > Any thoughts?
> > 
> > Harold
> > 
> 
> 



