[R] Package or function for selecting matched pairs?

David Winsemius dwinsemius at comcast.net
Wed Feb 17 16:09:26 CET 2010


On Feb 17, 2010, at 8:55 AM, David Winsemius wrote:

>
> On Feb 17, 2010, at 8:26 AM, Ista Zahn wrote:
>
>> Hi all,
>> I am designing a study in which I am selecting a subset of college
>> courses to be randomly assigned to one of two conditions. I would like
>> to create matched pairs of courses, and then randomly assign them to
>> condition within each pair. I would like to identify, for each course,
>> the one that best matches it, and quantify how well it matches. Here
>> is a much simpler data set for purposes of illustration:
>>
>> ED <- data.frame(course = letters[1:5], level=factor(c(100, 100, 200,
>> 300, 200)), size = c(44, 12, 23, 124, 30), rating = c(4,5,5,3,5))
>>
>>   course level size rating
>> 1      a   100   44      4
>> 2      b   100   12      5
>> 3      c   200   23      5
>> 4      d   300  124      3
>> 5      e   200   30      5
>
> > ED$grps <- paste(ED$level,
> +   cut(ED$size, breaks=c(0,15,35,60,150,300)),
> +   cut(ED$rating, breaks=c(0,2,4,6)), sep=".")
>
> > ED[order(ED$grps), ]
>  course level size rating               grps   # forgot to relabel
> 2      b   100   12      5   100.(0,15].(4,6]
> 1      a   100   44      4  100.(35,60].(2,4]
> 3      c   200   23      5  200.(15,35].(4,6]
> 5      e   200   30      5  200.(15,35].(4,6]
> 4      d   300  124      3 300.(60,150].(2,4]
>

I got an offlist request to explain this. I see that I failed to
edit the output properly so that the 5th column was named "grps". I had
originally run the code to create ED$cuts, then thought that the variable
might be better called "grps", so I edited the function call but failed to
edit the output. If the question was how it works, then:

A) Create grouping categories for each variable. I applied my domain
knowledge to decide that levels ought to be left "ungrouped" but that size
and rating would be grouped into categories determined by the
breaks argument to the cut function calls.
B) Paste them together with a "." (period) separator to create "grps"
labels; courses sharing a label would be considered similar. This I
believe is "blocking" in anova terminology and perhaps "strata" in
regression terminology. The category labels become just parts of the
bigger character string.
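
As a sketch, steps A and B can be written out on the toy data as below.
The names size_bin, rating_bin, matched and assignment are made up for
illustration, and the final random-assignment step is an assumption about
how the strata might then be used; it was not part of the code posted
earlier.

```r
# Toy data from the original question
ED <- data.frame(course = letters[1:5],
                 level  = factor(c(100, 100, 200, 300, 200)),
                 size   = c(44, 12, 23, 124, 30),
                 rating = c(4, 5, 5, 3, 5))

# A) bin size and rating with cut(); level is left ungrouped
size_bin   <- cut(ED$size,   breaks = c(0, 15, 35, 60, 150, 300))
rating_bin <- cut(ED$rating, breaks = c(0, 2, 4, 6))

# B) paste the category labels into one stratum label per course
ED$grps <- paste(ED$level, size_bin, rating_bin, sep = ".")

# Courses sharing a label are candidate matches (hypothetical object name)
matched <- split(as.character(ED$course), ED$grps)
matched <- matched[lengths(matched) >= 2]

# Assumed extra step, not in the original reply: within each stratum that
# holds exactly two courses, randomly assign one to each condition
set.seed(1)
assignment <- lapply(matched[lengths(matched) == 2],
                     function(p) setNames(sample(c("treatment", "control")), p))
```

With these breaks only courses c and e share a label, so matched holds a
single pair. Strata with one course are left unmatched, and widening the
breaks is the obvious knob to turn if too few pairs result.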

-- 
David.


>>
>> Basically I want a function that tells me that courses c and e "match"
>> so that I can treat them as a pair and randomly assign them to
>> condition. I've looked at the matching and MatchIt packages, but they
>> seem to need to know in advance which course is in the treatment
>> condition and which is in the control condition. I'll be grateful for
>> any suggestions.
>>
>> Best,
>> -- 
>> Ista Zahn
>> Graduate student
>> University of Rochester
>> Department of Clinical and Social Psychology
>> http://yourpsyche.org
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


