[R] elegant way to check if 2 values are in 3 columns?

Joanne Demmler J.Demmler at swansea.ac.uk
Fri Aug 26 16:16:11 CEST 2011


Dear all,

I'm trying to rerun some data linkage exercises in R (they were designed 
to be done in SPSS, SAS or STATA).
The exercise in question is to set the column "treat" to 1 if 
"yearsep" is smaller than 1988 and any of the columns "proc1" to "proc3" 
contains the value 56.36 or 59.81.

My pathetic solution to do this in R currently looks like this:

vaslinks4$treat <- 0

# element-wise & and | test every row; && and || only compare single values
vaslinks4$treat[vaslinks4$yearsep < 1988 &
             (vaslinks4$proc1 %in% c(56.36,59.81)
             | vaslinks4$proc2 %in% c(56.36,59.81)
             | vaslinks4$proc3 %in% c(56.36,59.81))] <- 1

But I'm sure there is a more elegant solution for this, in which I would 
not have to call all three columns separately.

Anyone?
Yours Joanne



Solution in SPSS:

COMPUTE treat=0.
FORMATS treat (F1).
DO REPEAT proc=proc1 to proc3.
DO IF (yearsep LT 1988).
IF (proc EQ 56.36 OR proc EQ 59.81) treat = 1.
END IF.
END REPEAT.

Solution in SAS:

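/* assumed array definition; it is not shown in the original exercise */
array procs{3} proc1-proc3;
do i = 1 to 3 until (treat > 0);
if yearsep < 1988 then do;
if procs{i} in (56.36, 59.81) then treat = 1;
else treat = 0;
end;
end;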

Solution in STATA:

generate treat=0
foreach x in proc1 proc2 proc3 {
recode treat(0=1) if ((`x'==56.36 | `x'==59.81) & yearsep<1988) ///
| ((`x'>=63.70 &`x'<=63.79) & yearsep>=1988)
}
tab treat
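
A possible R equivalent that avoids naming each column separately:

The following is only a sketch of one way to do this, not a definitive answer. It applies %in% to every "proc" column with lapply() and combines the results row by row with Reduce(). The toy data frame and the helper names codes, proc_cols and any_hit are made up for illustration; only vaslinks4, yearsep, proc1-proc3 and treat come from the exercise above.

```r
# Toy data standing in for vaslinks4 (values are invented for illustration)
vaslinks4 <- data.frame(
  yearsep = c(1985, 1987, 1990, 1986),
  proc1   = c(56.36, 10.00, 59.81, 11.11),
  proc2   = c(20.00, 20.00, 20.00, 22.22),
  proc3   = c(30.00, 59.81, 30.00, 33.33)
)

codes     <- c(56.36, 59.81)               # procedure codes of interest
proc_cols <- c("proc1", "proc2", "proc3")  # columns to search

# One logical vector per proc column, then OR them together element-wise:
# any_hit is TRUE for rows where at least one proc column holds a target code
any_hit <- Reduce(`|`, lapply(vaslinks4[proc_cols], `%in%`, codes))

# Element-wise & so that the condition is evaluated for every row
vaslinks4$treat <- as.integer(vaslinks4$yearsep < 1988 & any_hit)

print(vaslinks4)
```

Adding further columns then only means extending proc_cols. Note that %in% compares doubles exactly, so this assumes the codes in the data are stored as the same values as the literals in codes.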


-- 
Joanne Demmler Ph.D.
Research Assistant
College of Medicine
Swansea University
Singleton Park
Swansea SA2 8PP
UK
tel: 		+44 (0)1792 295674
fax: 		+44 (0)1792 513430
email: 		j.demmler at swansea.ac.uk
DECIPHer:	www.decipher.uk.net


