[R] subsetting like in SAS

Petr Pikal petr.pikal at precheza.cz
Thu Jan 13 14:23:38 CET 2005


Hi Denis

Maybe unique() can pick out the distinct entries from your data set,
without any need to sort first.
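
A minimal sketch of the idea, on a toy data frame standing in for your
alldata. The columns "depth" and "weight" and all the values are made up
for illustration; only the five key names come from your message.
duplicated() on the key columns is shown as well, since it keeps the first
row of each key combination, which is probably the closer match to
"if first.set".

```r
# Toy stand-in for 'alldata'; 'depth' and 'weight' are hypothetical columns.
alldata <- data.frame(
  origin   = c("A", "A", "A", "B", "B"),
  ship_cat = c(1, 1, 1, 2, 2),
  ship_nb  = c(10, 10, 10, 20, 20),
  trip     = c(1, 1, 2, 1, 1),
  set      = c(1, 1, 1, 3, 3),
  depth    = c(50, 50, 75, 120, 120),    # hypothetical site characteristic
  weight   = c(2.1, 3.4, 1.8, 5.0, 4.2)  # hypothetical sample characteristic
)

keys      <- c("origin", "ship_cat", "ship_nb", "trip", "set")
site_vars <- c(keys, "depth")            # variables detailing sites (assumed)

# Option 1: keep only the site columns, then drop repeated rows.
sites1 <- unique(alldata[, site_vars])

# Option 2: keep the first row of each key combination, then select columns.
sites2 <- alldata[!duplicated(alldata[, keys]), site_vars]
```

Neither version needs the data to be sorted. The two can differ if a site
variable is not constant within a site: unique() then keeps every distinct
row, while the duplicated() version keeps only the first row per site.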

Cheers
Petr

On 13 Jan 2005 at 11:52, Denis Chabot wrote:

> Hi,
> 
> Being in the process of translating some of my SAS programs to R, I
> encountered one difficulty. I have a solution, but it is not elegant
> (and not pleasant to implement).
> 
> I have a large dataset with many variables needed to identify the
> origin of a sample, many to describe sample characteristics, others to
> describe site characteristics.
> 
> I want only a (shorter) list of sites and their characteristics.
> 
> If "origin", "ship_cat", "ship_nb", "trip" and "set" are needed to
> identify a site, in SAS you'd sort on those variables, then read the
> data with:
> 
> data sites;
>  set alldata;
>  by origin ship_cat ship_nb trip set;
>  if first.set;
>  keep list-of-variables-detailing-sites;
> run;
> 
> In R I did this with the Lag function of Hmisc, and the original data
> set also needs to be sorted first:
> 
> oL <- Lag(origin)
> scL <- Lag(ship_cat)
> snL <- Lag(ship_nb)
> tL <- Lag(trip)
> sL <- Lag(set)
> same <- origin==oL & ship_cat==scL & ship_nb==snL & trip==tL & set==sL
> sites <- subset(alldata, !same,
>                 select=c(list-of-variables-detailing-sites))
> 
> Could I do better than this?
> 
> Thanks in advance,
> 
> Denis Chabot

Petr Pikal
petr.pikal at precheza.cz



