[R] How to reference or sort rownames in a data frame
ggrothendieck at gmail.com
Mon May 28 04:29:09 CEST 2007
On 5/27/07, Robert A. LaBudde <ral at lcfltd.com> wrote:
> As I was working through elementary examples, I was using dataset
> "plasma" of package "HSAUR".
> In performing a logistic regression of the data, and making the
> diagnostic plots (R-2.5.0)
> plasma_1<- glm(ESR ~ fibrinogen * globulin, data=plasma, family=binomial())
> I find that data points corresponding to rownames 17 and 23 are
> outliers and high leverage.
> I would then like to perform a fit without these two rows.
> In principle this should be easy, using an update() with subset=-c(17,23).
> The problem is that the rownames in this dataset are not ordered,
> and, in fact, the relevant rows are 30 and 31, not 17 and 23.
> This brings up the following (elementary?) questions:
> 1. How do you reference rows in "subset=" for which you know the
> rownames, but not the row numbers?
Use a logical vector:
rownames(plasma) %in% c(17, 23)
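Note that the vector above is TRUE for the rows named 17 and 23, so to leave them out of a refit it has to be negated. A minimal sketch of that, using a small made-up data frame `d` and an `lm()` fit as stand-ins for `plasma` and the `glm()` fit (both are assumptions for illustration, not the original data):

```r
# Toy data frame (hypothetical stand-in for plasma);
# the row names are deliberately out of order.
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(1.1, 1.9, 3.2, 3.9, 9.0, 6.1),
                row.names = c("3", "1", "17", "5", "23", "2"))

fit <- lm(y ~ x, data = d)

# TRUE for the rows to keep: every row whose name is not 17 or 23.
# %in% compares the numbers against the character row names by
# coercing both sides to character.
keep <- !(rownames(d) %in% c(17, 23))

# Refit on the remaining rows only
fit2 <- update(fit, subset = keep)
nobs(fit2)
```

Applied to the original fit, the same idea would presumably read update(plasma_1, subset = !(rownames(plasma) %in% c(17, 23))).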
> 2. How do you discover the rows corresponding to particular
> rownames? (Using plasma[rownames(plasma)==17,] shows the data, but
> NOT the row number!) (Probably the same answer as in Q. 1 above.)
which(rownames(plasma) %in% c(17, 23)) # 30, 31
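As a small illustration of the difference between the two lookups, here is a sketch on a made-up data frame `d` (hypothetical, not the plasma data). which() with %in% gives the positions in row order, while match() gives them in the order the names were asked for:

```r
# Toy data frame (hypothetical) with unordered row names
d <- data.frame(x = 1:6,
                row.names = c("3", "1", "17", "5", "23", "2"))

# Positions of the named rows, in the order they occur in d
pos <- which(rownames(d) %in% c(17, 23))

# Positions in the order the names are requested
pos2 <- match(c("23", "17"), rownames(d))

# Negative positions drop those rows
d_rest <- d[-pos, , drop = FALSE]
```

The positions in `pos` could also be passed as subset = -pos, which is the form the original question started from.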
> 3. How do you sort (order) the rows of an existing data frame so that
> the rownames are in order?
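One possible approach, sketched here on a made-up data frame `d` (hypothetical, not the plasma data) and assuming every row name is a string of digits: convert the row names to numbers and index the data frame by their order(). Ordering the row names directly would sort them as text, which places "17" before "2".

```r
# Toy data frame (hypothetical) with unordered, numeric-looking row names
d <- data.frame(x = 1:6,
                row.names = c("3", "1", "17", "5", "23", "2"))

# Row names are character strings, so convert before ordering
# to get a numeric rather than a text sort.
ord <- order(as.numeric(rownames(d)))

# Reorder the rows; drop = FALSE keeps a one-column data frame a data frame
d_sorted <- d[ord, , drop = FALSE]
rownames(d_sorted)
```

After this, the row names and the row positions agree only if the names happen to run 1, 2, 3, ... without gaps, so looking rows up by name as in the answers above remains the safer habit.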