[R] How to get row numbers of a subset of rows

affy snp affysnp at gmail.com
Wed Nov 14 18:52:39 CET 2007


Thanks a lot, Jim and Bert. It worked pretty well.

Best,
     Allen

On Nov 14, 2007 12:11 PM, jim holtman <jholtman at gmail.com> wrote:
> That works for the specific value of '1', but you would have to repeat
> it for other values in the column.  If you had 100 different ranges in
> that column, what would you do?  Here is another solution using
> 'range' on the same data:
>
> > tapply(seq_len(nrow(x)), x$Chromosome, range)
> $`1`
> [1] 1 6
>
> $`2`
> [1]  7 10
>
>
>
> On Nov 14, 2007 12:04 PM, Bert Gunter <gunter.berton at gene.com> wrote:
> > Am I missing something? ...
> >
> > Why not: range(seq(nrow(B))[B[,2]==1] ) ?? ## note: "==" not "="
> >
> > Alternatively, and easily generalized (to start with a frame which is a
> > subset of the original and any subset of rows, contiguous or not)
> >
> > range(as.numeric(row.names(B)[B[,2]==1]))
> >
> > Again, am I missing something that makes this "obvious" solution impossible?
> > (Wouldn't be the first time.)
> >
> > Bert Gunter
> > Genentech Nonclinical Statistics
> >
> >
> >
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of jim holtman
> > Sent: Wednesday, November 14, 2007 8:39 AM
> > To: affy snp
> > Cc: r-help at r-project.org
> > Subject: Re: [R] How to get row numbers of a subset of rows
> >
> > Here is a way of doing it using 'rle':
> >
> > > x <- read.table(textConnection("     SNP                Chromosome  PhysicalPosition
> > + 1 SNP_A-1909444          1           7924293
> > + 2 SNP_A-2237149          1           8173763
> > + 3 SNP_A-4303947          1           8191853
> > + 4 SNP_A-2236359          1           8323433
> > + 5 SNP_A-2205441          1           8393263
> > + 6 SNP_A-1909445          1           7924293
> > + 7 SNP_A-2237146          2           8173763
> > + 8 SNP_A-4303946          2           8191853
> > + 9 SNP_A-2236357          2           8323433
> > + 10 SNP_A-2205442         2           8393263"), header=TRUE)
> > > # use rle to get the 'runs'
> > > y <- rle(x$Chromosome)
> > > # create dataframe with start/ends and values
> > > start <- head(cumsum(c(1, y$lengths)), -1)
> > > index <- data.frame(values=y$values, start=start, end=start + y$lengths - 1)
> > >
> > > index
> >  values start end
> > 1      1     1   6
> > 2      2     7  10
> > >
> >
> >
> > On Nov 14, 2007 10:56 AM, affy snp <affysnp at gmail.com> wrote:
> > > Hello list,
> > >
> > > I read in a txt file using
> > >
> > > B <- read.table(file="data.snp", header=TRUE, row.names=NULL)
> > >
> > > by specifying row.names=NULL so that the rows are numbered.
> > > Below is an example of what the table looks like, using
> > > B[1:10,1:3]:
> > >
> > >
> > >      SNP                Chromosome  PhysicalPosition
> > > 1 SNP_A-1909444          1           7924293
> > > 2 SNP_A-2237149          1           8173763
> > > 3 SNP_A-4303947          1           8191853
> > > 4 SNP_A-2236359          1           8323433
> > > 5 SNP_A-2205441          1           8393263
> > > 6 SNP_A-1909445          1           7924293
> > > 7 SNP_A-2237146          2           8173763
> > > 8 SNP_A-4303946          2           8191853
> > > 9 SNP_A-2236357          2           8323433
> > > 10 SNP_A-2205442         2           8393263
> > >
> > > I am wondering if there is a way to return the start and end row numbers
> > > for a subset of rows.
> > >
> > > For example, If I specify B[,2]=1, I would like to get
> > > start=1 and end=6
> > >
> > > if B[,2]=2, then start=7 and end=10
> > >
> > > Is there any way in R to quickly do this?
> > >
> > > Thanks a bunch!
> > >
> > > Allen
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem you are trying to solve?
> >
> >
> >
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
>
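
For reference, the suggestions quoted above can be pulled together into one small function. This is only a sketch and not code posted by anyone in the thread: the names `snp` and `row_ranges` are made up for illustration, and the data frame is rebuilt by hand from the ten example rows rather than read from data.snp. Like the tapply() answer, it reports the first and last row number for each value, which matches a contiguous block only when equal values sit together; the rle() answer instead reports every run separately.

```r
# Example data rebuilt from the ten rows shown in the thread (assumption:
# the real file has more rows and columns).
snp <- data.frame(
  SNP = c("SNP_A-1909444", "SNP_A-2237149", "SNP_A-4303947", "SNP_A-2236359",
          "SNP_A-2205441", "SNP_A-1909445", "SNP_A-2237146", "SNP_A-4303946",
          "SNP_A-2236357", "SNP_A-2205442"),
  Chromosome = rep(c(1, 2), times = c(6, 4)),
  PhysicalPosition = c(7924293, 8173763, 8191853, 8323433, 8393263,
                       7924293, 8173763, 8191853, 8323433, 8393263)
)

# Hypothetical helper: first and last row number for each distinct value
# of one column, returned as a data frame.
row_ranges <- function(df, column) {
  idx   <- seq_len(nrow(df))              # row numbers 1..n
  start <- tapply(idx, df[[column]], min) # first row per value
  end   <- tapply(idx, df[[column]], max) # last row per value
  data.frame(value = names(start),
             start = as.vector(start),
             end   = as.vector(end))
}

# Chromosome 1 spans rows 1 to 6, chromosome 2 spans rows 7 to 10.
print(row_ranges(snp, "Chromosome"))

# Single value, in the spirit of Bert's one-liner, written with which().
print(range(which(snp$Chromosome == 1)))
```

If the rows were not sorted by chromosome, start and end would still be the smallest and largest matching row numbers, so rows of other chromosomes could fall in between.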


