# [R] Subset and order

arun smartpink111 at yahoo.com
Sun Jul 7 16:51:05 CEST 2013

```
Hi,
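You could also try data.table (see ?data.table):

# Example data from the question quoted below
x <- read.table(text = "a    b    c
1    2    3
3    3    4
2    4    5
1    3    4", header = TRUE)
xt <- x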

library(data.table)

xt<- data.table(xt)
setkey(xt,a)
subset(xt,b==3)
#   a b c
#1: 1 3 4
#2: 3 3 4
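
# A further sketch, not part of the original reply: data.table's own [ ] syntax
# can filter and then sort in one chained expression, so setkey() is not needed
# for a one-off query. The names ex_dt and ex_res and the toy data are
# assumptions for illustration only.
library(data.table)
ex_dt <- data.table(a = c(1, 3, 2, 1), b = c(2, 3, 4, 3), c = c(3, 4, 5, 4))
ex_res <- ex_dt[b == 3][order(a)]   # keep rows with b == 3, then order them by a
ex_res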

iord <- order(x$a)
subset(x[iord, ], b == 3)
#  a b c
#4 1 3 4
#2 3 3 4

Speed comparison:
set.seed(12345)
dat1<- as.data.frame(matrix(sample(1:10,3*1e7,replace=TRUE),ncol=3))
colnames(dat1)<-letters[1:3]
system.time({
iord <- order(dat1$a)
res1<-subset(dat1[iord, ], b == 3)
})
#  user  system elapsed
#  6.888   0.296   7.202

dt1<- data.table(dat1)
system.time({setkey(dt1,a)
resdt1<-subset(dt1,b==3)})
# user  system elapsed
#   0.72    0.06    0.78

head(resdt1)
#   a b  c
#1: 1 3  6
#2: 1 3  4
#3: 1 3 10
#4: 1 3  2
#5: 1 3  9
#6: 1 3  8

head(res1)
#    a b  c
#75  1 3  6
#93  1 3  4
#300 1 3 10
#301 1 3  2
#437 1 3  9
#672 1 3  8

A.K.
----- Original Message -----
To: Noah Silverman <noahsilverman at ucla.edu>
Cc: "R-help at r-project.org" <r-help at r-project.org>
Sent: Friday, July 5, 2013 3:51 PM
Subject: Re: [R] Subset and order

Hello,

If time is one of the problems, precompute an ordered index, and use it
every time you want the df sorted. But that would mean you can't do it
in a single operation.

iord <- order(x$a)
subset(x[iord, ], b == 3)
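
# A minimal sketch of the precomputed-index idea for many queries (not part of
# the original message). The toy data and the names ex, ex_ord, ex_sorted,
# ex_vals and ex_out are assumptions for illustration only.
ex <- data.frame(a = c(1, 3, 2, 1), b = c(2, 3, 4, 3), c = c(3, 4, 5, 4))
ex_ord <- order(ex$a)               # sort cost is paid once
ex_sorted <- ex[ex_ord, ]           # sorted copy reused by every query
ex_vals <- c(2, 3, 4)               # hypothetical values of b to query
# Subsetting preserves row order, so each result is already ordered by a
ex_out <- lapply(ex_vals, function(v) ex_sorted[ex_sorted$b == v, ])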

On 05-07-2013 20:47, Noah Silverman wrote:
> That would work, but is painfully slow.  It forces a new sort of the data with every query.  I have 200,000 rows and need almost a hundred queries.
>
> Thanks,
>
> -N
>
>
> On Jul 5, 2013, at 12:43 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>
>> Hello,
>>
>> Maybe like this?
>>
>> subset(x[order(x$a), ], b == 3)
>>
>>
>> Hope this helps,
>>
>>
>> On 05-07-2013 20:33, Noah Silverman wrote:
>>> Hello,
>>>
>>> I have a data frame with several columns.
>>>
>>> I'd like to select some subset *and* order by another field at the same time.
>>>
>>> Example:
>>>
>>> a    b    c
>>> 1    2    3
>>> 3    3    4
>>> 2    4    5
>>> 1    3    4
>>> etc…
>>>
>>>
>>> I want to select all rows where b=3 and then order by a.
>>>
>>> To subset is easy:  x[x$b==3,]
>>> To order is easy: x[order(x$a),]
>>>
>>> Is there a way to do both in a single efficient statement?
>>>
>>> Thanks,
>>>
>>>
>>>
>>> --
>>> Noah Silverman, M.S., C.Phil
>>> UCLA Department of Statistics
>>> 8117 Math Sciences Building
>>> Los Angeles, CA 90095
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>
