[R] Sorting a Data Frame

Sarah Goslee sarah.goslee at gmail.com
Tue Jan 26 22:35:56 CET 2016


On Tue, Jan 26, 2016 at 4:24 PM, Robert Sherry <rsherry8 at comcast.net> wrote:
>
> Thank you for the response. As expected, the following expression worked:
>     df[order(df$x),]

This says to reorder the rows and leave the columns alone.
A 2-dimensional object is subset with
[rows, columns]
and an empty slot means "all of them".
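
As a minimal sketch of that form (the data frame `scores` and its columns
are made up for illustration, not taken from the original question):
order() returns row positions, and those positions go in the row slot.

```r
# Hypothetical data, for illustration only
scores <- data.frame(name = c("a", "b", "c"),
                     x = c(3, 1, 2),
                     stringsAsFactors = FALSE)

idx <- order(scores$x)   # row positions in increasing order of x
idx                      # 2 3 1

sorted <- scores[idx, ]  # rows reordered, every column kept
sorted$x                 # 1 2 3
sorted$name              # "b" "c" "a"
```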

> I would expect the following expression to work also:
>         df[order(df$x)]

This does something a bit unexpected, and what it does depends on
whether you have a data frame or matrix.


> mydf <- data.frame(A=1:3, B=4:6)

> mydf[2, ] # row 2
  A B
2 2 5

> mydf[, 2] # col 2
[1] 4 5 6

> mydf[2]   # ???
  B
1 4
2 5
3 6

A data frame is "really" a list of columns, so a single index selects
from that list: you get that column, still wrapped in a data frame.
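
A short sketch of the list behaviour, reusing mydf from above. Single
brackets keep the data frame wrapper, while double brackets or $ pull out
the column vector itself.

```r
mydf <- data.frame(A = 1:3, B = 4:6)

is.list(mydf)                   # TRUE: a data frame is a list of columns

class(mydf[2])                  # "data.frame": a one-column data frame
class(mydf[[2]])                # "integer": the column vector itself

identical(mydf[2], mydf["B"])   # TRUE: selecting by position or by name
identical(mydf[[2]], mydf$B)    # TRUE
```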


> mymat <- as.matrix(mydf)
> mymat[2, ] # row 2
A B
2 5

> mymat[, 2] # col 2
[1] 4 5 6

> mymat[2]   # ???
[1] 2

But for a matrix, a single index returns that one element, counting from
the top left and working down each column in turn (column-major order).
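
To make the counting order concrete, here is a small sketch. The matrix is
built directly with matrix() rather than as.matrix(mydf), but it holds the
same values.

```r
# Same contents as as.matrix(mydf): column A is 1:3, column B is 4:6
mymat <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("A", "B")))

as.vector(mymat)          # 1 2 3 4 5 6: column A first, then column B

mymat[2]                  # 2: second element going down column A
mymat[4]                  # 4: first element of column B

# arrayInd() converts a single index back to its (row, column) position
arrayInd(4, dim(mymat))   # row 1, column 2
```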

So it's a really good idea to not subset your rectangular objects that
way, as it may eventually bite you.
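
Applied to the original question, a sketch with a 10-row, 2-column data
frame (called `dat` here, a name chosen only for this example): without the
comma, the positions from order() are read as column numbers, and most of
those columns do not exist.

```r
# Hypothetical stand-in for the data frame in the question
dat <- data.frame(x = runif(10), y = runif(10))

# No comma: the index selects columns, but only columns 1 and 2 exist
res <- tryCatch(dat[order(dat$x)],
                error = function(e) conditionMessage(e))
res                              # the error message, captured as a string

# With the comma: the same index selects rows
sorted <- dat[order(dat$x), ]
is.unsorted(sorted$x)            # FALSE: x is now in increasing order
```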


> However it does not. That is, the comma is needed. Please tell me why the
> comma is there.
>
> Thanks
> Bob
> On 1/26/2016 8:19 AM, S Ellison wrote:
>>>
>>> On 23.01.2016 01:21, Robert Sherry wrote:
>>>>
>>>> In R, I run the following commands:
>>>>       df = data.frame( x=runif(10), y=runif(10) )
>>>>       df2 = df[order(x),]
>>>
>>> You use another x from your workspace, you actually want to
>>>
>>>
>>>    df2 = df[order(df[,"x"]),]
>>
>> or
>> df[order(df$x),]
>>
>> And just to prevent yet more confusion, you might also want to avoid 'df'
>> as a name. 'df' is the function that returns the density of the F
>> distribution ...
>>
>> S Ellison
>>
>>


