[R] Output of order() incorrectly ordered?

jim holtman jholtman at gmail.com
Tue Mar 25 10:20:35 CET 2008


It works fine for me with the data you supplied:

> x
           V1          V2      V3                  V4
1 8.30440e-02 3.75276e+02  680220            majority
2 5.50816e-09 2.48914e-05   26377        conformation
3 1.69618e-04 7.66505e-01 1546938         interaction
4 3.90425e-05 1.76433e-01 1655338             vitamin
5 3.78182e-02 1.70900e+02 1510941               array
6 3.00359e-07 1.35732e-03   69421 oligo(dT)-cellulose
7 1.01517e-13 4.58754e-10  699918            elastase
> x[order(x$V1),]
           V1          V2      V3                  V4
7 1.01517e-13 4.58754e-10  699918            elastase
2 5.50816e-09 2.48914e-05   26377        conformation
6 3.00359e-07 1.35732e-03   69421 oligo(dT)-cellulose
4 3.90425e-05 1.76433e-01 1655338             vitamin
3 1.69618e-04 7.66505e-01 1546938         interaction
5 3.78182e-02 1.70900e+02 1510941               array
1 8.30440e-02 3.75276e+02  680220            majority
>
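
For anyone who wants to repeat the session above, here is a minimal sketch that rebuilds the seven posted rows. The name `txt` and the use of whitespace instead of tabs as the separator are assumptions made for readability; the original file was read with sep="\t". The type check is there because order() only sorts numerically when the key column is numeric. Whether that is related to the reported problem is a guess, not something established in this thread.

```r
# Rebuild the seven posted rows (whitespace-separated here for readability;
# the original file was tab-delimited).
txt <- "0.083044 375.276 680220 majority
5.50816e-09 2.48914e-05 26377 conformation
0.000169618 0.766505 1546938 interaction
3.90425e-05 0.176433 1655338 vitamin
0.0378182 170.9 1510941 array
3.00359e-07 0.00135732 69421 oligo(dT)-cellulose
1.01517e-13 4.58754e-10 699918 elastase"

x <- read.table(text = txt, header = FALSE)

# Inspect the key column: it should be numeric, not character or factor
str(x$V1)

# Sort the rows by the first column and show the resulting row order
x_sorted <- x[order(x$V1), ]
rownames(x_sorted)
```

If str() on the full file were to report a character or factor column, the rows would not be sorted numerically, so that is a cheap thing to rule out first.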

BTW, these two are not equivalent:

 > df_ordered <- df[order(df$V1), ]

OR, I assume equivalently,

 > df_ordered <- df[ do.call(order, df), ]

The second call does not name a column, so order() receives every
column of df as a sort key (V1 first, with ties broken by the later
columns). You also did not indicate which of the two was giving you
problems.
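
The difference between the two calls only shows up when the first column has ties. A small sketch with made-up data (the data frame and the names `by_first` and `by_all` are hypothetical, not taken from the post):

```r
# Toy data with a tie in V1 (rows 2 and 3)
df <- data.frame(V1 = c(2, 1, 1), V2 = c(10, 30, 20))

# Key is V1 only: tied rows keep their original relative order
by_first <- order(df$V1)

# Every column is a key: the tie in V1 is broken by V2
by_all <- do.call(order, df)

by_first  # 2 3 1
by_all    # 3 2 1
```

When V1 has no duplicate values, both calls return the same permutation.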


On Mon, Mar 24, 2008 at 9:13 PM, Shirley Wu <shwu19 at stanford.edu> wrote:
> Hello,
>
> I have a data frame consisting of four columns and would like to sort
> based on the first column and then write the sorted data frame to a
> file.
>
>  > df <- read.table("file.txt", sep="\t")
> where file.txt is simply a tab-delimited file containing 4 columns of
> data (first 2 numeric, second 2 character). I then do,
>
>  > df_ordered <- df[order(df$V1), ]
>
> OR, I assume equivalently,
>
>  > df_ordered <- df[ do.call(order, df), ]
>
> and then,
>
>  > write.table(df_ordered, file="newfile.txt", ...)
>
> The input data file looks like this:
>
> 0.083044        375.276 680220  majority
> 5.50816e-09     2.48914e-05     26377   conformation
> 0.000169618     0.766505        1546938 interaction
> 3.90425e-05     0.176433        1655338 vitamin
> 0.0378182       170.9   1510941 array
> 3.00359e-07     0.00135732      69421   oligo(dT)-cellulose
> 1.01517e-13     4.58754e-10     699918  elastase
> ...
>
> I'd like the output file to look the same except sorted by the first
> column. The output of the commands above gives me something that is
> sorted in some places but not sorted in others:
>
> [sorted section]
> ...
> 1.87276e-07     0.000846299     1142090 vitamin K
> 1.89026e-07     0.000854207     917889  leader peptide
> 1.90884e-07     0.000862605     31206   s
> 0.00536062      24.2246 1706420 prevent
> 5.42648e-05     0.245223        1513041 measured
> 5.42648e-05     0.245223        1513040 measured
> 0.019939        90.1044 12578   fly
> 0.00135512      6.12377 61688   GPI
> 0.00124421      5.62257 681915  content
> 0.0128271       57.9655 681916  estimated
> ...
> [sorted section]
> ...
> [unsorted section]
> ...
> [etc]
>
> I'm not sure if this is a problem with the input data or with order()
> or what. I am only doing this in R because many of my numeric values
> are expressed in exponential notation and UNIX sort does not handle
> this to my knowledge, but this behavior baffles me. I am pretty new
> to R so it's possible I'm missing something.
>
> Any insight would be greatly appreciated!
>
> Thanks,
> -Shirley
> graduate student
> Stanford University
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


