[R] Text data

jim holtman jholtman at gmail.com
Wed Jan 28 22:18:41 CET 2009


This will sort on those characters:

> x <- readLines(textConnection("26M_AN_C.bmp
+ 22M_AN_C.bmp
+ 20M_HA_O.bmp
+ 20M_AN_C.bmp
+ 26M_HA_O.bmp
+ 22M_HA_O.bmp
+ 31M_AN_C.bmp
+ 38M_HA_O.bmp"))
> closeAllConnections()
> # pick off characters between "_"
> sortKey <- sub(".*_(.+)_.*", "\\1", x)
> sortKey
[1] "AN" "AN" "HA" "AN" "HA" "HA" "AN" "HA"
> # output sorted list
> x[order(sortKey)]
[1] "26M_AN_C.bmp" "22M_AN_C.bmp" "20M_AN_C.bmp" "31M_AN_C.bmp"
[5] "20M_HA_O.bmp" "26M_HA_O.bmp" "22M_HA_O.bmp" "38M_HA_O.bmp"
>
>
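
If you want the tag in its own column, the way you would do it in a
spreadsheet, something along these lines may be closer to what you had in
mind. This is only a sketch under a few assumptions: the names `tag`, `num`,
`files` and `sorted` are made up for illustration, every file name is assumed
to have exactly two "_" separators, and the `substr()` line (the nearest
analogue of Excel's text-slicing functions) additionally assumes the tag
always sits in character positions 5 and 6.

```r
# Same eight file names as in the example above
x <- c("26M_AN_C.bmp", "22M_AN_C.bmp", "20M_HA_O.bmp", "20M_AN_C.bmp",
       "26M_HA_O.bmp", "22M_HA_O.bmp", "31M_AN_C.bmp", "38M_HA_O.bmp")

# Split each name on "_" and keep the second piece (the middle tag)
tag <- sapply(strsplit(x, "_", fixed = TRUE), `[`, 2)

# Fixed-position alternative; only valid while the prefix is 3 characters
tag_fixed <- substr(x, 5, 6)

# Leading digits as a number, usable as a secondary sort key
num <- as.numeric(sub("^([0-9]+).*", "\\1", x))

# Keep the tag in its own column next to the original name
files <- data.frame(name = x, tag = tag, num = num,
                    stringsAsFactors = FALSE)

# Sort by tag first, then by the leading number within each tag
sorted <- files[order(files$tag, files$num), ]
sorted$name
```

Splitting on "_" keeps working if the leading number grows to three digits,
whereas the `substr()` version would then pick up the wrong characters.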


On Wed, Jan 28, 2009 at 3:37 PM, Alice Lin <alice.ly at gmail.com> wrote:
>
> I have a data column of text entries:
> 26M_AN_C.bmp
> 22M_AN_C.bmp
> 20M_HA_O.bmp
> 20M_AN_C.bmp
> 26M_HA_O.bmp
> 22M_HA_O.bmp
> 31M_AN_C.bmp
> 38M_HA_O.bmp
> .
> .
> .
> .
>
>
> And I would like to sort by the middle tag: AN, HA, etc.
> Is there a way to parse text data in R?
>
> In Excel, I would have used the "left" and "right" functions to cut out just
> the middle two letters and put them into another column to sort by.
>
> Thanks!



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



