[R] Sorting Text Frames

Murray Jorgensen maj at waikato.ac.nz
Wed Sep 7 05:45:55 CEST 2005


[Using R 2.0.1 under Windows XP]
There are a few pages on the internet that list equivalents of
"thank you" in many languages. I downloaded one from a Google search
and I thought that it would be interesting and a good R exercise to
sort the file into the order of the expressions, rather than the languages.

I tidied up the web page into the fixed-width format that it was already
nearly in: language name in columns 1-43, the expression in the remaining
columns.

Then I read it in:

 > thanks <- read.fwf("C:\\Files\\Reading\\thankyou.txt", c(43,37))
 > thanks[1:4,]
                                            V1            V2
1 Abenaki (Maine USA, Montreal Canada)            Wliwni ni
2 Abenaki (Maine USA, Montreal Canada)               Wliwni
3 Abenaki (Maine USA, Montreal Canada)               Oliwni
4 Achí (Baja Verapaz Guatemala)               Mantiox chawe

 > dim(thanks)
[1] 1254    2
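
For anyone who wants to try the reading step without the original file, here
is a minimal sketch on made-up data. The two sample lines, the widths
c(10, 8) and the temporary file are assumptions for illustration only; they
are not taken from thankyou.txt. The strip.white argument is passed through
to read.table and trims the padding that fixed-width fields otherwise keep.

```r
# Toy fixed-width data: language in columns 1-10, expression in columns 11-18.
# These lines and widths are hypothetical, not from the real file.
toy_lines <- c("Danish    Tak     ",
               "Maori     Kia ora ")

# Write the lines to a temporary file so read.fwf() has something to read.
toy_file <- tempfile(fileext = ".txt")
writeLines(toy_lines, toy_file)

# Read the two fixed-width fields; strip.white = TRUE drops the padding.
toy <- read.fwf(toy_file, widths = c(10, 8),
                stringsAsFactors = FALSE, strip.white = TRUE)

print(toy)
print(dim(toy))
```

Without strip.white = TRUE the fields keep their padding, which makes the
printed columns wider.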

Now I tried sorting the frame into the order of the second column:

tord <- order(thanks$V2)
sink("C:\\Files\\Reading\\thanks.txt")
thanks[tord[1:74],]
sink()
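
As a point of comparison, the same sort can be written out with write.table()
rather than by printing inside sink(). This is a swapped-in technique, not
what was done above, and the toy data frame and the temporary output file are
assumptions for illustration. write.table() writes one line per row, so the
result does not depend on how print() lays out the frame.

```r
# Hypothetical three-row frame standing in for 'thanks'.
toy <- data.frame(V1 = c("Eyak (Alaska)", "Danish", "Maori"),
                  V2 = c("'Awa'ahdah", "Tak", "Kia ora"),
                  stringsAsFactors = FALSE)

# Row indices that put the second column into sorted order.
ord <- order(toy$V2)
sorted <- toy[ord, ]

# Write every row as one tab-separated line, with no quoting or row names.
out_file <- tempfile(fileext = ".txt")
write.table(sorted, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

print(readLines(out_file))
```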

This gives more or less the expected output, with the file thanks.txt beginning:

                                                   V1                     V2
145      Cahuila (United States)                                '\301cha-ma
862      Paipai (Mexico, USA)                                    'Ara'ya:ikm
863      Paipai (Mexico, USA)                                    'Ara'yai:km
864      Paipai (Mexico, USA)                                     'Ara'ye:km
311      Eyak (Alaska)                                            'Awa'ahdah

[you may get a bit of wrapping there!]

However, I don't really want just 74 lines; I would like the whole file. But
if I get rid of the [1:74] or replace 74 with any larger number, I get output
like this, with no second column:

                                                   V1
145      Cahuila (United States)
862      Paipai (Mexico, USA)
863      Paipai (Mexico, USA)
864      Paipai (Mexico, USA)
311      Eyak (Alaska)
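
One thing that may be worth checking, offered as an assumption rather than a
confirmed diagnosis: when a row of a data frame is wider than
getOption("width"), print() splits the columns into separate blocks, so a
column can end up below all the rows of the earlier columns instead of beside
them. The sketch below shows that behaviour on a made-up frame; the column
contents and the widths 80 and 200 are illustrative choices only.

```r
# Hypothetical frame whose rows are too wide for an 80-character line.
toy <- data.frame(V1 = c("a", "b"),
                  V2 = c(strrep("x", 60), "y"),
                  V3 = c(strrep("z", 60), "w"),
                  stringsAsFactors = FALSE)

# Capture the printed output at a narrow and at a wide line width.
old <- options(width = 80)
narrow <- capture.output(print(toy))
options(width = 200)
wide <- capture.output(print(toy))
options(old)  # restore the previous setting

# At width 80 the last column is printed as a separate block after the others.
writeLines(narrow)
writeLines(wide)
```

If that is what happens here, raising options(width = ...) before sink(), or
looking further down thanks.txt, should show the second column.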

Does anyone know what is going on?
Tusen tak in advance, in fact 1254 tak in advance!

Murray Jorgensen
-- 
Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz                                Fax 7 838 4155
Phone  +64 7 838 4773 wk     Home +64 7 825 0441   Mobile 021 1395 862



