[R] newbie: new_data_frame <- selected set of rows

Darek Kedra darked90 at yahoo.com
Thu Nov 30 23:23:38 CET 2006


Hello,

this is probably trivial but I failed to find this
particular snippet of code.

What I have:
my_dataframe (contains, say, 40k rows and 4 columns)
distances (vector of Euclidean distances between a
query vector and each of the rows of my_dataframe)

What I do:
After scaling my_dataframe I calculate the distances,
order them, then extract the top five hits.

my_dataframe <- read.table("myDB.csv", header=FALSE,
                           dec=".", sep=";", row.names=1)
#reads the whole file; the first column becomes the row names

scaled_DB <- scale(my_dataframe, center=FALSE)
#scales the values

require(hopach)
#loads the hopach package, which provides distancevector()

distances <- order(distancevector(scaled_DB,
                   scaled_DB['query',], d="euclid"))
#calculates the distances and returns the row indices
#ordered from the lowest distance

for(i in distances[1:5]) print(my_dataframe[i,])
#prints top five hits just for debugging
 
What I want to do:
1) create a small top_five data frame
Sadly, this does not work:
for(i in distances[1:5]) top_five[i,] <-
my_dataframe[i,]
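
A possible way around the loop, sketched under some assumptions: the
assignment above fails because top_five does not exist before the
loop starts, and because i is a row position in my_dataframe rather
than a position 1..5 in the new frame. Indexing with the whole vector
of positions in one step avoids both problems. The toy data and the
name dist_to_query below are hypothetical, and the distance is
computed in base R instead of with distancevector() so that the
sketch runs without hopach.

```r
# Toy stand-in for my_dataframe (hypothetical values); row names act as IDs.
my_dataframe <- data.frame(a = c(1, 5, 2, 9, 3, 7),
                           b = c(2, 6, 1, 8, 3, 7),
                           row.names = c("query", "r2", "r3", "r4", "r5", "r6"))
scaled_DB <- scale(my_dataframe, center = FALSE)

# Base-R Euclidean distance from every row to the query row
# (an assumption: used here in place of hopach::distancevector).
query <- scaled_DB["query", ]
dist_to_query <- sqrt(rowSums(sweep(scaled_DB, 2, query)^2))

# order() returns row positions, smallest distance first.
ord <- order(dist_to_query)

# Subset with the first five positions at once; no loop is needed.
top_five <- my_dataframe[ord[1:5], ]
print(top_five)
```

The row names of my_dataframe are carried over to top_five, so the
hits can still be identified by name.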

2) after I have top_five I would like to get the index
of my query entry, something along the lines of Python's
top_five.index('query_string')
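
One way this lookup might be done, assuming the query entry is
identified by its row name: match() returns the position of the
first match in rownames(top_five), or NA when the name is absent.
Unlike Python's list.index, the position is 1-based and a missing
entry does not raise an error. The top_five frame below is a
hypothetical stand-in.

```r
# Hypothetical top_five frame whose row names identify the entries.
top_five <- data.frame(a = c(2, 1, 3),
                       row.names = c("r3", "query", "r5"))

# Position of the row named "query" (1-based), NA if it is not present.
idx <- match("query", rownames(top_five))
print(idx)

# which() is an alternative that returns every matching position.
all_idx <- which(rownames(top_five) == "query")
```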

3) possibly combine values in distances with row names
from my_dataframe:
row_1 distance_from_query1
row_2 distance_from_query2
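
A sketch of how such a table might be built. Note that order()
returns positions, not distances, so the raw distances have to be
kept in their own variable before ordering. The names raw_dist and
ranked and the distance values are hypothetical; in the real script
raw_dist would hold the result of distancevector().

```r
# Hypothetical stand-in for my_dataframe; only the row names matter here.
my_dataframe <- data.frame(a = c(1, 5, 2, 9),
                           row.names = c("query", "r2", "r3", "r4"))

# Hypothetical raw distances, one per row, in the original row order.
raw_dist <- c(0, 1.7, 0.4, 2.9)

# Positions of the rows, smallest distance first.
ord <- order(raw_dist)

# Pair each row name with its distance, sorted by distance.
ranked <- data.frame(row = rownames(my_dataframe)[ord],
                     distance = raw_dist[ord],
                     stringsAsFactors = FALSE)
print(ranked)
```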

Thank you very much for your help

Darek Kedra


