[R] Problem

Mon Apr 29 22:14:56 CEST 2002

  Assuming that your data below are located in space-separated files
"temp1.dat" and "temp2.dat", here's a reasonable solution ...

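## fill=TRUE pads short rows with ""; size col.names from the longest row,
## since read.table otherwise guesses the column count from the first 5 lines
nc <- max(count.fields("temp1.dat"),na.rm=TRUE)
m1 <- read.table("temp1.dat",fill=TRUE,as.is=TRUE,col.names=paste0("V",seq_len(nc)))
## as.is=TRUE keeps the node names as character strings in any R version
m2 <- read.table("temp2.dat",as.is=TRUE)

## the nodes related to the query are in column 3 of m2
qvals <- unique(m2[[3]])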
## take only rows with 1st column included in column 3 of m2
m1 <- m1[m1[,1] %in% qvals,]
relnodes <- apply(m1,1,function(x)sum(x %in% qvals))
m1 <- m1[relnodes>1,]  ## keep only rows with at least one other node in qvals

## turn m1 into a list
m1b <- as.list(as.data.frame(t(as.matrix(m1))))
m1c <- lapply(m1b,function(x)as.character(x[x %in% qvals]))  ## drop elements not in qvals
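
The result above is a list with one character vector per retained row. As a cross-check, here is a minimal self-contained sketch of the same filtering that splits the lines directly instead of going through a padded table, then prints one output line per row. It assumes the example data from the question, inlined so that it runs without the files; filter_rows is a hypothetical helper name, not a base R function.

```r
## Example data from the question, inlined so the sketch runs on its own.
## With real files these would come from readLines("temp1.dat") and
## read.table("temp2.dat", as.is = TRUE).
net_lines <- c("aa bb cc",
               "bb xx",
               "dd cv st rw",
               "xx yu de qw ww zzp")
query_lines <- c("query1 0 aa 1", "query1 0 cc 0", "query1 0 dd 0",
                 "query1 0 cv 1", "query1 0 rw 0", "query1 0 qw 0",
                 "query1 0 ww 1")

## filter_rows is a hypothetical helper, not part of base R
filter_rows <- function(rows, qvals) {
  ## keep rows whose first node is related to the query
  rows <- Filter(function(x) length(x) > 0 && x[1] %in% qvals, rows)
  ## drop the nodes that are not related to the query
  rows <- lapply(rows, function(x) x[x %in% qvals])
  ## keep rows that still contain at least two nodes
  Filter(function(x) length(x) > 1, rows)
}

rows  <- strsplit(trimws(net_lines), "[[:space:]]+")  ## one vector per line
m2    <- read.table(text = query_lines, as.is = TRUE)
qvals <- unique(m2[[3]])                              ## nodes related to query1

res <- filter_rows(rows, qvals)
out <- vapply(res, paste, character(1), collapse = " ")
writeLines(out)
```

On the example data this prints the two rows requested in the question, "aa cc" and "dd cv rw". Passing a file name as the second argument of writeLines would save them to a file instead.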

On Mon, 29 Apr 2002, Ambrosini Alessandro wrote:

> Hello!
> This is the situation.
> I have a file in which there is a scattered matrix. Here is an example:
>
> aa bb cc
> bb xx
> dd cv st rw
> xx yu de qw ww zzp
>
> where aa is a node that has a path with bb and one with cc,
> bb has a path with xx, dd has a path with cv, one with st, ...
> In my experiment I have about 100 lines. My nodes are Web pages.
> I have another file that tells me which nodes are related to query1:
>
> query1 0 aa 1
> query1 0 cc 0
> query1 0 dd 0
> query1 0 cv 1
> query1 0 rw 0
> query1 0 qw 0
> query1 0 ww 1
>
> query1 is not related to all the nodes of the scattered matrix, only to
> some of them. The third column gives the nodes that are related to
> query1. This is the most important column.
> The second column, which is all "0", is a vector of constants and is
> useless. The fourth column tells me whether the node in the third column is
> relevant to the query or not; this column is also not important for my
> work.
> Now, considering query1, I want to obtain a new scattered matrix in which
> only the nodes related to query1 appear. Starting from the example, I have
> to obtain:
>
> aa cc
> dd cv rw
>
> The steps are: take the first line of the scattered matrix. If the
> first node (in the first column) of the row doesn't appear in the list of
> nodes related to the query, skip this line. If the node appears,
> check whether at least one other node in the row is related to query1. If
> so, write the line with all the nodes related to query1, leaving out the
> nodes that are not related to query1.
> Take the second line and do the same...
>
> Summarizing: starting from the first matrix I want to use the third column
> of the second matrix to obtain a new one that contains only the nodes that
> appear in the third column.
> The output must be a matrix that has at least two elements per row.
> The most important node is in the first column, so if it doesn't match
> the third column of query1, the row is not considered.
> For example if I have a b c in a row of the matrix and "a" is not a node
> related with query, the row must not be considered.
>
> If I solve this problem I'll be at a good point in my thesis.
> Thank you very much. Excuse my English.
> Alessandro Ambrosini
>
>
>
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>

-- 
318 Carr Hall                                bolker at zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704
