[R] labels and counting

Spencer Graves spencer.graves at pdf.com
Thu Dec 30 21:07:06 CET 2004


      Have you searched the R site ("www.r-project.org" -> search -> 
"R site search") for "Markov Chain"?  I just got "138 documents matching 
your query".  The fifth one suggested "chapter 5 of Jim Lindsey's online 
document 'The statistical analysis of stochastic processes in Time', at 
his website www.luc.ac.be/~jlindsey".  I found this document mentioned 
under "recent publications".  The book may no longer be downloadable, 
but his examples still are. 

      There are probably other tools of interest to you in that list, 
and perhaps someone else will enlighten both of us on this. 

      There may be an easier way to do what you ask, but if I understand 
your question correctly, the following seems to do it for me (using a 
simulated sequence of 100 signed bases): 

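# The 8 states: each base combined with each label
bases <- c("A","C","G","T")
sgn <- c("+", "-")

signedBases <- as.vector(
     outer(bases, sgn, paste, sep=""))
# Lookup table from state name to row/column number
sBnum <- 1:8
names(sBnum) <- signedBases

# Simulate a sequence of signed bases to test with
set.seed(1)
seqLen <- 100
sBaseSeq <- sample(x=signedBases,
            size=seqLen, replace=TRUE)

# Count each (thisBase, nextBase) pair that actually occurs
nextBase <- aggregate(sBaseSeq[-seqLen],
      list(thisBase=sBaseSeq[-seqLen],
           nextBase=sBaseSeq[-1]), length)

# Empty 8x8 matrix, so pairs that never occur stay at 0
transFreq <- array(0, dim=c(8,8))
dimnames(transFreq) <- list(signedBases,
                            signedBases)

# Convert the state names to a two-column (row, column) index matrix
nBnum <- array(
    sBnum[as.matrix(nextBase[1:2])],
               dim=dim(nextBase[1:2]))

# Fill in the observed counts by matrix indexing
transFreq[nBnum]<- nextBase[[3]]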

 > transFreq
   A+ C+ G+ T+ A- C- G- T-
A+  1  2  1  2  0  2  0  1
C+  2  3  1  0  0  3  1  1
G+  0  0  2  5  2  1  2  0
T+  1  2  2  1  1  3  8  2
A-  0  0  0  1  1  1  1  1
C-  2  1  1  5  0  2  2  2
G-  3  1  2  4  2  2  1  2
T-  0  2  2  2  0  1  2  1
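
      A shorter route may be table() on factors with fixed levels.  The 
following is only a sketch, not the method above: it assumes the bases 
and the labels arrive as two equal-length character strings, and the 
names baseStr, labelStr and transTab are made up for illustration. 

```r
# Hypothetical inputs: one string of bases, one string of labels,
# both of the same length (assumed, not taken from the original post).
baseStr  <- "ACGTTGCA"
labelStr <- "++--+-+-"

# Split each string into single characters and glue them together,
# giving one signed base such as "A+" per position.
b <- strsplit(baseStr, "")[[1]]
s <- strsplit(labelStr, "")[[1]]
signed <- paste(b, s, sep = "")

# Fix the level order so that all 8 states appear in the result,
# including those that never occur in the sequence.
lev <- as.vector(outer(c("A", "C", "G", "T"), c("+", "-"), paste, sep = ""))
f <- factor(signed, levels = lev)

# Cross-tabulate every element against its successor.
n <- length(f)
transTab <- table(thisBase = f[-n], nextBase = f[-1])
```

Because the factor levels are fixed in advance, transTab is always 8x8 
and its entries sum to the sequence length minus one, so no separate 
zero-filled matrix or index lookup is needed. 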

      Hope this helps.  Spencer Graves

dax42 wrote:

> Hello,
>
> I have got the following problem:
> given is a large string sequence consisting of the four letters "A" 
> "C" "G" and "T" (as before). Additionally, I have got a second string 
> sequence of the same length giving a label for each character. The 
> labels are "+" and "-".
>
> Now I would like to create an 8x8 matrix which contains the numbers on 
> how often we see all possible pairwise combinations, for example "A" 
> with the label "+" followed by "C" with the label "+" or "T"->"C" with 
> the labels "-"->"+" etc.
>
> Of course I can just use loops to "walk" along the sequence, but as 
> you have shown me so much better solutions in response to my last 
> mail, I thought you might be able to help and improve my R skills even 
> further ..
>
> Thanks for your ideas!
> Cheers, Winnie


-- 
Spencer Graves, PhD, Senior Development Engineer
O:  (408)938-4420;  mobile:  (408)655-4567



