[BioC] cghArray analysis plotting :need help please

Fri Jun 3 19:14:59 CEST 2011

Hi Nathalie --

On 06/03/2011 08:52 AM, Nathalie Conte wrote:
> HI, I a new in R and bioconductor and need somebody' s input on these
> issues ( see below, 2 questions), I am sorry about the rather long

It's always daunting to jump in to the middle of someone's problem; it 
helps to have a working, real example. I took a look at the example in 
?segment and came up with

   library(DNAcopy)
   set.seed(25)
   genomdat <- rnorm(500, sd=0.1) +
     rep(c(-0.2,0.1,1,-0.5,0.2,-0.5,0.1,-0.2),
         c(137,87,17,49,29,52,87,42))
   chrom <- rep(1:2, c(290,210))
   maploc <- c(1:290, 1:210)
   plot(segment(CNA(genomdat, chrom, maploc)))

Is that the right starting point? some more below...

> email, I am trying to put as much info as I can
> first my system:
>  > sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=C
> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] DNAcopy_1.24.0
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.1
>
>
> I am trying to plot some cgharray data (mouse agilent 244) and I need to
> make plots.
> I went for the dNAcopy package, following normalisation using limma, I
> am first creating a copy number array using CNA function from DNAcopy, I
> have used this script (based on the cghMCR package pdf
> instructions).Ma_betAr is my normalised data.
>  >chrom <- gsub("chr([0-9XY]+):.*", "\\1", Ma_BetAr$genes[,
> "SystematicName"])

probably 'sub' instead of 'gsub'

>  >dropMe <- c(which(!chrom %in% c(1:19, "X", "Y")),
> which(Ma_BetAr$genes[, "ControlType"] != 0))
>
>  >cna <- CNA(Ma_BetAr$M[-dropMe, ],
> gsub("chr([0-9XY]+):.*", "\\1", Ma_BetAr$genes[-dropMe, "SystematicName"]),
> as.numeric(gsub(".*:([0-9]+)-.*", "\\1",
> Ma_BetAr$genes[-dropMe, "SystematicName"])),
> data.type = "logratio", sampleid = colnames(Ma_BetAr$M))
>
>  >smooth.cna<-smooth.CNA(cna) smoothing the cna
> then this smoothed CNA is segment using the segment function
>  >segData <- segment(smooth.CNA(cna))
>
> then this segData is plotted using plot(segData, plot.type = "w") .
> At this point there is a plot(log2R in Yaxis and chromosome in X) which
> is created that will alternate the chromosomes alternate in different
> colours and are sorted in this order: 1 then 10 then 11, 12, 13 ,14 ,15,
> 16, 17 ,18 ,19 ,2 ,3 ,4,...
> Question1:I would rather have them in the natural order: 1,2,3,4.....Is
> there an easy way to change the plot order? I saw at the CNA step on the
> manual(DNAcopy) there is a mention of natural order but I don't quite
> understand what needs to be done:see below
> Usage
> CNA(genomdat, chrom, maploc, data.type=c("logratio","binary"),
> sampleid=NULL)
> ## S3 method for class 'CNA':
> print(x, ...)
> Arguments
> a vector or matrix of data from array-CGH, ROMA, or other copy number ex-
> genomdat
> periments. If it is a matrix the rows correspond to the markers and the
> columns
> to the samples.
> the chromosomes (or other group identifier) from which the markers came.
> Vec-
> chrom
> tor of length same as the number of rows of genomdat. If one wants the
> chro-
> mosomes to be ordered in the natural order, this variable should be
> numeric or
> ordered category.

In my example, I see that

   chrom <- rep(c("chr11", "chr2"), c(290,210))
   plot(segment(CNA(genomdat, chrom, maploc)), plot.type="whole")

gives me the wrong order (because "chr11" comes before "chr2" if I 
'sort' these).

I suspect the author of the package intended for something different, 
but if I create a map between my chromosome names and the order in which 
I'd like them to appear...

   map <- 1:24
   names(map) <- paste("chr", c(1:22, "X", "Y"), sep="")

and then recode my chromosome names to their integer representation

   chrom <- map[rep(c("chr11", "chr2"), c(290,210))]
   plot(segment(CNA(genomdat, chrom, maploc)), plot.type="whole")

everyone is happy. I think the author intended that it would possible to 
create a (ordered?) factor, where the levels indicate the sort order.

   chrom <- factor(rep(c("chr11", "chr2"), c(290,210)),
                   levels=paste("chr", c(1:22, "X", "Y"), sep=""))

but somewhere things are going awry.

> the locations of marker on the genome. Vector of length same as the
> number of
> maploc
> rows of genomdat. This has to be numeric.
> logratio (aCGH, ROMA, etc.) or binary (LOH).
> data.type
> sample identifier. If missing the samples are named by prefixing
> "Sample" to
> sampleid
> consecutive integers.
> object returned by CNA
> x
> arguments to be passed onto print command called within.
> ...
>
>
> Question2: Is it possible at this point to plot the probe using a colour
> that will correspond to their log2R, example if a probe log2R is more
> than 0.2 plot it in red using a script?

I don't think so using plot.DNAcopy. But in some ways a little 
creativity goes a long way (this is probably too much for an answer, but 
maybe inspires...)

   library(RColorBrewer)  # nice palettes; see http://colorbrewer2.org/
   library(lattice)
   pal <- brewer.pal(9, "OrRd")[-1]      # drop lightest color
   breaks <- with(seg$data, {
       seq(min(Sample.1), max(Sample.1), length.out=length(pal) + 1)
   })

   xyplot(Sample.1 ~ maploc | as.factor(chrom), seg$data,
          panel=function(x, y, ..., subscripts, segments, pal, breaks) {
              col <- pal[cut(y, breaks=breaks, labels=FALSE)]
              panel.xyplot(x, y, ..., col=col)
              currChrom <- unique(seg$data$chrom[subscripts])
              segs <- segments[as.factor(segments$chrom) == currChrom,,
                               drop=FALSE]
              for (i in seq_len(nrow(segs)))
                  with(segs[i,,drop=FALSE], {
                      llines(c(loc.start, loc.end), seg.mean,
                             col="black")
                  })
          },
          segments=seg$output,
          pch=20, pal=pal, breaks=breaks,
          layout=c(1, 2),
          strip=FALSE, strip.left=TRUE)

Martin

>
> Many thanks in advance for any help
> Nathalie
>
>
>
>
>
>

-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793