[BioC] 8. CGHCall problems (Bernard North [guest]) : Bioconductor Digest, Vol 126, Issue 7

Mon Sep 9 15:14:35 CEST 2013

Dear Bernard, regarding the last part of your question, it might help you to know about the packages CGHtest (http://www.few.vu.nl/~mavdwiel/CGHtest.html) and CNVtools (http://www.bioconductor.org/packages/release/bioc/html/CNVtools.html), both of which can be used to look at genetic association of calls.
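
To give a rough idea of what testing for recurrent gains involves, here is a minimal base-R sketch. It is not the method used by CGHtest or CNVtools. It assumes a calls matrix with features in rows, samples in columns and values -1/0/1 (loss/normal/gain); the matrix below is simulated, and the names calls_mat, gain_count, p_gain and q_gain are made up for illustration. It applies a naive per-feature binomial test against the overall gain rate, followed by a Benjamini-Hochberg adjustment. Neighbouring features are strongly correlated in real aCGH data, so such q-values would only be a first look.

```r
set.seed(42)
n_features <- 200
n_samples  <- 20

# Simulated calls: -1 = loss, 0 = normal, 1 = gain (assumed coding)
calls_mat <- matrix(
  sample(c(-1, 0, 1), n_features * n_samples, replace = TRUE,
         prob = c(0.1, 0.8, 0.1)),
  nrow = n_features,
  dimnames = list(paste0("feature", seq_len(n_features)),
                  paste0("sample", seq_len(n_samples)))
)
calls_mat[1:10, 1:15] <- 1   # plant a recurrent gain in 15 of 20 samples

# Number of samples with a gain at each feature
gain_count <- rowSums(calls_mat == 1)

# Background gain rate over the whole matrix
background <- mean(calls_mat == 1)

# P(X >= observed count) under Binomial(n_samples, background)
p_gain <- pbinom(gain_count - 1, n_samples, background, lower.tail = FALSE)

# Benjamini-Hochberg adjusted values
q_gain <- p.adjust(p_gain, method = "BH")

head(sort(q_gain))
```

The same steps with `calls_mat == -1` would give the corresponding values for losses.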

Date: Tue,  6 Aug 2013 09:12:05 -0700 (PDT)
From: "Bernard North [guest]" <guest at bioconductor.org>
To: bioconductor at r-project.org, b.v.north at qmul.ac.uk
Cc: CGHcall Maintainer <mark.vdwiel at vumc.nl>
Subject: [BioC] CGHCall problems
Message-ID: <20130806161205.C269B143590 at mamba.fhcrc.org>

Dear All,

I am using CGHcall to segment and call aCGH copy number data.
My understanding is that the segmentation step of CGHcall is the same CBS method used in DNAcopy.
CGHcall has a function called "calls" which has segments as rows (defined as start probe to end probe) and columns for each sample with the elements being calls.
Given that DNAcopy has a different segmentation for each sample how is the segmentation in allcalls decided upon ?
Calls is run as allcalls<-data.frame(calls(result)) where result is the final CGHCall object as per the vignette

Also does CGHcall provide pvalues or qvalues to test if any regions are recurrently amplified or deleted over samples ?

-----Original Message-----
From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces at r-project.org] On Behalf Of bioconductor-request at r-project.org
Sent: 07 August 2013 11:00
To: bioconductor at r-project.org
Subject: Bioconductor Digest, Vol 126, Issue 7

Send Bioconductor mailing list submissions to
        bioconductor at r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
        https://stat.ethz.ch/mailman/listinfo/bioconductor
or, via email, send a message with subject or body 'help' to
        bioconductor-request at r-project.org

You can reach the person managing the list at
        bioconductor-owner at r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Bioconductor digest..."

Today's Topics:

   1. Limma's roast() does not accept weights in combination with
      block (Gordon K Smyth)
   2. Re: XCMS query regarding mzXML (Reema Singh)
   3.  HTqPCR problem with ttestCtData function (Ruben Dries)
   4. Re: XCMS query regarding mzXML (Laurent Gatto)
   5. Re: request (Wolfgang Huber)
   6. Re: RNASeq:- getting Zero Count (Valerie Obenchain)
   7. Extracting overlapping gene names from a list of peaks
      (Patrick Schorderet)
   8. CGHCall problems (Bernard North [guest])
   9. Re: Extracting overlapping gene names from a list of peaks
      (James W. MacDonald)
  10. Re: request (Alexey Moskalev)
  11. Re: DNAStringSetList can't coerce a list? (Taylor, Sean D)
  12. fRMA package (Li Liu)
  13. Re: ggbio: Data stored twice in 'GGbio' object (Michael Lawrence)
  14. Re: fRMA package (Dan Tenenbaum)
  15. Re: request (Laurent Gatto)
  16. Re: request (Alexey Moskalev)
  17. Re: request (Wolfgang Huber)
  18. Re: request (Steve Lianoglou)
  19. Re: ggbio facet_gr example sought (Michael Lawrence)
  20. Re: ggbio: Data stored twice in 'GGbio' object (Julian Gehring)
  21. Re: fRMA package (Li Liu)
  22. Re: fRMA package (Dan Tenenbaum)
  23. Re: Extracting overlapping gene names from a list of peaks
      (Michael Lawrence)
  24. Re: fRMA package (Li Liu)
  25. Re: fRMA package (Dan Tenenbaum)
  26. Re: ggbio facet_gr example sought (Tengfei Yin)
  27. ??:  an error in AnnotationForge (joseph)
  28. beadarray library: perBeadFile (Nogales Vilardell)
  29. ??:  an error in AnnotationForge (joseph)
  30. topGO question (Datong Wang)
  31. Re: ggbio facet_gr example sought (Cook, Malcolm)
  32. basic query to make groups .. (ALok)

----------------------------------------------------------------------

Message: 1
Date: Tue, 6 Aug 2013 21:17:28 +1000 (AUS Eastern Standard Time)
From: Gordon K Smyth <smyth at wehi.EDU.AU>
To: ssehztirom at gmail.com
Cc: Bioconductor mailing list <bioconductor at r-project.org>
Subject: [BioC] Limma's roast() does not accept weights in
        combination with block
Message-ID: <Pine.WNT.4.64.1308062109130.7100 at PC975.wehi.edu.au>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

> Date: Mon,  5 Aug 2013 03:36:49 -0700 (PDT)
> From: "Moritz Hess [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, ssehztirom at gmail.com
> Subject: [BioC] Limma's roast() does not accept weights in
>       combination with block
>
> Dear All,
>
> I am conducting enrichment tests using Limma's roast() function.  As I
> am investigating RNA-Seq data, I have to introduce the weights
> calculated by voom(). Without a blocking variable, roast() runs without
> a hitch but when I introduce a blocking variable (with or without
> correlation within blocks), roast() halts and returns "Can't use block
> with weights". Is the combination of weights and blocking variables in
> roast() generally impossible

Not impossible, but requires some careful special case programming.

> and if not, will it be possible in upcoming releases of limma?

Yes, but not in the next few weeks.

Gordon

> Thank you very much in advance,
>
> Moritz
>
> -- output of sessionInfo():
>
> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
> [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
> [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=de_DE.UTF-8
> [7] LC_PAPER=C                 LC_NAME=C
> [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  splines   stats     graphics  grDevices utils     datasets
> [8] methods   base
>
> other attached packages:
> [1] DESeq_1.12.0       lattice_0.20-15    locfit_1.5-9.1     Biobase_2.20.0
> [5] BiocGenerics_0.6.0 statmod_1.4.17     edgeR_3.2.3        limma_3.16.5
>
> loaded via a namespace (and not attached):
> [1] annotate_1.38.0      AnnotationDbi_1.22.6 compiler_3.0.1
> [4] DBI_0.2-7            genefilter_1.42.0    geneplotter_1.38.0
> [7] grid_3.0.1           IRanges_1.18.1       RColorBrewer_1.0-5
> [10] RSQLite_0.11.4       stats4_3.0.1         survival_2.37-4
> [13] tools_3.0.1          XML_3.98-1.1         xtable_1.7-1
>
>
> --


------------------------------

Message: 2
Date: Tue, 6 Aug 2013 18:42:26 +0530
From: Reema Singh <reema28sep at gmail.com>
To: Laurent Gatto <lg390 at cam.ac.uk>
Cc: bioconductor <bioconductor at r-project.org>
Subject: Re: [BioC] XCMS query regarding mzXML
Message-ID:
        <CAEHmZ4tq0SXrhud-DW6nGMy-LOZaSf0c491U_oMQCM1Z2aqO4g at mail.gmail.com>
Content-Type: text/plain

Dear Laurent,

Thank you for your reply.

It's working fine with the "PAe000002_mzXML_201106211454.tar.gz" dataset
on my machine. But when I used "PAe000030_mzXML_201104131929.tar.gz",
I got an error. Here are the complete commands and sessionInfo.

library(xcms)

files <- list.files("PAe000030",recursive=TRUE,full.names=TRUE)
files
 [1] "PAe000030/hui_serum10_full.mzXML"
 [2] "PAe000030/hui_serum16_full.mzXML"
 [3] "PAe000030/hui_serum17_full.mzXML"
 [4] "PAe000030/hui_serum18_full.mzXML"
 .
> xr<-xcmsRaw(files[1])
Warning message:
In `profStep<-`(`*tmp*`, value = 1) :
  MS1 scans empty. Skipping profile matrix calculation.
> xr<-xcmsRaw(files)
Error in file(con, "rb") : invalid 'description' argument
In addition: Warning message:
In if (!file.exists(filename)) return(FALSE) :
  the condition has length > 1 and only the first element will be used

> xr <- xcmsSet(files)
Error in x[1]:x[2] : NA/NaN argument

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=C                LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] xcms_1.36.0        Biobase_2.20.1     BiocGenerics_0.6.0 mzR_1.6.2

[5] Rcpp_0.10.4

loaded via a namespace (and not attached):
[1] codetools_0.2-8

Kind Regards

On Tue, Aug 6, 2013 at 3:23 PM, Laurent Gatto <lg390 at cam.ac.uk> wrote:

> Dear Reema,
>
> On 6 August 2013 10:22, Reema Singh <reema28sep at gmail.com> wrote:
> > Dear All,
> >
> > I am trying to import .mzXML files using the XCMS package. I have tried it
> > with two different data sets
> > (ftp://ftp.peptideatlas.org/pub/PeptideAtlas/Repository/PAe000030 and
> > ftp://ftp.peptideatlas.org/pub/PeptideAtlas/Repository/PAe000002).
> > After extracting the .mzXML files, when I tried to import them using XCMS, I
> > got this output.
> > *PAe000002*
> > files <- list.files("TEST", recursive=TRUE,full.names=TRUE)
> >> xr<-xcmsRaw(files[1])
> >> xr
> > An "xcmsRaw" object with 2070 mass spectra
> >
> > Time range: 120-5879.1 seconds (2-98 minutes)
> > Mass range: 400.0667-1399.9995 m/z
> > Intensity range: 1-465033000
> >
> > MSn data on  0  mass(es)
> > with  0  MSn spectra
> > Profile method: bin
> > Profile step: 1 m/z (1001 grid points from 400 to 1400 m/z)
> >
> > Memory usage: 34.4 MB
> >
> > *PAe000030*
> >
> >> files1 <- list.files("TEST1", recursive=TRUE,full.names=TRUE)
> >> xr1<-xcmsRaw(files1[1])
> > Warning message:
> > In `profStep<-`(`*tmp*`, value = 1) :
> >   MS1 scans empty. Skipping profile matrix calculation.
> >> xr1
> > An "xcmsRaw" object with 0 mass spectra
> >
> > MSn data on  0  mass(es)
> > with  0  MSn spectra
> > Profile method: bin
> > Profile step: no profile data
> >
> > Memory usage: 0.00481 MB
> >>
> >
> > Now my question is: why is one dataset imported successfully, whereas the
> > other gives warnings and an object with zero mass spectra?
> >
> > I would appreciate any help.
>
> First, we do not know exactly what files you are using for your test.
> Reading all of the PAe000030 mzXML files works well on my computer,
> indicating that it is likely not a mzXML issue as such.
>
> As the warning message suggests, the MS1 scans of that particular
> mzXML file are empty, which terminates the processing. Have you had
> more luck with another file from that experiment? You might want to
> check the offending mzXML file - it might indeed be valid yet 'empty'.
>
> Hope this helps,
>
> Laurent
>
> > Kind Regards
> >
> >
> > --
> > Reema Singh
> > PhD Scholar
> > Computational Biology and Bioinformatics
> > School of Computational and Integrative Sciences
> > Jawaharlal Nehru University
> > New Delhi-110067
> > INDIA
> >
> >         [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> Laurent Gatto
> - http://proteome.sysbiol.cam.ac.uk/lgatto/
> Cambridge Centre for Proteomics
> - http://www.bio.cam.ac.uk/proteomics
> Using R/Bioconductor for proteomics data analysis
> - http://lgatto.github.io/RforProteomics/
>

--
Reema Singh
PhD Scholar
Computational Biology and Bioinformatics
School of Computational and Integrative Sciences
Jawaharlal Nehru University
New Delhi-110067
INDIA

------------------------------

Message: 3
Date: Tue, 6 Aug 2013 15:49:22 +0200
From: Ruben Dries <rubendries at gmail.com>
To: bioconductor at r-project.org
Subject: [BioC]  HTqPCR problem with ttestCtData function
Message-ID: <478085E4-536C-411A-A114-F55FA895A72E at gmail.com>
Content-Type: text/plain

Dear,

I'm having a problem with the ttestCtData function from the HTqPCR package, which I use to analyze my BioMark Fluidigm data.

In most cases there is no problem:

> qDE.ttest.Nanog <- ttestCtData(q.norm[,c(1:9)], groups = conditions[c(1:9)], calibrator = "R-L_KD", stringent = FALSE)
> head(qDE.ttest.Nanog, n=2)
     genes feature.pos     t.test      p.value adj.p.value       ddCt        FC meanCalibrator meanTarget categoryCalibrator categoryTarget
40   Nanog   feature30 -16.290155 2.477252e-05 0.002204754  1.2277206 0.4269915       15.50882   16.73654                 OK             OK
84 Zcchc12   feature71   6.271628 4.370851e-04 0.019450286 -0.5513761 1.4654829       16.03707   15.48569                 OK             OK

However sometimes I get this error

> write.table(qDE.ttest.Nanog, file = "/Users/ruben/Dropbox/Data/qPCR/results/BioMark/Fluidigm4/ND2_fluid4/Ttest/ND2_ttest_Nanog.txt")
> qDE.ttest.Rest <- ttestCtData(q.norm[,c(1:5,10:13)], groups = conditions[c(1:5,10:13)], calibrator = "R-L_KD", stringent = FALSE)
Error in t.test.default(x[, g1], x[, g2], alternative = alternative, paired = paired,  :
  data are essentially constant

However when I change the normalization from quantile normalization (q.norm) to for example norm.rankinvariant (nr.norm) this error doesn't occur anymore.

All of the samples are different, I compare 4 biological target replicates to 4 from the control. Could it be due to the quantile normalization? And would it be ok if I used the norm.rankinvariant normalization if I encounter this error?

Best regards,

Ruben
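
One possible explanation, sketched in base R under stated assumptions: quantile normalization gives every sample the same set of values, so a feature that holds the same rank in every sample ends up with exactly the same value everywhere, and t.test() then stops with this error. The quantile_normalize() function below is a hand-rolled illustration, not HTqPCR's implementation, and the Ct values are simulated.

```r
# Hand-rolled quantile normalization (illustrative only)
quantile_normalize <- function(m) {
  ranks  <- apply(m, 2, rank, ties.method = "first")
  target <- rowMeans(apply(m, 2, sort))        # mean of each sorted position
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(m)
  out
}

set.seed(1)
ct <- matrix(rnorm(5 * 8, mean = 20, sd = 2), nrow = 5,
             dimnames = list(paste0("gene", 1:5), paste0("s", 1:8)))
ct["gene1", ] <- 40   # e.g. a feature with the highest Ct in every sample

qn <- quantile_normalize(ct)
qn["gene1", ]         # identical value in all eight samples

# Comparing samples 1-4 with samples 5-8 for that feature
msg <- tryCatch({
  t.test(qn["gene1", 1:4], qn["gene1", 5:8])
  "no error"
}, error = function(e) conditionMessage(e))
msg
```

If this is what happens in the real data, the affected features would be those with zero variance across the compared samples after normalization, which can be checked before calling ttestCtData.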

------------------------------

Message: 4
Date: Tue, 6 Aug 2013 15:28:17 +0100
From: Laurent Gatto <lg390 at cam.ac.uk>
To: Reema Singh <reema28sep at gmail.com>
Cc: bioconductor <bioconductor at r-project.org>
Subject: Re: [BioC] XCMS query regarding mzXML
Message-ID:
        <CA+uNOzhpp0347VKi1u+0ymZBUnStKnyt=nyrQKfM+7-6Tddg3w at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On 6 August 2013 14:12, Reema Singh <reema28sep at gmail.com> wrote:
> Dear Laurent,
>
> Thank you for your reply.
>
> It's working fine with the "PAe000002_mzXML_201106211454.tar.gz" dataset on
> my machine. But when I used "PAe000030_mzXML_201104131929.tar.gz", I
> got an error. Here are the complete commands and sessionInfo.
>
> library(xcms)
>
> files <- list.files("PAe000030",recursive=TRUE,full.names=TRUE)
> files
>  [1] "PAe000030/hui_serum10_full.mzXML"
>  [2] "PAe000030/hui_serum16_full.mzXML"
>  [3] "PAe000030/hui_serum17_full.mzXML"
>  [4] "PAe000030/hui_serum18_full.mzXML"
>  .
>> xr<-xcmsRaw(files[1])
> Warning message:
> In `profStep<-`(`*tmp*`, value = 1) :
>   MS1 scans empty. Skipping profile matrix calculation.

Investigating the content of the files sheds some light on the source
of the error. None of the 64 files has any MS1 spectra - they only
contain MS2 spectra. The observed result seems thus to be correct.

>> xr<-xcmsRaw(files)
> Error in file(con, "rb") : invalid 'description' argument
> In addition: Warning message:
> In if (!file.exists(filename)) return(FALSE) :
>   the condition has length > 1 and only the first element will be used

Based on ?xcmsRaw, this is not supposed to work - xcmsRaw takes a
single file as input, as the warning suggests.
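
Since xcmsRaw reads one file per call, a vector of paths has to be looped over. A minimal sketch: read_each() is a hypothetical helper (not part of xcms), and a stand-in reader is used here so that the snippet runs without any mzXML files.

```r
# Apply a single-file reader to each path, keeping errors per file
read_each <- function(files, reader) {
  stopifnot(is.character(files), is.function(reader))
  out <- lapply(files, function(f) {
    tryCatch(reader(f), error = function(e) e)
  })
  names(out) <- basename(files)
  out
}

# Stand-in reader (nchar) just to show the shape of the result
demo <- read_each(c("PAe000030/a.mzXML", "PAe000030/b.mzXML"), nchar)
str(demo)

# With xcms installed and real files on disk, the same pattern would be:
# raws <- read_each(files, xcms::xcmsRaw)
```

Catching errors per file means one unreadable file does not abort the whole loop.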

>> xr <- xcmsSet(files)
> Error in x[1]:x[2] : NA/NaN argument

Same explanation as above, I assume.

Hope this helps.

Best wishes,

Laurent

>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>  [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
>  [7] LC_PAPER=C                LC_NAME=C
>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] xcms_1.36.0        Biobase_2.20.1     BiocGenerics_0.6.0 mzR_1.6.2
> [5] Rcpp_0.10.4
>
> loaded via a namespace (and not attached):
> [1] codetools_0.2-8
>
> Kind Regards
>
>
>
> On Tue, Aug 6, 2013 at 3:23 PM, Laurent Gatto <lg390 at cam.ac.uk> wrote:
>>
>> Dear Reema,
>>
>> On 6 August 2013 10:22, Reema Singh <reema28sep at gmail.com> wrote:
>> > Dear All,
>> >
>> > I am trying to import .mzXML files using the XCMS package. I have tried it
>> > with two different data sets
>> > (ftp://ftp.peptideatlas.org/pub/PeptideAtlas/Repository/PAe000030 and
>> > ftp://ftp.peptideatlas.org/pub/PeptideAtlas/Repository/PAe000002).
>> > After extracting the .mzXML files, when I tried to import them using XCMS,
>> > I got this output.
>> > *PAe000002*
>> > files <- list.files("TEST", recursive=TRUE,full.names=TRUE)
>> >> xr<-xcmsRaw(files[1])
>> >> xr
>> > An "xcmsRaw" object with 2070 mass spectra
>> >
>> > Time range: 120-5879.1 seconds (2-98 minutes)
>> > Mass range: 400.0667-1399.9995 m/z
>> > Intensity range: 1-465033000
>> >
>> > MSn data on  0  mass(es)
>> > with  0  MSn spectra
>> > Profile method: bin
>> > Profile step: 1 m/z (1001 grid points from 400 to 1400 m/z)
>> >
>> > Memory usage: 34.4 MB
>> >
>> > *PAe000030*
>> >
>> >> files1 <- list.files("TEST1", recursive=TRUE,full.names=TRUE)
>> >> xr1<-xcmsRaw(files1[1])
>> > Warning message:
>> > In `profStep<-`(`*tmp*`, value = 1) :
>> >   MS1 scans empty. Skipping profile matrix calculation.
>> >> xr1
>> > An "xcmsRaw" object with 0 mass spectra
>> >
>> > MSn data on  0  mass(es)
>> > with  0  MSn spectra
>> > Profile method: bin
>> > Profile step: no profile data
>> >
>> > Memory usage: 0.00481 MB
>> >>
>> >
>> > Now my question is: why is one dataset imported successfully, whereas the
>> > other gives warnings and an object with zero mass spectra?
>> >
>> > I would appreciate any help.
>>
>> First, we do not know exactly what files you are using for your test.
>> Reading all of the PAe000030 mzXML files works well on my computer,
>> indicating that it is likely not a mzXML issue as such.
>>
>> As the warning message suggests, the MS1 scans of that particular
>> mzXML file are empty, which terminates the processing. Have you had
>> more luck with another file from that experiment? You might want to
>> check the offending mzXML file - it might indeed be valid yet 'empty'.
>>
>> Hope this helps,
>>
>> Laurent
>>
>> > Kind Regards
>> >
>> >
>> > --
>> > Reema Singh
>> > PhD Scholar
>> > Computational Biology and Bioinformatics
>> > School of Computational and Integrative Sciences
>> > Jawaharlal Nehru University
>> > New Delhi-110067
>> > INDIA
>> >
>> >         [[alternative HTML version deleted]]
>> >
>>
>>
>> --
>> Laurent Gatto
>> - http://proteome.sysbiol.cam.ac.uk/lgatto/
>> Cambridge Centre for Proteomics
>> - http://www.bio.cam.ac.uk/proteomics
>> Using R/Bioconductor for proteomics data analysis
>> - http://lgatto.github.io/RforProteomics/
>
>
>
>
> --
> Reema Singh
> PhD Scholar
> Computational Biology and Bioinformatics
> School of Computational and Integrative Sciences
> Jawaharlal Nehru University
> New Delhi-110067
> INDIA

--
Laurent Gatto
- http://proteome.sysbiol.cam.ac.uk/lgatto/
Cambridge Centre for Proteomics
- http://www.bio.cam.ac.uk/proteomics
Using R/Bioconductor for proteomics data analysis
- http://lgatto.github.io/RforProteomics/

------------------------------

Message: 5
Date: Tue, 6 Aug 2013 17:32:18 +0200
From: Wolfgang Huber <whuber at embl.de>
To: Alexey Moskalev <amoskalev at list.ru>
Cc: bioconductor at r-project.org
Subject: Re: [BioC] request
Message-ID: <D2F657C9-BC8F-4CCE-A176-755C7EADA9B9 at embl.de>
Content-Type: text/plain; charset=us-ascii

Dear Alexey
please type "plotPCA" into the R command line to see how the function computes the PCA, then have a look at the manual page of the functions "prcomp" and "screeplot".
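
The proportion of variance can be read off a prcomp fit directly. A minimal sketch, assuming a matrix of variance-stabilized values with genes in rows and samples in columns; the matrix vsd_mat below is simulated, and the selection of the most variable genes via the hypothetical ntop is an assumption about typical usage, not a statement of what plotPCA does internally.

```r
set.seed(1)
# Simulated stand-in for variance-stabilized data: 1000 genes x 6 samples
vsd_mat <- matrix(rnorm(1000 * 6), nrow = 1000,
                  dimnames = list(paste0("gene", 1:1000),
                                  paste0("sample", 1:6)))

# Keep the most variable genes (ntop is an illustrative choice)
ntop    <- 500
row_var <- apply(vsd_mat, 1, var)
select  <- order(row_var, decreasing = TRUE)[seq_len(min(ntop, length(row_var)))]

# PCA with samples as observations
pca <- prcomp(t(vsd_mat[select, ]))

# Proportion of variance explained by each component
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
round(prop_var[1:2], 3)

# The same numbers as reported by summary()
summary(pca)$importance["Proportion of Variance", 1:2]
```

Calling screeplot(pca) on the same object plots the component variances.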

@all: I am not sure what a good user interface would be for modifying the "plotPCA" function so that it can return the 'pca' object for user inspection (as desired by Alexey); currently it returns the 'trellis' object.

Best wishes
        Wolfgang

On 6 Aug 2013, at 08:54, Alexey Moskalev <amoskalev at list.ru> wrote:

> I am using the DESeq package to produce a principal components biplot on variance-stabilized data for my RNA-Seq data. Could you advise me how to find the proportion of variance for the first and second principal components using DESeq?

------------------------------

Message: 6
Date: Tue, 06 Aug 2013 08:43:43 -0700
From: Valerie Obenchain <vobencha at fhcrc.org>
To: Reema Singh <reema28sep at gmail.com>
Cc: bioconductor <bioconductor at r-project.org>
Subject: Re: [BioC] RNASeq:- getting Zero Count
Message-ID: <520119AF.1070201 at fhcrc.org>
Content-Type: text/plain; charset=windows-1252; format=flowed

I would do some investigating with a single bam file.

Confirm 'gnModel' and 'aln' have some common seqlevels. This call should
produce a result.

     intersect(seqlevels(aln), seqlevels(gnModel))

Call countOverlaps on a single file.

     co <- countOverlaps(aln, gnModel)

Evidently you only want the bam records that have exactly 5 hits. This
could be limiting. To see the distribution of hits make a table of the
counts.

     table(co)
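
The checks above can be tried on a toy example. This is a sketch with made-up ranges, not the poster's data: the objects gn and aln stand in for 'gnModel' and 'aln', and the two sequence names are the ones reported in the warnings further down. It assumes GenomicRanges is installed and that renameSeqlevels() accepts a named vector mapping old names to new ones.

```r
library(GenomicRanges)

# Toy gene model on "Chromosome" and toy alignments on the FASTA-style name
gn <- GRanges("Chromosome",
              IRanges(start = c(1, 2000), end = c(1500, 3000)))
names(gn) <- c("geneA", "geneB")
aln <- GRanges("gi|448814763|ref|NC_000962.3|",
               IRanges(start = c(100, 200, 2500), width = 50))

# No common seqlevels, so every count is zero
intersect(seqlevels(aln), seqlevels(gn))
before <- suppressWarnings(countOverlaps(gn, aln))

# Rename the alignment seqlevel to match the gene model
aln <- renameSeqlevels(aln, c("gi|448814763|ref|NC_000962.3|" = "Chromosome"))

# Distribution of hits per alignment, then counts per gene
co <- countOverlaps(aln, gn)
table(co)
after <- countOverlaps(gn, aln[co == 1])   # keep reads hitting exactly one gene
after
```

Filtering on `co == 1` rather than `co == 5` is an assumption about what was intended; the table of hits shows which values actually occur.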

Valerie

On 08/05/2013 10:05 PM, Reema Singh wrote:
> Hi Valerie,
>
> Thank you so much for the reply.
>
> After checking the seqlevels, I am able to get rid of the error, but
> I am still getting the zero count entries. Is there any other way of doing
> this?
>
> Kind Regards
>
>
> On Mon, Aug 5, 2013 at 9:21 PM, Valerie Obenchain <vobencha at fhcrc.org
> <mailto:vobencha at fhcrc.org>> wrote:
>
>     Hi Reema,
>
>     To perform overlap or matching operations the seqlevels (chromosome
>     names) of the objects must match. The error message is telling you
>     that some of these do not match. It's reasonable that a few names
>     may not match (maybe a chromosome is present in one object and not
>     the other) but the majority should.
>
>     Check the seqlevels:
>     seqlevels(aln)
>     seqlevels(gnModel)
>
>     Which names are common to both:
>     intersect(seqlevels(gnModel), seqlevels(aln))
>
>       You can rename seqlevels in several different ways. See
>     ?renameSeqlevels or ?seqlevels for examples.
>
>     Valerie
>
>
>     On 08/04/2013 06:35 AM, Reema Singh wrote:
>
>         Dear All,
>
>         I am trying to extract the read count from three .bam files. But
>         I am
>         getting Zero count entries.
>
>         I am using the Mycobacterium tuberculosis H37Rv gtf file (
>         ftp://ftp.ensemblgenomes.org/pub/release-19/bacteria//gtf/bacteria_1_collection/mycobacterium_tuberculosis_h37rv/Mycobacterium_tuberculosis_h37rv.GCA_000277735.1.19.gtf.gz)
>         and the RNASeq data used here were downloaded from (
>         http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE40846)
>         and aligned with bowtie2.
>
>
>         library(GenomicFeatures)
>         txdb <-
>         makeTranscriptDbFromGFF(file="Mycobacterium_tuberculosis_h37rv.GCA_000277735.1.19.gtf",format="gtf")
>         saveDb(txdb,file="MycoTubeH37Rv.sqlite")
>         load("MycoTubeH37Rv.sqlite")
>         gnModel <- exonsBy(txdb,"gene") ### also tried with "transcripts", "cds", but getting same
>
>
>         bamFiles <- list.files(".", "bam$", full=TRUE)
>         names(bamFiles) <- sub("\\..*","",basename(bamFiles))
>         counter <- function(fl, gnModel){
>         aln <- GenomicRanges::readGappedAlignments(fl)
>         strand(aln)
>         hits <- countOverlaps(aln,gnModel)
>         counts <- countOverlaps(gnModel,aln[hits==5])
>         names(counts) <- names(gnModel)
>         counts
>         }
>
>         counts <- sapply(bamFiles,counter,gnModel)
>
>         Note: method with signature 'Vector#GRangesList' chosen for function
>         'countOverlaps',
>            target signature 'GappedAlignments#GRangesList'.
>            "GappedAlignments#Vector" would also be valid
>         Note: method with signature 'GRangesList#Vector' chosen for function
>         'countOverlaps',
>            target signature 'GRangesList#GappedAlignments'.
>            "Vector#GappedAlignments" would also be valid
>         Warning messages:
>         1: In .Seqinfo.mergexy(x, y) :
>             Each of the 2 combined objects has sequence levels not in
>         the other:
>             - in 'x': gi|448814763|ref|NC_000962.3|
>             - in 'y': Chromosome
>             Make sure to always combine/compare objects based on the
>         same reference
>             genome (use suppressWarnings() to suppress this warning).
>         2: In .Seqinfo.mergexy(x, y) :
>             Each of the 2 combined objects has sequence levels not in
>         the other:
>             - in 'x': Chromosome
>             - in 'y': gi|448814763|ref|NC_000962.3|
>             Make sure to always combine/compare objects based on the
>         same reference
>             genome (use suppressWarnings() to suppress this warning).
>         3: In .Seqinfo.mergexy(x, y) :
>             Each of the 2 combined objects has sequence levels not in
>         the other:
>             - in 'x': gi|448814763|ref|NC_000962.3|
>             - in 'y': Chromosome
>             Make sure to always combine/compare objects based on the
>         same reference
>             genome (use suppressWarnings() to suppress this warning).
>         4: In .Seqinfo.mergexy(x, y) :
>             Each of the 2 combined objects has sequence levels not in
>         the other:
>             - in 'x': Chromosome
>             - in 'y': gi|448814763|ref|NC_000962.3|
>             Make sure to always combine/compare objects based on the
>         same reference
>             genome (use suppressWarnings() to suppress this warning).
>         5: In .Seqinfo.mergexy(x, y) :
>             Each of the 2 combined objects has sequence levels not in
>         the other:
>             - in 'x': gi|448814763|ref|NC_000962.3|
>             - in 'y': Chromosome
>             Make sure to always combine/compare objects based on the
>         same reference
>             genome (use suppressWarnings() to suppress this warning).
>         6: In .Seqinfo.mergexy(x, y) :
>             Each of the 2 combined objects has sequence levels not in
>         the other:
>             - in 'x': Chromosome
>             - in 'y': gi|448814763|ref|NC_000962.3|
>             Make sure to always combine/compare objects based on the
>         same reference
>             genome (use suppressWarnings() to suppress this warning).
>
>         head(counts)
>
>                     SRR568038 SRR568039 SRR568040
>         RVBD_0001         0         0         0
>         RVBD_0002         0         0         0
>         RVBD_0003         0         0         0
>         RVBD_0004         0         0         0
>         RVBD_0005         0         0         0
>         RVBD_0006         0         0         0
>
>             sessionInfo()
>
>         R version 3.0.1 (2013-05-16)
>         Platform: x86_64-redhat-linux-gnu (64-bit)
>
>         locale:
>            [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>            [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>            [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8
>            [7] LC_PAPER=C                LC_NAME=C
>            [9] LC_ADDRESS=C              LC_TELEPHONE=C
>         [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
>         attached base packages:
>         [1] parallel  stats     graphics  grDevices utils     datasets
>           methods
>         [8] base
>
>         other attached packages:
>         [1] Rsamtools_1.12.3       Biostrings_2.28.0
>           GenomicFeatures_1.12.3
>         [4] AnnotationDbi_1.22.6   Biobase_2.20.1
>         GenomicRanges_1.12.4
>         [7] IRanges_1.18.2         BiocGenerics_0.6.0
>
>         loaded via a namespace (and not attached):
>            [1] biomaRt_2.16.0     bitops_1.0-5       BSgenome_1.28.0
>           DBI_0.2-6
>
>            [5] RCurl_1.95-4.1     RSQLite_0.11.3     rtracklayer_1.20.4
>         stats4_3.0.1
>
>            [9] tools_3.0.1        XML_3.96-1.1       zlibbioc_1.6.0
>
>
>         Kind regards
>
>
>
>
>
>
>
>
> --
> Reema Singh
> PhD Scholar
> Computational Biology and Bioinformatics
> School of Computational and Integrative Sciences
> Jawaharlal Nehru University
> New Delhi-110067
> INDIA

------------------------------

Message: 7
Date: Tue, 6 Aug 2013 11:48:57 -0400
From: Patrick Schorderet <patrick.schorderet at molbio.mgh.harvard.edu>
To: bioconductor at r-project.org
Subject: [BioC] Extracting overlapping gene names from a list of peaks
Message-ID:
        <56CB5B3C-0D88-4A16-8D4D-3A3B9F0ED3FD at molbio.mgh.harvard.edu>
Content-Type: text/plain

Dear all,

I have a list of peaks from ChIP-seq experiments. Now I am trying to find which genes these peaks overlap (and extract the gene names).
I'm sure this should be pretty easy, but I am just starting with bioconductor, so some concepts are still vague.
Here's what I did so far:

# Not run because it is installed (as well as other packages)
# source("http://bioconductor.org/biocLite.R")
# biocLite("TxDb.Dmelanogaster.UCSC.dm3.ensGene")
# Load the Dmelanogaster genome
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene, quietly = TRUE)
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
ee <- exonsBy(txdb, "gene")

# Load a subset of my peaks for the sake of the example
finalPeaks <- rbind(c("chr3R", "2788500", "2842850", "2815675", "54350"), (c("chr3R", "12484350", "12661350", "12572850", "177000")))
rownames(finalPeaks) <- c("Peak 1", "Peak 2")
colnames(finalPeaks) <- c("chr", "start", "end", "center", "length")

# Create a GRanges object with the file I have
# finalPeaks is a matrix with rows being individual peaks and columns <- c()
GRfinalPeaks <- GRanges(finalPeaks[,1], IRanges(start = as.numeric(finalPeaks[,2]), end = as.numeric(finalPeaks[,3])))

I'm stuck from here. What I'd like is to get, for each peak, the overlapping genes.
For example, an output that would be:

Peak1: GeneA, GeneB, GeneC, etc
Peak2: GeneD

Also, I don't know how simple it is because (as specified in the output example) one peak can overlap several genes or none at all..
Thanks for any help,

Patrick

------------------------------

Message: 8
Date: Tue,  6 Aug 2013 09:12:05 -0700 (PDT)
From: "Bernard North [guest]" <guest at bioconductor.org>
To: bioconductor at r-project.org, b.v.north at qmul.ac.uk
Cc: CGHcall Maintainer <mark.vdwiel at vumc.nl>
Subject: [BioC] CGHCall problems
Message-ID: <20130806161205.C269B143590 at mamba.fhcrc.org>

Dear All,

I am using CGHcall to segment and call aCGH copy number data.
My understanding is that the segmentation step of CGHcall is the same CBS method used in DNAcopy.
CGHcall has a function called "calls" whose output has segments as rows (each defined from start probe to end probe) and one column per sample, with the elements being the calls.
Given that DNAcopy produces a different segmentation for each sample, how is the segmentation in allcalls decided upon?
calls is run as allcalls <- data.frame(calls(result)), where result is the final CGHcall object as per the vignette.

Also, does CGHcall provide p-values or q-values to test whether any regions are recurrently amplified or deleted across samples?

 -- output of sessionInfo():

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] CGHcall_2.18.0     snowfall_1.84-4    snow_0.3-12        CGHbase_1.18.0
 [5] marray_1.36.0      impute_1.32.0      GEOquery_2.24.1    Biobase_2.18.0
 [9] BiocGenerics_0.4.0 snapCGH_1.28.0     limma_3.14.4       DNAcopy_1.32.0

loaded via a namespace (and not attached):
 [1] aCGH_1.36.0           affy_1.36.1           affyio_1.26.0
 [4] annotate_1.36.0       AnnotationDbi_1.20.7  BiocInstaller_1.8.3
 [7] cluster_1.14.3        DBI_0.2-7             genefilter_1.40.0
[10] GLAD_2.20.0           grid_2.15.2           IRanges_1.16.6
[13] lattice_0.20-10       MASS_7.3-22           multtest_2.14.0
[16] parallel_2.15.2       pixmap_0.4-11         preprocessCore_1.20.0
[19] RColorBrewer_1.0-5    RCurl_1.95-4.1        RSQLite_0.11.4
[22] splines_2.15.2        stats4_2.15.2         strucchange_1.4-7
[25] survival_2.36-14      tilingArray_1.36.0    vsn_3.26.0
[28] XML_3.98-1.1          xtable_1.7-1          zlibbioc_1.4.0
>

--
Sent via the guest posting facility at bioconductor.org.

------------------------------

Message: 9
Date: Tue, 06 Aug 2013 12:14:49 -0400
From: "James W. MacDonald" <jmacdon at uw.edu>
To: Patrick Schorderet <patrick.schorderet at molbio.mgh.harvard.edu>
Cc: bioconductor at r-project.org
Subject: Re: [BioC] Extracting overlapping gene names from a list of
        peaks
Message-ID: <520120F9.50600 at uw.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi Patrick,

On 8/6/2013 11:48 AM, Patrick Schorderet wrote:
> Dear all,
>
> I have a list of peaks from ChIPseq experiments. Now I am trying to find to over which genes these peaks overlap (and extract the gene name).
> I'm sure this should be pretty easy, but I am just starting with bioconductor, so some concepts are still vague.
> Here's what I did so far:
>
> # Not run because it is installed (as well as other packages)
> # source("http://bioconductor.org/biocLite.R")
> # biocLite("TxDb.Dmelanogaster.UCSC.dm3.ensGene")
> # Load the Dmelanogaster genome
> library(TxDb.Dmelanogaster.UCSC.dm3.ensGene, quietly = TRUE)
> txdb<- TxDb.Dmelanogaster.UCSC.dm3.ensGene
> ee<- exonsBy(txdb, "gene")
>
> # Load an subset of my peaks for the sake of the example
> finalPeaks<- rbind(c("chr3R", "2788500", "2842850", "2815675", "54350"), (c("chr3R", "12484350", "12661350", "12572850", "177000")))
> rownames(finalPeaks)<- c("Peak 1", "Peak 2")
> colnames(finalPeaks)<- c("chr", "start", "end", "center", "length")
>
> # Create a GRanges object with the file I have
> # finalPeaks is a matrix with rows being individual peaks and columns<- c()
> GRfinalPeaks<- GRanges(finalPeaks[,1], IRanges(start = as.numeric(finalPeaks[,2]), end = as.numeric(finalPeaks[,3])))
>
> I'm stuck from here. What I'd like is to get, for each peak, the overlapping genes.
> For example, an output that would be:
>
> Peak1: GeneA, GeneB, GeneC, etc
> Peak2: GeneD
>
> Also, I don't know how simple it is because (as specified in the output example) one peak can overlap several genes or none at all..
> Thanks for any help,

 > sapply(1:2, function(x) names(ee[ee %over% GRfinalPeaks[x,],]))
[[1]]
[1] "FBgn0260642"

[[2]]
[1] "FBgn0000014" "FBgn0003944" "FBgn0015230" "FBgn0020556" "FBgn0051498"
[6] "FBgn0063261" "FBgn0084245" "FBgn0084688" "FBgn0085056"

You can then coerce that to whatever form you like.
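
As a rough sketch of that coercion step (an editorial illustration, not part of the original reply), the list returned by sapply() can be turned into the "Peak: genes" lines asked for above, or into a long table. The gene IDs are copied from the output above; the object names hits, peak_lines and hits_df are made up for the example, and the peak names are attached by hand.

```r
# Gene IDs per peak, copied from the sapply() output above.
hits <- list(
  "FBgn0260642",
  c("FBgn0000014", "FBgn0003944", "FBgn0015230", "FBgn0020556", "FBgn0051498",
    "FBgn0063261", "FBgn0084245", "FBgn0084688", "FBgn0085056")
)
names(hits) <- c("Peak 1", "Peak 2")

# One text line per peak; a peak without any overlapping gene is labelled "none".
genes_per_peak <- vapply(
  hits,
  function(g) if (length(g) > 0) paste(g, collapse = ", ") else "none",
  character(1)
)
peak_lines <- paste0(names(hits), ": ", genes_per_peak)
writeLines(peak_lines)

# Long format: one row per peak/gene pair, convenient for merging with annotation.
hits_df <- data.frame(
  peak = rep(names(hits), lengths(hits)),
  gene = unlist(hits, use.names = FALSE),
  stringsAsFactors = FALSE
)
```

The long format drops peaks with no overlapping gene, so keep the list itself if those peaks matter downstream.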

Best,

Jim

>
> Patrick
>       [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099

------------------------------

Message: 10
Date: Tue, 06 Aug 2013 20:35:20 +0400
From: Alexey Moskalev <amoskalev at list.ru>
To: Wolfgang Huber <whuber at embl.de>
Cc: bioconductor at r-project.org
Subject: Re: [BioC] request
Message-ID: <1375806920.474023715 at f117.i.mail.ru>
Content-Type: text/plain

Dear Wolfgang!
Great, it works!
Thank you so much!
Alex

Tuesday, 6 August 2013, 17:32 +02:00 from Wolfgang Huber <whuber at embl.de>:
>Dear Alexey
>please type "plotPCA" into the R command line to see how the function computes the PCA, then have a look at the manual page of the functions "prcomp" and "screeplot".
>
>@all: I am not sure what a good user interface would be for modifying the "plotPCA" function so that it can return the 'pca' object for user inspection (such as desired by Alexey); currently it returns the 'trellis' object as its return value.
>
>Best wishes
>Wolfgang
>
>
>
>On 6 Aug 2013, at 08:54, Alexey Moskalev < amoskalev at list.ru > wrote:
>
>> I am using the DESeq package to produce a principal components biplot on variance-stabilized data for my RNA-seq data. I was wondering if you could advise me how to find the proportion of variance for the first and second principal components using DESeq?

Sincerely,
Dr. Alexey Moskalev

Head of the Laboratory of Molecular Radiobiology and Gerontology
Institute of Biology, Komi Science Center of RAS,
Kommunisticheskaya St.28
167982, Syktyvkar
Russia

Blog: http://aging-genes.livejournal.com/

------------------------------

Message: 11
Date: Tue, 6 Aug 2013 17:12:12 +0000
From: "Taylor, Sean D" <sdtaylor at fhcrc.org>
To: "bioconductor at r-project.org" <bioconductor at r-project.org>
Subject: Re: [BioC] DNAStringSetList can't coerce a list?
Message-ID: <83AF5F78BA8BF748B81D11EF2FE1C46D05BED6 at adama.fhcrc.org>
Content-Type: text/plain

Never mind. I restarted my R session and that seems to have helped.

From: Taylor, Sean D
Sent: Monday, August 05, 2013 4:59 PM
To: bioconductor at r-project.org
Cc: Pages, Herve (hpages at fhcrc.org)
Subject: RE: DNAStringSetList can't coerce a list?

Sorry, here is my sessionInfo()
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] XVector_0.1.0         BiocInstaller_1.11.4  ShortRead_1.19.10
 [4] latticeExtra_0.6-24   RColorBrewer_1.0-5    lattice_0.20-15
 [7] Rsamtools_1.13.27     Biostrings_2.29.14    GenomicRanges_1.13.35
[10] IRanges_1.19.20       BiocGenerics_0.7.3    magicaxis_1.5

loaded via a namespace (and not attached):
[1] Biobase_2.20.0 bitops_1.0-5   grid_3.0.1     hwriter_1.3    stats4_3.0.1
[6] tools_3.0.1    zlibbioc_1.6.0

From: Taylor, Sean D
Sent: Monday, August 05, 2013 4:58 PM
To: bioconductor at r-project.org
Cc: Pages, Herve (hpages at fhcrc.org)
Subject: DNAStringSetList can't coerce a list?

Hi Herve,

It seems like I used to be able to coerce a list of DNA String Sets into a DNAStringSetList. With the latest build though it seems like that is no longer the case:
> dna1 <- c("AAA", "AC", "", "T", "GGATA")
> dna2 <- c("G", "TT", "C")
> foo<-DNAStringSet(dna1)
> bar<-DNAStringSet(dna2)

> DNAStringSetList(foo, bar)
DNAStringSetList of length 2
[[1]] AAA AC  T GGATA
[[2]] G TT C

> baz<-list(foo, bar)
> DNAStringSetList(baz)
Error in IRanges:::new_XVectorList_from_list_of_XVector(tmp_class, x) :
  all elements in 'x' must be DNAString objects

Thanks,
Sean

Sean Taylor
Post-doctoral Fellow
Fred Hutchinson Cancer Research Center
206-667-5544

------------------------------

Message: 12
Date: Tue, 6 Aug 2013 13:52:33 -0400
From: Li Liu <liliu_1 at hotmail.com>
To: "bioconductor at r-project.org" <bioconductor at r-project.org>
Cc: "Rafael A. Irizarry" <rafa at jhu.edu>
Subject: [BioC] fRMA package
Message-ID: <BAY175-W1E98A7A235246849BDA1DD45D0 at phx.gbl>
Content-Type: text/plain

Hi,

I want to use the fRMA package from Bioconductor. I can install the package successfully, but when I load it with library() there is an error. Can anybody help? Thanks.

Li

> biocLite("frma")
BioC_mirror: http://bioconductor.org
Using Bioconductor version 2.12 (BiocInstaller 1.10.3), R version 3.0.1.
Installing package(s) 'frma'
trying URL 'http://bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0/frma_1.12.0.zip'
Content type 'application/zip' length 262021 bytes (255 Kb)
opened URL
downloaded 255 Kb

package 'frma' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Documents and Settings\li\Local Settings\Temp\RtmpYpJMpj\downloaded_packages

> library(frma)
Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared object 'C:/Program Files/R/R-3.0.1/library/affxparser/libs/i386/affxparser.dll':
  LoadLibrary failure:  The specified procedure could not be found.

Error: package or namespace load failed for 'frma'

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] BiocInstaller_1.10.3 affy_1.38.1          Biobase_2.20.1       BiocGenerics_0.6.0

loaded via a namespace (and not attached):
[1] affyio_1.28.0         MASS_7.3-28           preprocessCore_1.22.0
[4] tools_3.0.1           zlibbioc_1.6.0

------------------------------

Message: 13
Date: Tue, 6 Aug 2013 11:29:50 -0700
From: Michael Lawrence <lawrence.michael at gene.com>
To: Julian Gehring <julian.gehring at embl.de>
Cc: Bioconductor List <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] ggbio: Data stored twice in 'GGbio' object
Message-ID:
        <CAOQ5NycMovHtU5Yy2p+GT8tDc6WVp377Gjdn0p4ATLt1GBFpow at mail.gmail.com>
Content-Type: text/plain

This is a flaw in the design of ggbio. It was a solution to the problem of
ggplot2 requiring a data.frame in the plot object, while ggbio would like
to keep the original data structure (like a GRanges) around. Probably the
correct solution is for ggbio to extend the ggplot object, or otherwise
represent the plot, and to perform the necessary reduction of the data when
the plot is rendered. This is how the ggsubplot package works, although it
is not changing the underlying data structure.

But the data is only stored *exactly* twice if the input data is a
data.frame. It's not very efficient to store the data twice, but my main
concern is the redundancy in the data model.

On Tue, Aug 6, 2013 at 2:33 AM, Julian Gehring <julian.gehring at embl.de> wrote:

> Hi,
>
> The 'ggbio::ggplot' (ggbio_1.9.7, R_2013-08-05 r63513) function seems to
> store its data twice.
>
>   library(ggbio)
>   df = data.frame(x = 1:10, y = rnorm(10))
>   p = ggbio::ggplot(data = df)
>   str(p)
>   identical(p@data, p@ggplot$data)  ## TRUE
>
> shows that the data 'df' is stored in p@data as well as p@ggplot$data.
>
> Especially for large data sets, this is inefficient.  Is there a good
> reason for this?
>
> Best wishes
> Julian

------------------------------

Message: 14
Date: Tue, 6 Aug 2013 11:36:53 -0700
From: Dan Tenenbaum <dtenenba at fhcrc.org>
To: Li Liu <liliu_1 at hotmail.com>
Cc: "Rafael A. Irizarry" <rafa at jhu.edu>,        "bioconductor at r-project.org"
        <bioconductor at r-project.org>
Subject: Re: [BioC] fRMA package
Message-ID:
        <CAF42j23xu7H+DaRXQNDfdc7ssjEB6StOvZdhqJaFS020A4dkGw at mail.gmail.com>
Content-Type: text/plain; charset=windows-1252

On Tue, Aug 6, 2013 at 10:52 AM, Li Liu <liliu_1 at hotmail.com> wrote:
> Hi,
>
> I want to use the fRMA package from Bioconductor. I can install the package successfully, but when I load it with library() there is an error. Can anybody help? Thanks.
>
> Li
>
>
>> biocLite("frma")
> BioC_mirror: http://bioconductor.org
> Using Bioconductor version 2.12 (BiocInstaller 1.10.3), R version 3.0.1.
> Installing package(s) 'frma'
> trying URL 'http://bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0/frma_1.12.0.zip'
> Content type 'application/zip' length 262021 bytes (255 Kb)
> opened URL
> downloaded 255 Kb
>
> package 'frma' successfully unpacked and MD5 sums checked
>
> The downloaded binary packages are in
>         C:\Documents and Settings\li\Local Settings\Temp\RtmpYpJMpj\downloaded_packages
>
>> library(frma)
> Error in inDL(x, as.logical(local), as.logical(now), ...) :
>   unable to load shared object 'C:/Program Files/R/R-3.0.1/library/affxparser/libs/i386/affxparser.dll':
>   LoadLibrary failure:  The specified procedure could not be found.
>
> Error: package or namespace load failed for 'frma'
>

Are you on Windows XP? If I recall correctly, the problem is that
affxparser does not always work on Windows XP. More information here:

https://stat.ethz.ch/pipermail/bioconductor/2013-March/051760.html

Dan

>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] BiocInstaller_1.10.3 affy_1.38.1          Biobase_2.20.1       BiocGenerics_0.6.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.28.0         MASS_7.3-28           preprocessCore_1.22.0
> [4] tools_3.0.1           zlibbioc_1.6.0
>
>

------------------------------

Message: 15
Date: Tue, 6 Aug 2013 20:33:55 +0100
From: Laurent Gatto <lg390 at cam.ac.uk>
To: Wolfgang Huber <whuber at embl.de>
Cc: Alexey Moskalev <amoskalev at list.ru>,        "bioconductor at r-project.org"
        <bioconductor at r-project.org>
Subject: Re: [BioC] request
Message-ID:
        <CA+uNOzhjc56h=wYr-FHm+F1JD2=p4fgrWSCQH7oD82csj0r8Jg at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On 6 August 2013 16:32, Wolfgang Huber <whuber at embl.de> wrote:
> @all: I am not sure what a good user interface would be for modifying the "plotPCA" function so that it can return the 'pca' object for user inspection (such as desired by Alexey); currently it returns the 'trellis' object as its return value.

I have a similar function (pRoloc::plot2D) that invisibly returns the
prcomp(...)$x[, dims] matrix that is used for plotting, where dims are
the PCs requested by the user (default being 1:2). I also report the
proportion of variance explained by these two components on the axes.
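
For readers who only need the numbers, here is a minimal base-R sketch of how the proportion of variance can be read off a prcomp() result. It is an editorial illustration: the matrix vsd_mat is random toy data standing in for variance-stabilized values, and the sketch does not reproduce whatever gene selection plotPCA applies internally.

```r
# Toy data standing in for variance-stabilized expression values
# (rows = genes, columns = samples); the numbers are random, not real data.
set.seed(42)
vsd_mat <- matrix(rnorm(200 * 6), nrow = 200,
                  dimnames = list(NULL, paste0("sample", 1:6)))

# PCA with samples as observations, hence the transpose.
pca <- prcomp(t(vsd_mat))

# Proportion of variance per component: squared standard deviations
# divided by their sum.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
round(prop_var[1:2], 3)

# The same quantities are listed in the summary table.
summary(pca)$importance["Proportion of Variance", 1:2]

# Scree plot of the component variances.
screeplot(pca)
```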

Best wishes,

Laurent

> Best wishes
>         Wolfgang
>
>
>
> On 6 Aug 2013, at 08:54, Alexey Moskalev <amoskalev at list.ru> wrote:
>
>> I am using the DESeq package to produce a principal components biplot on variance-stabilized data for my RNA-seq data. I was wondering if you could advise me how to find the proportion of variance for the first and second principal components using DESeq?

--
Laurent Gatto
- http://proteome.sysbiol.cam.ac.uk/lgatto/
Cambridge Centre for Proteomics
- http://www.bio.cam.ac.uk/proteomics
Using R/Bioconductor for proteomics data analysis
- http://lgatto.github.io/RforProteomics/

------------------------------

Message: 16
Date: Tue, 06 Aug 2013 23:35:10 +0400
From: Alexey Moskalev <amoskalev at list.ru>
To: Laurent Gatto <lg390 at cam.ac.uk>
Cc: bioconductor at r-project.org  <bioconductor at r-project.org>
Subject: Re: [BioC] request
Message-ID: <1375817710.455500561 at f406.i.mail.ru>
Content-Type: text/plain

 Thanks a lot!

Tuesday, 6 August 2013, 20:33 +01:00 from Laurent Gatto <lg390 at cam.ac.uk>:
>On 6 August 2013 16:32, Wolfgang Huber < whuber at embl.de > wrote:
>> @all: I am not sure what a good user interface would be for modifying the "plotPCA" function so that it can return the 'pca' object for user inspection (such as desired by Alexey); currently it returns the 'trellis' object as its return value.
>
>I have a similar function (pRoloc::plot2D) that invisibly returns the
>prcomp(...)$x[, dims] matrix that is used for plotting, where dims are
>the PCs requested by the user (default being 1:2). I also report the
>proportion of variance explained by these two components on the axes.
>
>Best wishes,
>
>Laurent
>
>> Best wishes
>>         Wolfgang
>>
>>
>>
>> On 6 Aug 2013, at 08:54, Alexey Moskalev < amoskalev at list.ru > wrote:
>>
>>> I am using the DESeq package to produce a principal components biplot on variance-stabilized data for my RNA-seq data. I was wondering if you could advise me how to find the proportion of variance for the first and second principal components using DESeq?
>--
>Laurent Gatto
>-  http://proteome.sysbiol.cam.ac.uk/lgatto/
>Cambridge Centre for Proteomics
>-  http://www.bio.cam.ac.uk/proteomics
>Using R/Bioconductor for proteomics data analysis
>-  http://lgatto.github.io/RforProteomics/

Sincerely,
Dr. Alexey Moskalev

Head of the Laboratory of Molecular Radiobiology and Gerontology
Institute of Biology, Komi Science Center of RAS,
Kommunisticheskaya St.28
167982, Syktyvkar
Russia

Blog: http://aging-genes.livejournal.com/

------------------------------

Message: 17
Date: Tue, 6 Aug 2013 21:49:17 +0200
From: Wolfgang Huber <whuber at embl.de>
To: Laurent Gatto <lg390 at cam.ac.uk>
Cc: Alexey Moskalev <amoskalev at list.ru>,        "bioconductor at r-project.org"
        <bioconductor at r-project.org>
Subject: Re: [BioC] request
Message-ID: <1737A4C1-55CF-4DFC-9994-389152B59ACA at embl.de>
Content-Type: text/plain; charset=iso-8859-1

Dear Laurent
in pRoloc::plot2D the plot is a side effect (via graphics::plot), and therefore you are free to return something else; while in the function discussed below the plot (a 'trellis' object) is the return value, which then usually is rendered via 'print.trellis'.

(One could stick additional information like the PCA loadings and eigenvalues into the same (S3-)object; initially I thought this was ugly, but maybe it's the way to go.)
        Best wishes
        Wolfgang

On Aug 6, 2013, at 9:33 pm, Laurent Gatto <lg390 at cam.ac.uk> wrote:

> On 6 August 2013 16:32, Wolfgang Huber <whuber at embl.de> wrote:
>> @all: I am not sure what a good user interface would be for modifying the "plotPCA" function so that it can return the 'pca' object for user inspection (such as desired by Alexey); currently it returns the 'trellis' object as its return value.
>
> I have a similar function (pRoloc::plot2D) that invisibly returns the
> prcomp(...)$x[, dims] matrix that is used for plotting, where dims are
> the PCs requested by the user (default being 1:2). I also report the
> proportion of variance explained by these two components on the axes.
>
> Best wishes,
>
> Laurent
>
>> Best wishes
>>        Wolfgang
>>
>>
>>
>> On 6 Aug 2013, at 08:54, Alexey Moskalev <amoskalev at list.ru> wrote:
>>
>>> I am using the DESeq package to produce a principal components biplot on variance-stabilized data for my RNA-seq data. I was wondering if you could advise me how to find the proportion of variance for the first and second principal components using DESeq?
> --
> Laurent Gatto
> - http://proteome.sysbiol.cam.ac.uk/lgatto/
> Cambridge Centre for Proteomics
> - http://www.bio.cam.ac.uk/proteomics
> Using R/Bioconductor for proteomics data analysis
> - http://lgatto.github.io/RforProteomics/

------------------------------

Message: 18
Date: Tue, 6 Aug 2013 12:53:30 -0700
From: Steve Lianoglou <lianoglou.steve at gene.com>
To: Wolfgang Huber <whuber at embl.de>
Cc: Alexey Moskalev <amoskalev at list.ru>,        "bioconductor at r-project.org"
        <bioconductor at r-project.org>
Subject: Re: [BioC] request
Message-ID:
        <CAHA9McP6iS2R4nr-G2ZGzgb_k2e0k0wn=CJNLCPhT-3uQCXNDw at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

On Tue, Aug 6, 2013 at 12:49 PM, Wolfgang Huber <whuber at embl.de> wrote:
> Dear Laurent
> in pRoloc::plot2D the plot is a side effect (via graphics::plot), and therefore you are free to return something else; while in the function discussed below the plot (a 'trellis' object) is the return value, which then usually is rendered via 'print.trellis'.
>
> (One could stick additional information like the PCA loadings and eigenvalues into the same (S3-)object, initially I thought this was ugly but maybe it's the way to go.)

Along these lines: is it considered "bad form" to just add a "pca"
`attr`-ibute to the xyplot object you are returning? e.g.:

plotPCA <- function(...) {
  ## ...
  out <- xyplot(PC2 ~ PC1, ...)
  attr(out, 'pca') <- pca
  invisible(out) ## or not invisible
}
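
A runnable variant of the idea above, as a hedged sketch only: the function name plotPCA2 and the toy matrix are invented for illustration and this is not DESeq's plotPCA. It assumes the lattice package (normally shipped with R) is available.

```r
library(lattice)

# Hypothetical stand-in for plotPCA(): build the plot, then attach the
# prcomp result to the returned trellis object as an attribute.
plotPCA2 <- function(x) {
  pca <- prcomp(x)
  scores <- as.data.frame(pca$x[, 1:2])
  out <- xyplot(PC2 ~ PC1, data = scores)
  attr(out, "pca") <- pca
  out
}

# Toy input: 10 observations, 6 variables of random numbers.
set.seed(1)
m <- matrix(rnorm(60), nrow = 10)

p <- plotPCA2(m)           # print(p) draws the plot as usual
pca <- attr(p, "pca")      # the caller can still get at the PCA object
summary(pca)$importance["Proportion of Variance", 1:2]
```

The plot still prints through the usual trellis method, and callers that ignore the attribute see no difference.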

--
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech

------------------------------

Message: 19
Date: Tue, 6 Aug 2013 13:13:59 -0700
From: Michael Lawrence <lawrence.michael at gene.com>
To: "Cook, Malcolm" <MEC at stowers.org>
Cc: Michael Lawrence <lawrence.michael at gene.com>,
        "bioconductor at r-project.org" <bioconductor at r-project.org>
Subject: Re: [BioC] ggbio facet_gr example sought
Message-ID:
        <CAOQ5NydoTcrp_x4VaUiNu1bOyPzZmdn-NTe58cUuxDgPxpXBEA at mail.gmail.com>
Content-Type: text/plain

I'm pretty sure that you can just pass a GRanges to the facets argument,
but I haven't tried it.

On Sun, Aug 4, 2013 at 4:50 PM, Cook, Malcolm <MEC at stowers.org> wrote:

>  Hi,
> I am unable to find any examples of facet_gr argument to autoplot.
>
> It is mentioned in
> http://bioconductor.org/packages/2.12/bioc/manuals/ggbio/man/ggbio.pdf on
> page 9 as:
>
> Sometime, we need to view different regions, so we also have a facet_gr
> argument which
> accept a GRanges. If this is provided, it will override the default
> seqnames and use provided
> region to facet the graphics, this might be useful for different gene
> centric views.
>
>
>  However there is no further example of its use, and it does not appear in
> the list of formals, and it does not appear at all in
> http://bioconductor.org/packages/2.12/bioc/vignettes/ggbio/inst/doc/ggbio.pdf
>
>  And the only Google hits suggest this feature is deprecated.
>
>  Am I missing something?
>
>  Is there a contemporary equivalent?  Is there some way to facet on
> genomic range?  Any examples out there?
>
>  Thanks,
>
> ~ malcolm_cook at stowers.org
>


------------------------------

Message: 20
Date: Tue, 6 Aug 2013 22:19:24 +0200
From: Julian Gehring <julian.gehring at embl.de>
To: Michael Lawrence <lawrence.michael at gene.com>
Cc: Bioconductor List <bioconductor at stat.math.ethz.ch>
Subject: Re: [BioC] ggbio: Data stored twice in 'GGbio' object
Message-ID: <52015A4C.8050603 at embl.de>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed

Hi Michael,

I agree that the main problem is that the data is effectively stored
twice, irrespective of whether this takes the form of two identical or
similar objects.  Especially with large amounts of genomic data in
mind, this way of handling data may not scale well.

Best wishes
Julian

On 08/06/2013 08:29 PM, Michael Lawrence wrote:
> This is a flaw in the design of ggbio. It was a solution to the problem of
> ggplot2 requiring a data.frame in the plot object, while ggbio would like
> to keep the original data structure (like a GRanges) around. Probably the
> correct solution is for ggbio to extend the ggplot object, or otherwise
> represent the plot, and to perform the necessary reduction of the data when
> the plot is rendered. This is how the ggsubplot package works, although it
> is not changing the underlying data structure.
>
> But the data is only stored *exactly* twice if the input data is a
> data.frame. It's not very efficient to store the data twice, but my main
> concern is the redundancy in the data model.
>
>
>
>
> On Tue, Aug 6, 2013 at 2:33 AM, Julian Gehring <julian.gehring at embl.de> wrote:
>
>> Hi,
>>
>> The 'ggbio::ggplot' (ggbio_1.9.7, R_2013-08-05 r63513) function seems to
>> store its data twice.
>>
>>    library(ggbio)
>>    df = data.frame(x = 1:10, y = rnorm(10))
>>    p = ggbio::ggplot(data = df)
>>    str(p)
>>    identical(p@data, p@ggplot$data)  ## TRUE
>>
>> shows that the data 'df' is stored in p@data as well as p@ggplot$data.
>>
>> Especially for large data sets, this is inefficient.  Is there a good
>> reason for this?
>>
>> Best wishes
>> Julian

------------------------------

Message: 21
Date: Tue, 6 Aug 2013 18:56:51 -0400
From: Li Liu <liliu_1 at hotmail.com>
To: Dan Tenenbaum <dtenenba at fhcrc.org>
Cc: "rafa at jhu.edu" <rafa at jhu.edu>,      "bioconductor at r-project.org"
        <bioconductor at r-project.org>
Subject: Re: [BioC] fRMA package
Message-ID: <BAY175-W24939543F62B119AAF3CABD45D0 at phx.gbl>
Content-Type: text/plain

Hi Dan,

Thank you for the information. I tried to install the package on another computer with Windows 7, but it still doesn't work. Do you have any other ideas?

Thanks.

Li

> Date: Tue, 6 Aug 2013 11:36:53 -0700
> Subject: Re: [BioC] fRMA package
> From: dtenenba at fhcrc.org
> To: liliu_1 at hotmail.com
> CC: bioconductor at r-project.org; rafa at jhu.edu
>
> On Tue, Aug 6, 2013 at 10:52 AM, Li Liu <liliu_1 at hotmail.com> wrote:
> > Hi,
> >
> > I want to use the fRMA package from Bioconductor. I can install the package successfully, but when I load it with library() there is an error. Can anybody help? Thanks.
> >
> > Li
> >
> >
> >> biocLite("frma")
> > BioC_mirror: http://bioconductor.org
> > Using Bioconductor version 2.12 (BiocInstaller 1.10.3), R version 3.0.1.
> > Installing package(s) 'frma'
> > trying URL 'http://bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0/frma_1.12.0.zip'
> > Content type 'application/zip' length 262021 bytes (255 Kb)
> > opened URL
> > downloaded 255 Kb
> >
> > package 'frma' successfully unpacked and MD5 sums checked
> >
> > The downloaded binary packages are in
> >         C:\Documents and Settings\li\Local Settings\Temp\RtmpYpJMpj\downloaded_packages
> >
> >> library(frma)
> > Error in inDL(x, as.logical(local), as.logical(now), ...) :
> >   unable to load shared object 'C:/Program Files/R/R-3.0.1/library/affxparser/libs/i386/affxparser.dll':
> >   LoadLibrary failure:  The specified procedure could not be found.
> >
> > Error: package or namespace load failed for 'frma'
> >
>
> Are you on Windows XP? If I recall correctly, the problem is that
> affxparser does not always work on Windows XP. More information here:
>
> https://stat.ethz.ch/pipermail/bioconductor/2013-March/051760.html
>
> Dan
>
>
> >> sessionInfo()
> > R version 3.0.1 (2013-05-16)
> > Platform: i386-w64-mingw32/i386 (32-bit)
> >
> > locale:
> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
> > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> > [5] LC_TIME=English_United States.1252
> >
> > attached base packages:
> > [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] BiocInstaller_1.10.3 affy_1.38.1          Biobase_2.20.1       BiocGenerics_0.6.0
> >
> > loaded via a namespace (and not attached):
> > [1] affyio_1.28.0         MASS_7.3-28           preprocessCore_1.22.0
> > [4] tools_3.0.1           zlibbioc_1.6.0
> >

------------------------------

Message: 22
Date: Tue, 6 Aug 2013 15:59:16 -0700
From: Dan Tenenbaum <dtenenba at fhcrc.org>
To: Li Liu <liliu_1 at hotmail.com>
Cc: "bioconductor at r-project.org" <bioconductor at r-project.org>,
        "rafa at jhu.edu" <rafa at jhu.edu>
Subject: Re: [BioC] fRMA package
Message-ID:
        <CAF42j21Gw-zbDhtwbffSNs9_jJb8Pry-NXfBgzzTMwmObfBbAQ at mail.gmail.com>
Content-Type: text/plain; charset=windows-1252

On Tue, Aug 6, 2013 at 3:56 PM, Li Liu <liliu_1 at hotmail.com> wrote:
> Hi Dan,
>
> Thank you for the information. I tried to install the package in another
> computer with windows 7 but it still doesn't work. Is there any other idea?
>

Can you send the exact commands you tried and R's response? Also the
output of sessionInfo().

Thanks,
Dan

------------------------------

Message: 23
Date: Tue, 6 Aug 2013 16:47:48 -0700
From: Michael Lawrence <lawrence.michael at gene.com>
To: Patrick Schorderet <patrick.schorderet at molbio.mgh.harvard.edu>
Cc: "bioconductor at r-project.org" <bioconductor at r-project.org>
Subject: Re: [BioC] Extracting overlapping gene names from a list of
        peaks
Message-ID:
        <CAOQ5NydNottmZu_P_ztHsc9kksXHyy+cGpRwDs52ZDQmrhb1oA at mail.gmail.com>
Content-Type: text/plain

You can efficiently get a list like this:

hits <- findOverlaps(GRfinalPeaks, ee)
listOfGenesByPeak <- split(names(ee)[subjectHits(hits)], queryHits(hits))

But maybe what you want is a long-form table:
DataFrame(peak = queryHits(hits), gene = names(ee)[subjectHits(hits)])
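
Note that split() on queryHits(hits) only returns peaks that have at least one hit. A minimal sketch of one way to keep peaks with no overlapping gene, assuming GenomicRanges is installed; the toy coordinates, peak names and gene names below are made up for illustration and are not from the original data:

```r
library(GenomicRanges)

# Toy peaks (hypothetical coordinates); "Peak 3" overlaps no gene.
peaks <- GRanges("chr3R", IRanges(start = c(100, 500, 900),
                                  end   = c(200, 600, 950)))
names(peaks) <- c("Peak 1", "Peak 2", "Peak 3")

# Toy stand-in for exonsBy(txdb, "gene"): a GRangesList named by gene.
genes <- GRangesList(
  GeneA = GRanges("chr3R", IRanges(150, 180)),
  GeneB = GRanges("chr3R", IRanges(190, 520)),
  GeneC = GRanges("chr3R", IRanges(2000, 2100))
)

hits <- findOverlaps(peaks, genes)

# Using a factor with one level per peak keeps peaks without any hit
# as empty entries instead of dropping them.
peakFactor <- factor(queryHits(hits),
                     levels = seq_along(peaks),
                     labels = names(peaks))
genesByPeak <- split(names(genes)[subjectHits(hits)], peakFactor)
print(genesByPeak)
```

Here "Peak 3" comes back as an empty character vector, which matches the "none at all" case described in the question.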

On Tue, Aug 6, 2013 at 8:48 AM, Patrick Schorderet <
patrick.schorderet at molbio.mgh.harvard.edu> wrote:

> Dear all,
>
> I have a list of peaks from ChIP-seq experiments. Now I am trying to find
> which genes these peaks overlap (and extract the gene names).
> I'm sure this should be pretty easy, but I am just starting with
> bioconductor, so some concepts are still vague.
> Here's what I did so far:
>
> # Not run because it is installed (as well as other packages)
> # source("http://bioconductor.org/biocLite.R")
> # biocLite("TxDb.Dmelanogaster.UCSC.dm3.ensGene")
> # Load the Dmelanogaster genome
> library(TxDb.Dmelanogaster.UCSC.dm3.ensGene, quietly = TRUE)
> txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
> ee <- exonsBy(txdb, "gene")
>
> # Load a subset of my peaks for the sake of the example
> finalPeaks <- rbind(c("chr3R", "2788500", "2842850", "2815675", "54350"),
> (c("chr3R", "12484350", "12661350", "12572850", "177000")))
> rownames(finalPeaks) <- c("Peak 1", "Peak 2")
> colnames(finalPeaks) <- c("chr", "start", "end", "center", "length")
>
> # Create a GRanges object with the file I have
> # finalPeaks is a matrix with rows being individual peaks and columns <- c()
> GRfinalPeaks <- GRanges(finalPeaks[,1],
>                         IRanges(start = as.numeric(finalPeaks[,2]),
>                                 end = as.numeric(finalPeaks[,3])))
>
> I'm stuck from here. What I'd like is to get, for each peak, the
> overlapping genes.
> For example, an output that would be:
>
> Peak1: GeneA, GeneB, GeneC, etc
> Peak2: GeneD
>
> Also, I don't know how simple it is because (as specified in the output
> example) one peak can overlap several genes or none at all..
> Thanks for any help,
>
> Patrick

------------------------------

Message: 24
Date: Tue, 6 Aug 2013 21:55:14 -0400
From: Li Liu <liliu_1 at hotmail.com>
To: Dan Tenenbaum <dtenenba at fhcrc.org>
Cc: "rafa at jhu.edu" <rafa at jhu.edu>,      "bioconductor at r-project.org"
        <bioconductor at r-project.org>
Subject: Re: [BioC] fRMA package
Message-ID: <BAY175-W20505596E7E72251BA6ACD45E0 at phx.gbl>
Content-Type: text/plain

Hi Dan,

Below is what I ran in RStudio, followed by the sessionInfo() output, on Windows 7. Thanks.

Li

> source("http://bioconductor.org/biocLite.R")
> biocLite("frma")
BioC_mirror: http://bioconductor.org
Using Bioconductor version 2.12 (BiocInstaller 1.10.3), R version 3.0.1.
Installing package(s) 'frma'
trying URL 'http://bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0/frma_1.12.0.zip'
Content type 'application/zip' length 262021 bytes (255 Kb)
opened URL
downloaded 255 Kb

package 'frma' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\lucy\AppData\Local\Temp\Rtmpw3cXZ7\downloaded_packages
Warning message:
installed directory not writable, cannot update packages 'class', 'foreign', 'MASS',
  'mgcv', 'nlme', 'nnet', 'spatial'

> library(frma)
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
  there is no package called 'GenomicRanges'
Error: package or namespace load failed for 'frma'

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] affy_1.38.1          Biobase_2.20.1       BiocGenerics_0.6.0   BiocInstaller_1.10.3

loaded via a namespace (and not attached):
[1] affxparser_1.32.3     affyio_1.28.0         IRanges_1.18.2
[4] MASS_7.3-26           preprocessCore_1.22.0 stats4_3.0.1
[7] tools_3.0.1           zlibbioc_1.6.0

------------------------------

Message: 25
Date: Tue, 6 Aug 2013 19:31:27 -0700
From: Dan Tenenbaum <dtenenba at fhcrc.org>
To: Li Liu <liliu_1 at hotmail.com>
Cc: "rafa at jhu.edu" <rafa at jhu.edu>,      "bioconductor at r-project.org"
        <bioconductor at r-project.org>
Subject: Re: [BioC] fRMA package
Message-ID:
        <CAF42j20u3HM=Fa=1NqFpCGadc9YwtWiv_Pqn2Qo8LVox9UzGiA at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Tue, Aug 6, 2013 at 6:55 PM, Li Liu <liliu_1 at hotmail.com> wrote:
> HI Dan,
>
> The following is what I run in R studio and the sessionInfo() output in
> window 7. Thanks.
>
> Li
>
>> source("http://bioconductor.org/biocLite.R")
>> biocLite("frma")
> BioC_mirror: http://bioconductor.org
> Using Bioconductor version 2.12 (BiocInstaller 1.10.3), R version 3.0.1.
> Installing package(s) 'frma'
> trying URL
> 'http://bioconductor.org/packages/2.12/bioc/bin/windows/contrib/3.0/frma_1.12.0.zip'
> Content type 'application/zip' length 262021 bytes (255 Kb)
> opened URL
> downloaded 255 Kb
>
> package 'frma' successfully unpacked and MD5 sums checked
>
> The downloaded binary packages are in
> C:\Users\lucy\AppData\Local\Temp\Rtmpw3cXZ7\downloaded_packages
> Warning message:
> installed directory not writable, cannot update packages 'class', 'foreign',
> 'MASS',
>   'mgcv', 'nlme', 'nnet', 'spatial'
>
>> library(frma)
> Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
>   there is no package called 'GenomicRanges'
> Error: package or namespace load failed for 'frma'
>

This error is telling you that you need to install GenomicRanges, so:

biocLite("GenomicRanges")

Then try:

library(frma)

again.
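
Before retrying, it can help to check which of the packages named in the errors are actually installed. The following is a minimal base-R sketch; the helper name missing_packages() and the list in `needed` are assumptions taken from the error messages in this thread, not frma's full dependency list:

```r
# Return the subset of `pkgs` that is not installed in any library
# on .libPaths(). system.file() returns "" for a package it cannot find.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs,
                      function(p) nzchar(system.file(package = p)),
                      logical(1))
  unname(pkgs[!installed])
}

# Packages mentioned in the errors above (an assumed, partial list).
needed <- c("frma", "GenomicRanges", "affxparser")
to_install <- missing_packages(needed)
print(to_install)

# Anything listed could then be installed the way the thread does it:
# source("http://bioconductor.org/biocLite.R")
# biocLite(to_install)
```

This only checks that a package is present; it does not catch load-time failures such as the affxparser DLL error seen on the first machine.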

Dan

>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
> [3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
> [5] LC_TIME=English_Canada.1252
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
> [1] affy_1.38.1          Biobase_2.20.1       BiocGenerics_0.6.0
> BiocInstaller_1.10.3
>
> loaded via a namespace (and not attached):
> [1] affxparser_1.32.3     affyio_1.28.0         IRanges_1.18.2
> [4] MASS_7.3-26           preprocessCore_1.22.0 stats4_3.0.1
> [7] tools_3.0.1           zlibbioc_1.6.0
>

------------------------------

Message: 26
Date: Tue, 6 Aug 2013 23:17:39 -0400
From: Tengfei Yin <yintengfei at gmail.com>
To: Michael Lawrence <lawrence.michael at gene.com>
Cc: "bioconductor at r-project.org" <bioconductor at r-project.org>
Subject: Re: [BioC] ggbio facet_gr example sought
Message-ID:
        <CAPJsq9=gs1QRGGOay-N3pfybBHRfjsco40_dFMm3_owBvzR4Fg at mail.gmail.com>
Content-Type: text/plain

Hi Michael and Malcolm,

Sorry for the late reply. This is actually a bug in the current ggbio. I
did deprecate the facet_gr() function, but the feature itself was kept: it
is supposed to work (and used to work) when you pass a GRanges to the
'facets' argument, either in autoplot() or in the low-level API. Some of
my updates broke this feature. Thanks for pointing this out.

I will work on fixing this bug.

Tengfei

On Tue, Aug 6, 2013 at 4:13 PM, Michael Lawrence
<lawrence.michael at gene.com>wrote:

> I'm pretty sure that you can just pass a GRanges to the facets argument,
> but I haven't tried it.
>
>
> On Sun, Aug 4, 2013 at 4:50 PM, Cook, Malcolm <MEC at stowers.org> wrote:
>
>>  Hi,
>> I am unable to find any examples of facet_gr argument to autoplot.
>>
>> It is mentioned in
>> http://bioconductor.org/packages/2.12/bioc/manuals/ggbio/man/ggbio.pdf on page 9 as:
>>
>> Sometime, we need to view different regions, so we also have a facet_gr
>> argument which
>> accept a GRanges. If this is provided, it will override the default
>> seqnames and use provided
>> region to facet the graphics, this might be useful for different gene
>> centric views.
>>
>>
>>  However there is no further example of its use, and it does not appear
>> in the list of formals, and it does not appear at all in
>> http://bioconductor.org/packages/2.12/bioc/vignettes/ggbio/inst/doc/ggbio.pdf
>>
>>  And the only Google hits suggest this feature is deprecated.
>>
>>  Am I missing something?
>>
>>  Is there a contemporary equivalent?  Is there some way to facet on
>> genomic range?  Any examples out there?
>>
>>  Thanks,
>>
>> ~ malcolm_cook at stowers.org
>>
>
>

------------------------------

Message: 27
Date: Tue, 6 Aug 2013 21:02:21 +0800
From: "joseph" <joseph.houjue at gmail.com>
To: <bioconductor at r-project.org>
Subject: [BioC] ??:  an error in AnnotationForge
Message-ID: <001201ce92a5$34c70380$9e550a80$@gmail.com>
Content-Type: text/plain

Dear Marc Carlson,

I'm using AnnotationForge to build a custom annotation package. However, in
the Affymetrix annotation file some probes map to more than one gene
symbol, for example '11715100_at' maps to 'HIST1H3A /// HIST1H3B /// HIST1H3C
/// HIST1H3D /// HIST1H3E /// HIST1H3F /// HIST1H3G /// HIST1H3H ///
HIST1H3I /// HIST1H3J'.

When I build the package with:

makeDBPackage("HUMANCHIP_DB",
              affy = TRUE,
              prefix = "primeview",
              fileName = "E:/Microarray Data/Affy array database/PrimeView.na33.annot.csv",
              baseMapType = "ug",
              version = "1.0.0",
              manufacturer = "Joseph",
              chipName = "primeview")

it runs, but these probes cannot be matched when I annotate the expression
data. The output looks like this:

                 Gene Symbol
11715100_at      11715100_at
11715104_s_at    OTOP2
11715105_at      C17orf78

Please tell me how to fix this. Thanks!

Joe
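
For reference, a '///'-separated field like the one quoted above can be split into individual symbols in base R. This is only a sketch of the string handling, not a fix for makeDBPackage(); the variable names are made up and the field is shortened for illustration:

```r
# A multi-symbol field in the style of the Affymetrix annotation file
# (shortened here; the real field lists HIST1H3A through HIST1H3J).
symbols_field <- "HIST1H3A /// HIST1H3B /// HIST1H3C"

# fixed = TRUE treats " /// " as a literal separator, not a regex.
symbols <- strsplit(symbols_field, " /// ", fixed = TRUE)[[1]]
print(symbols)

# One simple (lossy) convention is to keep only the first symbol.
first_symbol <- symbols[1]
print(first_symbol)
```

Whether to keep the first symbol, all symbols, or drop such probes is a judgment call that depends on the downstream analysis.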

------------------------------

Message: 28
Date: Tue, 6 Aug 2013 12:13:48 +0200
From: Nogales Vilardell <cressi.nogales at gmail.com>
To: bioconductor at r-project.org
Subject: [BioC] beadarray library: perBeadFile
Message-ID:
        <CA+J=4N2WdbdrtSigBfK0PAQgDVksfhbf5qvNWXXb4F5=UdV23A at mail.gmail.com>
Content-Type: text/plain

Hi everybody,

I recently started analysing Illumina BeadChip data. Everything went
smoothly and I was happy with the results until I saw this comment from
the beadarray authors:

"iScan come in a different format,.... there are two images of each array
section (along with two .locs files), which are labeled Swath1 and Swath2"

I don't have those files; I only have the perBeadFile.txt. The authors
then write: "Given this, simply reading the bead-level text file will
result in any function that uses bead locations performing undesirably"

Now I am worried that I did something wrong. I do not get any message
when I read my files with readIllumina(), although the authors also wrote
that the function should print a message advising me to use
processSwathData.

Does anybody know in which cases using perBeadFile.txt is OK and in which
cases it is not?

Thanks a lot for the help

Best wishes

Mireia

------------------------------

Message: 29
Date: Tue, 6 Aug 2013 21:08:08 +0800
From: "joseph" <joseph.houjue at gmail.com>
To: <bioconductor at r-project.org>
Subject: [BioC] ??:  an error in AnnotationForge
Message-ID: <001f01ce92a6$034282c0$09c78840$@gmail.com>
Content-Type: text/plain

One more question: even when I build this db successfully, some probe IDs
that could be annotated with a gene symbol still show the probe ID when I
annotate expression data. These probes really do have gene symbols. How
can I fix this?

P.S. To the list administrator: please approve this mail. We may have met
before; I worked at SCHARP.

Jue Hou,Ph.D.

Research Assistant

Center of Medical Physics and Technology

Hefei Institutes of Physical Science

Chinese Academy of Sciences

No.350 Shushanhu Road, Shushan District, Hefei, P.R. China

Tel. +86551-65595385

Email: joseph.houjue at gmail.com; houjue00722 at sina.com

------------------------------

Message: 30
Date: Tue, 6 Aug 2013 06:41:05 -0700
From: Datong Wang <datongwang2007 at yahoo.com>
To: "bioconductor at stat.math.ethz.ch" <bioconductor at stat.math.ethz.ch>
Subject: [BioC] topGO question
Message-ID:
        <1375796465.32356.YahooMailNeo at web161006.mail.bf1.yahoo.com>
Content-Type: text/plain

Hi Adrian,

I used topGO to analyze my data and got the following results:

GO.ID       Annotated  Significant  Expected  pvalue
GO:0016746        123            3      7.13  6.70E-08
GO:0016747        105            3      6.08  6.00E-07
GO:0016757        281           13     16.28  1.50E-05
GO:0016408         15            0      0.87  5.00E-05
GO:0001071        547           32     31.69  7.80E-05
GO:0003700        547           32     31.69  7.80E-05
GO:0046527         89            6      5.16  0.00013
GO:0016759         25            0      1.45  0.00029
GO:0004518        112            1      6.49  0.00062
GO:0016758        209           12     12.11  0.00078

Did you notice 'GO:0016408' and 'GO:0016759'? The number of significant genes for these two GO IDs is zero. In that case, why are they considered significant? Can we simply remove them from the list?

A second question: there are many combinations of 'algorithm' and test 'statistic', and their results are not the same and may differ hugely. Which method should we choose for the analysis?

datong wang
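
A small base-R sketch of the arithmetic behind the table above. The Annotated and Expected values are copied from the table; the total gene count N and the number of significant genes M are hypothetical (chosen only so that M/N matches the ratio in the table), since the post does not give them:

```r
# Values copied from the first five rows of the table.
annotated <- c(123, 105, 281, 15, 547)
expected  <- c(7.13, 6.08, 16.28, 0.87, 31.69)

# Expected / Annotated is the same for every row (about 0.058), which is
# consistent with Expected = Annotated * (significant genes / all genes).
ratio <- expected / annotated
print(round(ratio, 4))

# Hypothetical totals with M / N = 0.058.
N <- 10000   # all genes in the analysis (assumed)
M <- 580     # genes flagged as significant (assumed)

# One-sided Fisher (hypergeometric) p-value for a term with 15 annotated
# genes and 0 significant ones: P(X >= 0), which is 1 by definition.
p_fisher_zero <- phyper(0 - 1, M, N - M, 15, lower.tail = FALSE)
print(p_fisher_zero)
```

Since a count-based over-representation test cannot give a small p-value for a term with zero significant genes, one possible reading is that the p-values in the table came from a score-based statistic (topGO offers, for example, a Kolmogorov-Smirnov test) rather than from the Significant counts. This is an inference from the numbers, not something stated in the post.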

------------------------------

Message: 31
Date: Wed, 7 Aug 2013 08:00:24 +0000
From: "Cook, Malcolm" <MEC at stowers.org>
To: Tengfei Yin <yintengfei at gmail.com>, Michael Lawrence
        <lawrence.michael at gene.com>
Cc: "bioconductor at r-project.org" <bioconductor at r-project.org>
Subject: Re: [BioC] ggbio facet_gr example sought
Message-ID: <D4772401B9D976478C0895769BE3E792BC1211 at MBSRV02.sgc.loc>
Content-Type: text/plain

Tengfei,

Thanks for acknowledging this issue.  I played around with this for a while more after Michael's comment, but could find no syntax that did not generate an error.  I look forward to learning about this capability once you have it working as you intend.

Cheers,

~ malcolm_cook at stowers.org

------------------------------

Message: 32
Date: Wed, 7 Aug 2013 14:56:16 +0530
From: ALok <foralok at gmail.com>
To: BioC <bioconductor at stat.math.ethz.ch>
Subject: [BioC] basic query to make groups ..
Message-ID:
        <CA+rRKy3nsV+pDFj=yTu6CN2hDKrAWgZSUPmZPnTGsQVQdjdVfg at mail.gmail.com>
Content-Type: text/plain

Hi All,
Sorry for the basic question. My vector contains elements from two groups:
x=c(1.1, 1.2, 1.3, 2.1, 2.2)
I am trying to split the elements into two classes, 1.1, 1.2, 1.3 and
2.1, 2.2.

I can use the commands
> grep("^1.[12345678910]",x)
[1] 1 2 3

> grep("^2.[12345678910]",x)
[1] 4 5

but I am not able to automate this using a variable k.
I have many such cases, so I want to write a loop over k.
As a test case, let's assume

> k=1
> grep("^k.[12345678910]",x)
integer(0)

Thanks in advance.

Alok

--
************************************************************
Alok Kumar Srivastava
Assistant Professor
CRRao Advanced Institute of Mathematics, Statistics and Computer Science
(AIMSCS)
Gachibowli, Hyderabad 500046.
************************************************************
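
The last call returns integer(0) because the letter k inside the quotes is matched literally; the value of the variable is never substituted. A minimal sketch of one way to build the pattern from k with paste0(); the helper name index_for_group() is made up for illustration:

```r
x <- c(1.1, 1.2, 1.3, 2.1, 2.2)

# Build the regular expression from the value of k. "\\." matches a
# literal dot (a bare "." would match any character).
index_for_group <- function(k, x) {
  grep(paste0("^", k, "\\."), x)
}

for (k in 1:2) {
  print(index_for_group(k, x))
}

# If the group is simply the integer part, split() does the grouping
# in one step without a loop.
groups <- split(x, floor(x))
print(groups)
```

The loop prints the indices 1 2 3 for k = 1 and 4 5 for k = 2, matching the two hand-written grep() calls in the question.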

------------------------------

_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor

End of Bioconductor Digest, Vol 126, Issue 7
********************************************
