[BioC] R: R: R: Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

Manca Marco (PATH) m.manca at maastrichtuniversity.nl
Wed Nov 24 22:57:09 CET 2010



Dear Ina,

other fellow BioConductors may correct me, but I would say it is not entirely true that you HAVE TO depend on Ingenuity to recover "pathology relatedness" data.

One way in BioConductor could be to use the information available within the annotation packages through their links to OMIM. After loading the annotation package, entering the line

> ls("package:YourAnnotationPackageHere.db")

will give you a list of the objects in your annotation package, among which is YourAnnotationPackageHereOMIM

which is described as (I'm copying and pasting the description of hgu133plus2OMIM as an example):

"Each manufacturer identifier is mapped to a vector of OMIM identifiers. The vector length may be one or longer, depending on how many OMIM identifiers the manufacturer identifier maps to. An NA is reported for any manufacturer identifier that cannot be mapped to an OMIM identifier at this time.

OMIM is based upon the book Mendelian Inheritance in Man (V. A. McKusick) and focuses primarily on inherited or heritable genetic diseases. It contains textual information, pictures, and reference information that can be searched using various terms, among which the MIM number is one.

Mappings were based on data provided by: Entrez Gene ftp://ftp.ncbi.nlm.nih.gov/gene/DATA With a date stamp from the source of: 2010-Mar1 "

OMIM nowadays interprets "inherited or heritable genetic diseases" in the broadest sense, so as to include susceptibilities and the like.

I would dare say that you can then use these identifiers for your queries and for any further enrichment analysis you have in mind...
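A minimal sketch of the OMIM lookup described above, in base R. It uses a small hand-made list in place of the real annotation object, so it runs without any Bioconductor package installed; with a loaded annotation package, something like as.list(YourAnnotationPackageHereOMIM) should give a list of the same shape. The probe names and OMIM numbers below are invented for illustration only.

```r
# Toy stand-in for as.list(YourAnnotationPackageHereOMIM):
# each probe ID maps to a vector of OMIM IDs, or to NA when nothing maps.
# All identifiers here are made up for illustration.
omim_map <- list(
  probe_A = c("100001", "100002"),
  probe_B = NA_character_,
  probe_C = "100003"
)

# Keep only the probes that map to at least one OMIM identifier.
has_omim <- !vapply(omim_map, function(ids) all(is.na(ids)), logical(1))
annotated <- omim_map[has_omim]

# Reverse the mapping (OMIM ID -> probes), a convenient shape for
# set-based queries or enrichment tests.
omim_to_probe <- split(
  rep(names(annotated), lengths(annotated)),
  unlist(annotated, use.names = FALSE)
)

print(names(annotated))
print(omim_to_probe)
```

The reversed list can then be handed to whatever enrichment test you prefer, with each OMIM identifier acting as a "gene set" of probes.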

I hope this helps.

Best regards, Marco.


--
Marco Manca, MD
University of Maastricht
Faculty of Health, Medicine and Life Sciences (FHML)
Cardiovascular Research Institute (CARIM)

Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands)
Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08,  Maastricht University Medical Center, P. Debyelaan 25, 6229  HX Maastricht

E-mail: m.manca at maastrichtuniversity.nl
Office telephone: +31(0)433874633
Personal mobile: +31(0)626441205
Twitter: @markomanka


*********************************************************************************************************************

This email and any files transmitted with it are confidential and solely for the use of the intended recipient.

It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for

delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED.

If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA

*********************************************************************************************************************
________________________________________
From: Ina Hoeschele [inah at vbi.vt.edu]
Sent: Wednesday, 24 November 2010 21:46
To: Manca Marco (PATH)
Cc: bioconductor at stat.math.ethz.ch; Sean Davis
Subject: Re: [BioC] R: R: Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

yes - thank you, Marco, for expanding on my question, and thank you, Sean, for your prompt answer (unfortunately what I expected, so it seems I/we cannot afford giving up on Ingenuity for now ...)
Ina

----- Original Message -----
From: "Manca Marco (PATH)" <m.manca at maastrichtuniversity.nl>
To: "Sean Davis" <sdavis2 at mail.nih.gov>
Cc: bioconductor at stat.math.ethz.ch
Sent: Wednesday, November 24, 2010 3:35:10 PM
Subject: [BioC] R: R: Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db



Hi Sean,

thank you for your insightful reply.

Best regards, Marco

________________________________________
From: seandavi at gmail.com [seandavi at gmail.com] on behalf of Sean Davis [sdavis2 at mail.nih.gov]
Sent: Wednesday, 24 November 2010 21:27
To: Manca Marco (PATH)
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: R: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

On Wed, Nov 24, 2010 at 1:48 PM, Manca Marco (PATH) <m.manca at maastrichtuniversity.nl> wrote:

Dear Sean,

thank you for your note on the difference between Ingenuity Pathway analysis and GO/KEGG analysis.

I would also hope to receive feedback on my question, which was rather different from that of the lady you answered. It concerns the somewhat more "raw" annotation of probes (LocusLink/Entrez IDs, UniProt, etc.), which I am having a hard time with: how would you explain (if you have any insight at all, of course) the difference in the number of annotated probes between Bioconductor (around 27000 with either illuminaHumanv3BeadID.db or lumiHumanAll.db) and Ingenuity (around 47000-48000 according to my colleague's experience), at least for the Illumina human ht12 v3 beadchip?


Hi, Manca.

The annotation process is pretty straightforward for Bioconductor (although one of the Seattle folks or Pan may correct my comments slightly).  The sequence identifiers (typically RefSeq or GenBank identifiers) from the manufacturer are mapped via Entrez Gene resources to an Entrez Gene Identifier.  The other information available in the annotation packages is then derived from the Entrez Gene Id.  If a sequence identifier does not map to an Entrez Gene ID, then there will not be further information available.  In this particular case, there are many EST sequences represented on the array.  If you take the Genbank accession of one of those probes that does not have further information (you can get these from the xxxxACCNUM object in the annotation packages) and put it into the NCBI Entrez website, you will likely see that it does not map to an Entrez Gene.  No rich annotation information will be obtained in this case.
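A minimal sketch of the check described above: list the probes that have an accession number but no Entrez Gene ID, so their accessions can be looked up by hand at NCBI. The two named vectors are toy stand-ins for the xxxxACCNUM and xxxxENTREZID mappings of an annotation package (assumed here to hold one value per probe); all probe names, accessions and gene IDs are invented.

```r
# Toy stand-ins for the ACCNUM and ENTREZID mappings, one value per probe.
# All identifiers are made up for illustration.
accnum <- c(probe_A = "NM_000001", probe_B = "AA000001", probe_C = "NM_000002")
entrez <- c(probe_A = "11",        probe_B = NA,         probe_C = "22")

# Probes whose sequence identifier did not resolve to an Entrez Gene ID.
unmapped <- names(entrez)[is.na(entrez)]

# Accessions to paste into the NCBI Entrez website for a manual check.
to_check <- accnum[unmapped]
print(to_check)

# Share of probes in the package that carry an Entrez Gene ID.
print(mean(!is.na(entrez)))
```

If most of the accessions in to_check turn out to be ESTs with no gene assignment, that would be consistent with the explanation given here.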

I do not know how IPA does its annotation, but it could be significantly different from the one described for Bioconductor.  Perhaps someone with more experience with IPA will be able to help here.  Alternatively, you may contact the IPA technical support to see how it is done.

And most important, how is an investigator supposed to decide which tool should be used? Of course I could test them and then go to the bench to verify my results, but this would increase my expenses and would delay my publications... and most likely wouldn't give me a general principle to rely on in this situation the next time I have to perform a microarray analysis...


Unfortunately, what is a good tool (in the sense that it teaches you something of biological importance) in one situation might not be a good one for the next.  And, of course, you may need to verify results at the bench.  And as for "general principles" to rely on, you will need to evaluate the tools and methods you use in the context of the experiment--there is not always one right answer.  Finally, often two tools that purport to perform the same function are actually doing two very different things in practice, making a priori evaluation of effectiveness difficult.

Hope this helps a bit.

Sean


Thank you in advance for your attention. Best regards, Marco


________________________________________
From: Manca Marco (PATH)
Sent: Wednesday, 24 November 2010 17:05
To: J.Oosting at lumc.nl; bioconductor at stat.math.ethz.ch
Cc: mark.dunning at gmail.com
Subject: R: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

Dear Jan,

thank you for your prompt reply.

May I bluntly ask why these probes are included on the chip, if that is so?

I have, however, been comparing the results I obtain in BioConductor with those of a colleague who is analyzing data from the same chip using Ingenuity Pathway Analysis, and she is apparently missing only a few hundred annotations rather than my tens of thousands... Is Ingenuity doing something wrong here (like attributing annotations based on imperfect alignments?), or should I abide by those results and leave R/BioConductor for these datasets?

Thank you in advance for any insight you will share with me.

Best regards, Marco


________________________________________
From: J.Oosting at lumc.nl [J.Oosting at lumc.nl]
Sent: Wednesday, 24 November 2010 16:47
To: Manca Marco (PATH); bioconductor at stat.math.ethz.ch
Subject: RE: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db

This is normal for this chiptype. The HT12 contains a lot of probes that
have no proper annotation. These are mostly ESTs that have been
submitted to Genbank at some time, but have never been properly
attributed to any gene.

For most types of analysis these extra probes are basically worthless,
so I usually exclude them before the statistical analysis.

Jan
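A minimal sketch of the filtering step Jan mentions, dropping probes without a gene annotation before the statistical analysis. The expression matrix and the probe-to-Entrez vector are toy data; in a real analysis the vector would come from the annotation package's ENTREZID mapping, and the object names here are only illustrative.

```r
# Toy expression matrix: 4 probes x 3 samples (values are random).
set.seed(1)
expr <- matrix(
  rnorm(12), nrow = 4,
  dimnames = list(paste0("probe_", LETTERS[1:4]), paste0("sample_", 1:3))
)

# Toy probe -> Entrez Gene ID mapping; NA marks an unannotated probe.
entrez <- c(probe_A = "11", probe_B = NA, probe_C = "22", probe_D = NA)

# Keep only the rows whose probe has an Entrez Gene ID.
keep <- !is.na(entrez[rownames(expr)])
expr_annotated <- expr[keep, , drop = FALSE]

print(dim(expr))
print(dim(expr_annotated))
```

Filtering before the analysis also reduces the number of tests, which in turn eases the multiple-testing correction.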


> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Manca Marco (PATH)
> Sent: Wednesday, 24 November 2010 16:01
> To: bioconductor mailing list
> Subject: [BioC] Illumina human ht12 v3 beadchip and illuminaHumanv3BeadID.db
> Importance: High
>
>
>
> Dearest BioConductors,
>
> good afternoon.
>
> I request your assistance on an issue about which I have found a few old
> posts, but I can't manage to find the solution.
>
> I am analyzing an experiment performed with Illumina's "human ht12 v3
> beadchip" and I am now trying to perform some GO and pathway analysis to
> make sense of the results I have obtained.
>
> The package "of choice" for annotating this chip should be
> illuminaHumanv3BeadID.db (but I get similar results with lumiHumanAll.db
> after converting Illumina probe IDs to NuIDs): the chip apparently has
> 48803 probes, while I can obtain annotations for only roughly 27500 of them
>
> > qcdata = capture.output(illuminaHumanv3BeadID())
> > head(qcdata, 35)
>  [1] "Quality control information for illuminaHumanv3BeadID:"
>  [2] ""
>  [3] ""
>  [4] "This package has the following mappings:"
>  [5] ""
>  [6] "illuminaHumanv3BeadIDACCNUM has 27570 mapped keys (of 27570 keys)"
>  [7] "illuminaHumanv3BeadIDALIAS2PROBE has 67274 mapped keys (of 109070 keys)"
>  [8] "illuminaHumanv3BeadIDCHR has 25726 mapped keys (of 27570 keys)"
>  [9] "illuminaHumanv3BeadIDCHRLENGTHS has 25 mapped keys (of 25 keys)"
> [10] "illuminaHumanv3BeadIDCHRLOC has 24916 mapped keys (of 27570 keys)"
> [11] "illuminaHumanv3BeadIDCHRLOCEND has 24916 mapped keys (of 27570 keys)"
> [12] "illuminaHumanv3BeadIDENSEMBL has 24681 mapped keys (of 27570 keys)"
> [13] "illuminaHumanv3BeadIDENSEMBL2PROBE has 17009 mapped keys (of 19892 keys)"
> [14] "illuminaHumanv3BeadIDENTREZID has 25726 mapped keys (of 27570 keys)"
> [15] "illuminaHumanv3BeadIDENZYME has 2890 mapped keys (of 27570 keys)"
> [16] "illuminaHumanv3BeadIDENZYME2PROBE has 857 mapped keys (of 901 keys)"
> [17] "illuminaHumanv3BeadIDGENENAME has 25726 mapped keys (of 27570 keys)"
> [18] "illuminaHumanv3BeadIDGO has 22807 mapped keys (of 27570 keys)"
> [19] "illuminaHumanv3BeadIDGO2ALLPROBES has 11021 mapped keys (of 11236 keys)"
> [20] "illuminaHumanv3BeadIDGO2PROBE has 8010 mapped keys (of 8245 keys)"
> [21] "illuminaHumanv3BeadIDMAP has 25589 mapped keys (of 27570 keys)"
> [22] "illuminaHumanv3BeadIDOMIM has 17688 mapped keys (of 27570 keys)"
> [23] "illuminaHumanv3BeadIDPATH has 7029 mapped keys (of 27570 keys)"
> [24] "illuminaHumanv3BeadIDPATH2PROBE has 220 mapped keys (of 220 keys)"
> [25] "illuminaHumanv3BeadIDPFAM has 25262 mapped keys (of 27570 keys)"
> [26] "illuminaHumanv3BeadIDPMID has 25101 mapped keys (of 27570 keys)"
> [27] "illuminaHumanv3BeadIDPMID2PROBE has 231447 mapped keys (of 248847 keys)"
> [28] "illuminaHumanv3BeadIDPROSITE has 25262 mapped keys (of 27570 keys)"
> [29] "illuminaHumanv3BeadIDREFSEQ has 25726 mapped keys (of 27570 keys)"
> [30] "illuminaHumanv3BeadIDSYMBOL has 25726 mapped keys (of 27570 keys)"
> [31] "illuminaHumanv3BeadIDUNIGENE has 25256 mapped keys (of 27570 keys)"
> [32] "illuminaHumanv3BeadIDUNIPROT has 24730 mapped keys (of 27570 keys)"
> [33] ""
> [34] ""
> [35] "Additional Information about this package:"
>
> My sessionInfo() is as follows:
>
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>  [7] LC_PAPER=en_US.utf8       LC_NAME=C
>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>  [1] SPIA_1.4.0                     RCurl_1.4-3
>  [3] bitops_1.0-4.1                 annaffy_1.18.0
>  [5] KEGG.db_2.3.5                  GO.db_2.3.5
>  [7] limma_3.2.3                    illuminaHumanv3BeadID.db_1.4.1
>  [9] org.Hs.eg.db_2.3.6             lumi_1.12.4
> [11] MASS_7.3-7                     RSQLite_0.9-2
> [13] DBI_0.2-5                      preprocessCore_1.8.0
> [15] mgcv_1.6-2                     affy_1.24.2
> [17] annotate_1.24.1                AnnotationDbi_1.8.2
> [19] Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0      grid_2.10.1        lattice_0.18-3     Matrix_0.999375-44
> [5] nlme_3.1-97        tools_2.10.1       xtable_1.5-6
>
>
>
>
> Thank you in advance for any hints on how to solve the problem (or on why
> I see this discrepancy)
>
> Best regards, Marco
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
