[BioC] annotation package ?

Marc Carlson mcarlson at fhcrc.org
Tue Aug 23 01:55:57 CEST 2011


Hi Jing,

If you need a chip package that is not presently hosted, you can
1) retrieve the probe-to-gene mappings from the people who made the
platform and then 2) follow the instructions in this vignette to
generate a custom package:

http://www.bioconductor.org/packages/2.8/bioc/vignettes/AnnotationDbi/inst/doc/SQLForge.R
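
A rough sketch of those two steps, under some assumptions: the mapping
file is taken to be a two-column, tab-delimited file with no header
(probe ID, then gene identifier), and write_probe_map() is a
hypothetical helper, not part of AnnotationDbi. The toy data frame
stands in for a real feature table with ID and UNIGENE columns, and its
UniGene IDs are made up. Dropping unannotated probes is a simplification
of this sketch; check the vignette for how unmapped probes should be
listed. The makeDBPackage() call is left commented out because it needs
AnnotationDbi and the human.db0 package installed (in later Bioconductor
releases the function lives in AnnotationForge); the prefix and chipName
values are placeholders.

```r
# Hypothetical helper (not part of AnnotationDbi): write a two-column,
# tab-delimited, header-less probe-to-gene mapping file.
write_probe_map <- function(features, path,
                            probe_col = "ID", gene_col = "UNIGENE") {
  probes <- as.character(features[[probe_col]])
  genes  <- as.character(features[[gene_col]])
  # Simplification: drop probes (e.g. control spots) with no gene identifier.
  keep <- !is.na(genes) & nzchar(genes)
  map <- data.frame(probe = probes[keep], gene = genes[keep],
                    stringsAsFactors = FALSE)
  # Keep the first row for each probe ID.
  map <- map[!duplicated(map$probe), , drop = FALSE]
  write.table(map, file = path, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(map)
}

# Toy stand-in for a platform feature table; the UniGene IDs are made up.
features <- data.frame(
  ID      = c("p1", "p2", "p3"),
  UNIGENE = c("Hs.0001", "", "Hs.0002"),
  stringsAsFactors = FALSE
)
map_file  <- tempfile(fileext = ".txt")
probe_map <- write_probe_map(features, map_file)

# Package build step (commented out; requires Bioconductor packages):
# library(AnnotationDbi)
# makeDBPackage("HUMANCHIP_DB", affy = FALSE, prefix = "gpl1528",
#               fileName = map_file, baseMapType = "ug",
#               outputDir = tempdir(), version = "1.0.0",
#               manufacturer = "Operon", chipName = "NCI/ATC Hs-OperonV2",
#               manufacturerUrl = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1528")
```

With real data, the features argument could be the feature table that
GEOquery returns for the series, provided its gene column actually holds
identifiers of the type named in baseMapType.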


   Marc


On 08/22/2011 04:07 PM, Sean Davis wrote:
> Hi, Jing.
>
> You could try:
>
> http://bioconductor.org/packages/release/data/annotation/html/OperonHumanV3.db.html
>
> Note that this might not be right, but the Operon set was in common
> use a few years ago.
>
> If this isn't what you need, note that GEOquery automatically
> retrieves the annotation data from NCBI GEO.  For an example using a
> GSE from GPL1528, see below.  You can use the AnnotationDbi package to
> make your own annotation packages based on these annotations.  In
> particular, for GPL1528, the UniGene IDs are included.
>
> Hope that helps.
>
> Sean
>
>
>
> > library(GEOquery)
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
>    Vignettes contain introductory material. To view, type
>    'browseVignettes()'. To cite Bioconductor, see
>    'citation("Biobase")' and for packages 'citation("pkgname")'.
>
> Setting options('download.file.method.GEOquery'='curl')
> > gse = getGEO("GSE2020")
> Found 1 file(s)
> GSE2020_series_matrix.txt.gz
> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE2020/GSE2020_series_matrix.txt.gz'
> ftp data connection made, file length 518963 bytes
> opened URL
> ==================================================
> downloaded 506 Kb
>
> File stored at:
> /tmp/Rtmpdgx7wJ/GPL1528.soft
>
> > gse
> $GSE2020_series_matrix.txt.gz
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 21794 features, 10 samples
>    element names: exprs
> protocolData: none
> phenoData
>    sampleNames: GSM36482 GSM36483 ... GSM36491 (10 total)
>    varLabels: title geo_accession ... data_row_count (31 total)
>    varMetadata: labelDescription
> featureData
>    featureNames: 1140849_1 1140850_1 ... 1298880_1 (21794 total)
>    fvarLabels: ID MADB_WELL_ID ... SPOT_ID (8 total)
>    fvarMetadata: Column Description labelDescription
> experimentData: use 'experimentData(object)'
> Annotation: GPL1528
>
> > head(fData(gse[[1]]))
>                   ID MADB_WELL_ID   OLIGO_ID GENE UNIGENE
> 1140849_1 1140849_1      1140849 SptRpt-2a1
> 1140850_1 1140850_1      1140850 SptRpt-2a2
> 1140851_1 1140851_1      1140851 SptRpt-2a3
> 1140852_1 1140852_1      1140852 SptRpt-2a4
> 1140853_1 1140853_1      1140853 SptRpt-2a5
> 1140854_1 1140854_1      1140854 SptRpt-2a6
>
>           DESCRIPTION
> 1140849_1 Human Beta-Actin PCR Product Human Beta-Actin 100ng/ul
> 1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
> 1140851_1 PCR Product 5 (LTP6) A. thaliana lipid transfer protien 6
> 1140852_1 3XSSC
> 1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
> 1140854_1 Oligonucleotide 5 (LTP6) A. thaliana lipid transfer protien 6
>            GB_LIST
> 1140849_1
> 1140850_1
> 1140851_1
> 1140852_1
> 1140853_1
> 1140854_1
>
>           SPOT_ID
> 1140849_1 Human Beta-Actin PCR Product Human Beta-Actin 100ng/ul
> 1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
> 1140851_1 PCR Product 5 (LTP6) A. thaliana lipid transfer protien 6
> 1140852_1 3XSSC
> 1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
> 1140854_1 Oligonucleotide 5 (LTP6) A. thaliana lipid transfer protien 6
>
>
> On Mon, Aug 22, 2011 at 6:57 PM, Jing Huang <huangji at ohsu.edu> wrote:
>> Dear All members,
>>
>> I need to analyze a dataset from the GEO database. The data were generated on the platform GPL1528 <http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1528>: NCI/ATC Hs-OperonV2. I would use hgu133plus2.db if the data had been generated on an Affymetrix platform.
>>
>> Can somebody advise me which R annotation package I should use in this case?
>>
>>
>> Many Thanks
>>
>> Jing
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>


