[BioC] annotation package ?

Marc Carlson mcarlson at fhcrc.org
Tue Aug 23 02:04:29 CEST 2011


Oops, pasted the wrong link before.  You want this one:

http://www.bioconductor.org/packages/2.8/bioc/vignettes/AnnotationDbi/inst/doc/SQLForge.pdf
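
For the two steps in the quoted message below (get the probe-to-gene mappings, then build a custom package), here is a rough sketch in R. It is not taken from the vignette. The toy feature_table and its IDs are made up so the snippet runs offline, build_package and the "gpl1528" prefix are hypothetical names, and the makeDBPackage arguments are assumed to match the vignette's example, so please check them against the PDF above. With real data you would start from fData(getGEO("GSE2020")[[1]]) as in Sean's example further down.

```r
# Toy stand-in for fData(gse[[1]]); the probe IDs and UniGene values are
# made up for illustration and are NOT real GPL1528 annotations.
feature_table <- data.frame(
  ID      = c("probe_1", "probe_2", "probe_3", "probe_4"),
  UNIGENE = c("Hs.0001", "", "Hs.0002", NA),
  stringsAsFactors = FALSE
)

# Keep only probes that carry a UniGene ID (control and blank spots have
# none), and reduce the table to the two columns needed for the mapping.
has_id <- !is.na(feature_table$UNIGENE) & nzchar(feature_table$UNIGENE)
probe_map <- feature_table[has_id, c("ID", "UNIGENE")]

# Write a two-column, tab-delimited file with no header and no quoting
# (assumed input format: probe ID, then one identifier per line).
map_file <- file.path(tempdir(), "GPL1528_probe2unigene.txt")
write.table(probe_map, file = map_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Hypothetical wrapper around makeDBPackage(); it is only defined here, not
# called, because it needs AnnotationDbi plus the human.db0 base package.
# Assumption: baseMapType = "ug" marks the identifiers as UniGene IDs.
# In later Bioconductor releases this function lives in AnnotationForge.
build_package <- function(map_file, out_dir = tempdir()) {
  AnnotationDbi::makeDBPackage(
    "HUMANCHIP_DB",
    affy = FALSE,
    prefix = "gpl1528",
    fileName = map_file,
    baseMapType = "ug",
    outputDir = out_dir,
    version = "1.0.0",
    manufacturer = "NCI/ATC",
    chipName = "NCI/ATC Hs-OperonV2",
    manufacturerUrl = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1528"
  )
}
```

Only the mapping file depends on the platform. Calling build_package(map_file) should then write a source package into out_dir, which you can install with R CMD INSTALL.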


    Marc



On 08/22/2011 04:55 PM, Marc Carlson wrote:
> Hi Jing,
>
> If you need a chip package that is not presently hosted, you can 1) 
> retrieve the probe to gene mappings from the people who made the 
> platform and then 2) follow the instructions in this vignette to 
> generate a custom package:
>
> http://www.bioconductor.org/packages/2.8/bioc/vignettes/AnnotationDbi/inst/doc/SQLForge.R 
>
>
>
>   Marc
>
>
> On 08/22/2011 04:07 PM, Sean Davis wrote:
>> Hi, Jing.
>>
>> You could try:
>>
>> http://bioconductor.org/packages/release/data/annotation/html/OperonHumanV3.db.html 
>>
>>
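>> Note that this might not be the right package for your platform (it
>> covers the Operon V3 set, while GPL1528 is listed as Hs-OperonV2), but
>> the Operon set was in common use a few years ago.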
>>
>> If this isn't what you need, note that GEOquery automatically
>> retrieves the platform annotation from NCBI GEO.  See the example
>> below, which uses a GSE run on GPL1528.  You can use the AnnotationDbi
>> package to make your own annotation package from these annotations.
>> In particular, for GPL1528, the UniGene IDs are included.
>>
>> Hope that helps.
>>
>> Sean
>>
>>
>>
>>> library(GEOquery)
>> Loading required package: Biobase
>>
>> Welcome to Bioconductor
>>
>>    Vignettes contain introductory material. To view, type
>>    'browseVignettes()'. To cite Bioconductor, see
>>    'citation("Biobase")' and for packages 'citation("pkgname")'.
>>
>> Setting options('download.file.method.GEOquery'='curl')
>>> gse = getGEO("GSE2020")
>> Found 1 file(s)
>> GSE2020_series_matrix.txt.gz
>> trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE2020/GSE2020_series_matrix.txt.gz'
>> ftp data connection made, file length 518963 bytes
>> opened URL
>> ==================================================
>> downloaded 506 Kb
>>
>> File stored at:
>> /tmp/Rtmpdgx7wJ/GPL1528.soft
>>
>>> gse
>> $GSE2020_series_matrix.txt.gz
>> ExpressionSet (storageMode: lockedEnvironment)
>> assayData: 21794 features, 10 samples
>>    element names: exprs
>> protocolData: none
>> phenoData
>>    sampleNames: GSM36482 GSM36483 ... GSM36491 (10 total)
>>    varLabels: title geo_accession ... data_row_count (31 total)
>>    varMetadata: labelDescription
>> featureData
>>    featureNames: 1140849_1 1140850_1 ... 1298880_1 (21794 total)
>>    fvarLabels: ID MADB_WELL_ID ... SPOT_ID (8 total)
>>    fvarMetadata: Column Description labelDescription
>> experimentData: use 'experimentData(object)'
>> Annotation: GPL1528
>>
>>> head(fData(gse[[1]]))
>>                   ID MADB_WELL_ID   OLIGO_ID GENE UNIGENE
>> 1140849_1 1140849_1      1140849 SptRpt-2a1
>> 1140850_1 1140850_1      1140850 SptRpt-2a2
>> 1140851_1 1140851_1      1140851 SptRpt-2a3
>> 1140852_1 1140852_1      1140852 SptRpt-2a4
>> 1140853_1 1140853_1      1140853 SptRpt-2a5
>> 1140854_1 1140854_1      1140854 SptRpt-2a6
>>
>>           DESCRIPTION
>> 1140849_1 Human Beta-Actin PCR Product Human Beta-Actin 100ng/ul
>> 1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
>> 1140851_1 PCR Product 5 (LTP6) A. thaliana lipid transfer protien 6
>> 1140852_1 3XSSC
>> 1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
>> 1140854_1 Oligonucleotide 5 (LTP6) A. thaliana lipid transfer protien 6
>>
>>           GB_LIST
>> 1140849_1
>> 1140850_1
>> 1140851_1
>> 1140852_1
>> 1140853_1
>> 1140854_1
>>
>>           SPOT_ID
>> 1140849_1 Human Beta-Actin PCR Product Human Beta-Actin 100ng/ul
>> 1140850_1 PCR Product 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
>> 1140851_1 PCR Product 5 (LTP6) A. thaliana lipid transfer protien 6
>> 1140852_1 3XSSC
>> 1140853_1 Oligonucleotide 1 (Cab) A. thaliana photosystem 1 chlorophyll a/b-binding protein
>> 1140854_1 Oligonucleotide 5 (LTP6) A. thaliana lipid transfer protien 6
>>
>>
>> On Mon, Aug 22, 2011 at 6:57 PM, Jing Huang <huangji at ohsu.edu> wrote:
>>> Dear All members,
>>>
>>> I need to analyze a dataset from the GEO database. The data was
>>> generated on the platform
>>> GPL1528<http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL1528>:
>>> NCI/ATC Hs-OperonV2. I would use hgu133plus2.db if the data had been
>>> generated on an Affymetrix platform.
>>>
>>> Can somebody advise me what R annotation package I should use to 
>>> solve my problem in this case?
>>>
>>>
>>> Many Thanks
>>>
>>> Jing
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: 
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>


