[BioC] FW: PLANdbAffy + Alternative Exon Annotation +XPS, aroma, oligo, RMAExpress

cstrato cstrato at aon.at
Tue Dec 7 23:07:00 CET 2010


Dear Ramil,

Please let me mention that handling 19 HuExon arrays on your notebook 
using one of the Bioconductor packages should not be a problem as you 
can see on the Bioconductor workflows site:
http://www.bioconductor.org/help/workflows/oligo-arrays/#pre-processing-resources
It says for example that xps "will run on conventional desktop computers".

Regarding the format for annotation:
Package xps requires the annotation.csv file format from Affymetrix.
Other BioC packages usually require one of the metadata packages found in:
http://www.bioconductor.org/help/bioc-views/release/data/annotation/
To my knowledge these metadata are usually created by package AnnotationDbi.
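
To illustrate the format requirement, here is a minimal Python sketch (not part of xps or any Bioconductor package) that compares the header row of a candidate annotation file with the header row of a reference Affymetrix annotation file. The sample contents, the column names other than "level", "Probes" and "Probe_Sets", and the assumption that metadata lines start with '#' are hypothetical placeholders; real files have many more columns.

```python
import csv
import io

# Toy stand-ins for real files; contents are hypothetical.
REFERENCE_CSV = (
    "#%toy-metadata=ignored\n"
    '"probeset_id","seqname","strand","level"\n'
    '"1001","chr1","+","core"\n'
)
CANDIDATE_TSV = "Probes\tProbe_Sets\n101\tPS_A\n"


def read_header(text, delimiter=","):
    """Return the column names from the first non-comment, non-blank line."""
    # Assumption: metadata lines, if any, start with '#'.
    lines = (ln for ln in io.StringIO(text)
             if ln.strip() and not ln.startswith("#"))
    return next(csv.reader(lines, delimiter=delimiter), [])


def compare_columns(reference, candidate):
    """Report which reference columns are missing and which are extra."""
    return {
        "missing": [c for c in reference if c not in candidate],
        "extra": [c for c in candidate if c not in reference],
        "identical": list(reference) == list(candidate),
    }


if __name__ == "__main__":
    ref = read_header(REFERENCE_CSV)
    cand = read_header(CANDIDATE_TSV, delimiter="\t")
    print(compare_columns(ref, cand))
```

A non-empty "missing" list would suggest that the candidate file cannot simply replace the Affymetrix annotation file.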

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._


On 12/7/10 4:21 PM, Ramil Nurtdinov wrote:
> Dear colleagues
>
> My experience with R/Bioconductor and the Affymetrix Human Exon 1.0 ST
> array started with the oligo package. Unfortunately, for my 19 HuExon 1.0
> arrays R asks for approximately 6-7 gigabytes of memory, while the RMA
> algorithm in Affymetrix Expression Console takes 40 minutes on my Sony
> Vaio notebook. Second, there was no good annotation for this chip in R,
> except X:Map, my competitor for the paper :))
>
> So I solved the first problem with Expression Console, and for the second
> problem we developed PLANdbAffy:
> http://nar.oxfordjournals.org/content/38/suppl_1/D726.long
>
> I am now finishing the Ensembl plus hg19 version of the database. I
> understand that Bioconductor is widely used in the scientific world, but
> my workload is rather heavy because of many new projects.
>
> If somebody gives me the required annotation format, I can produce the
> corresponding database summary file.
>
> Yours sincerely,
> Ramil Nurtdinov, PhD
>
> On 12/7/10, B.Misovic at lumc.nl <B.Misovic at lumc.nl> wrote:
>> Dear Ramil,
>>
>> I see I forgot to add you to the email below, which I sent to the
>> Bioconductor mailing list and our collaborators in Poland, just in
>> case you have some comments.
>>
>> Best,
>> Branko
>>
>>
>> ________________________________
>>
>> From: Misovic, B. (TOXGEN)
>> Sent: 07 December 2010 15:09
>> To: 'roman.jaksik at polsl.pl'; 'bioconductor at r-project.org'
>> Cc: 'cstrato'
>> Subject: PLANdbAffy + Alternative Exon Annotation
>> +XPS,aroma,oligo,RMAExpress
>>
>>
>>
>> Dear Roman, all
>>
>>
>>
>>    Recently we tried your version of the annotation files for the Gene
>> 1.0 ST array that your team built from the PLANdbAffy DB. I encountered
>> some problems, so I hope you can help.
>>
>>
>>
>> You provide nice CDF and Affymetrix PGF/CLF files, but the PGF/CLF files
>> were not useful in the Bioconductor packages for Affymetrix Exon/Gene
>> type arrays, namely oligo & XPS, as they require an annotation file in
>> csv format. I tried the annotation csv file from Affymetrix and after
>> that the one from the PLANdbAffy DB. The PLANdbAffy csv file is very
>> different from the Affymetrix one, so import is not possible (the csv
>> file on the website is actually TAB-delimited instead of
>> comma-delimited, so the problem already starts there, and it requires
>> reformatting).
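
The delimiter part of the reformatting mentioned above can be sketched in a few lines of Python. This is only an illustration with a made-up three-column sample (the "Note" column is hypothetical); it changes the delimiter and nothing else, so the resulting file would still lack the Affymetrix columns that xps expects.

```python
import csv
import io


def tab_to_comma(src, dst):
    """Copy TAB-delimited rows from src to dst as comma-delimited CSV.

    src and dst are text file objects; fields that contain a comma are
    quoted automatically by csv.writer. Returns the number of rows copied.
    """
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, lineterminator="\n")
    rows = 0
    for row in reader:
        writer.writerow(row)
        rows += 1
    return rows


if __name__ == "__main__":
    # Hypothetical sample; a real file would be opened with
    # open(path, newline="") for both reading and writing.
    sample = io.StringIO("Probes\tProbe_Sets\tNote\n101\tPS_A\ta, b\n")
    out = io.StringIO()
    tab_to_comma(sample, out)
    print(out.getvalue(), end="")
```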
>>
>> Christian from XPS was kind enough to inform me that:
>>
>>
>>> ... PLANdbAffy annotation columns have nothing to do with the
>>> Affymetrix annotation columns. Thus xps will not read these
>>> annotation files. Alternative annotation files must contain exactly
>>> the same columns as the Affymetrix annotation files.
>>
>>> For whole genome and exon arrays it is not possible to use only the
>>> PGF-files w/o the annotation files, since I extract most of the
>>> important information from the probeset-annotation file first, so this
>>> file is absolutely essential. For example, column "level" contains the
>>> information Core/Extended/Full, see the corresponding annotation README
>>> files for an explanation of all columns.
>>
>>> xps error you get simply says that their PGF-file does not contain the
>>> AFFX controls, so maybe adding the AFFX controls to their PGF-file
>>> might help. However, as you mention, they use their own Probesetids,
>>> which will not match the Probesetids of the Affymetrix annotation
>>> files, thus it may not work anyhow.
>>
>>> It is not quite clear to me why they created their own PGF-file. The
>>> Affymetrix PGF-file contains only 1-4 probes for each probeset, where
>>> each exon consists of one or more probesets, thus the probability that
>>> a probe within a probeset is not correct should be pretty small.
>>> However, a probeset could be mapped to a wrong exon/gene or no gene at
>>> all, so it should be sufficient to correct the Affymetrix annotation
>>> files.
>>
>>
>>
>>     Tools like RMAExpress, EC, and aroma.affymetrix can work with the
>> CDF only. So after using RMAExpress (in command-line mode) I did get an
>> expression matrix out, but I could not link the 19532 probeset ids to the
>> PLANdbAffy annotation csv file to collect basic gene information. What I
>> did was first load the full annotation file (not filtered) from
>> PLANdbAffy:
>> http://affymetrix2.bioinf.fbb.msu.ru/files.html
>>
>> and search the 2nd column (Probe_Sets) for the ids obtained after RMA; I
>> found 0. Then I tried the 1st column (Probes) and found 8664, but I would
>> expect the opposite situation?
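
The column search described above can be reproduced with a small Python sketch that counts, for every column of a delimited annotation table, how many of the ids from the expression matrix occur in it. The table contents and ids below are invented toy values; only the column names "Probes" and "Probe_Sets" come from the description, and the TAB delimiter is an assumption about the file layout.

```python
import csv
import io


def count_id_matches(annotation_text, ids, delimiter="\t"):
    """Count, per column, how many distinct ids occur in that column."""
    wanted = {str(i).strip() for i in ids}
    reader = csv.DictReader(io.StringIO(annotation_text), delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    found = {name: set() for name in fieldnames}
    for row in reader:
        for name in fieldnames:
            value = (row[name] or "").strip()  # short rows yield None
            if value in wanted:
                found[name].add(value)
    return {name: len(values) for name, values in found.items()}


if __name__ == "__main__":
    # Toy annotation table and toy ids, for illustration only.
    table = "Probes\tProbe_Sets\n101\tPS_A\n102\tPS_A\n103\tPS_B\n"
    print(count_id_matches(table, ["101", "103", "999"]))
```

A result in which the ids match the probe column rather than the probeset column would point to the two files using different identifier schemes, not to a formatting problem.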
>>
>>
>>
>> So Roman, can you please:
>> 1) advise how to get the real ids after an RMAExpress run?
>> 2) say whether you plan to build an annotation csv file as Affymetrix
>> does, so that other Bioconductor software (oligo & XPS) can use it?
>> 3) comment on Christian's feedback.
>>
>>
>>
>> Btw. Christian, how come RMAExpress, EC, and aroma.affymetrix can work
>> with CDFs only, while oligo & XPS require extra annotation? From what I
>> gather (after peeking into the CDF and PGF files), they show which probes
>> belong to each probe_set. So for probe_set level analysis (or more
>> exon-like analysis) the PGF/CLF files alone seem to be enough?
>>
>>
>>
>> For the bioc list, just to bring attention to this article & DB:
>>
>>
>>
>> PLANdbAffy: probe-level annotation database for Affymetrix expression
>> microarrays, Ramil N. Nurtdinov et al.
>>
>> http://nar.oxfordjournals.org/content/38/suppl_1/D726.full
>>
>>
>>
>> http://affymetrix2.bioinf.fbb.msu.ru/
>>
>>
>>
>> Maybe some of the BioC experts have comments about it?
>>
>>
>>
>> Best,
>>
>> Branko
>>
>>
>>
>> --------------------------
>>
>> Branislav Misovic,
>>
>> Department of Toxicogenetics
>>
>> Leiden University Medical Center
>>
>> Einthovenweg 20, 2333 ZC Leiden
>>
>> PO.box 9600, Building2,Room:T3-11
>>
>> 2300 RC Leiden
>>
>> The Netherlands
>>
>> Phone: +31 71 526 9636
>>
>> Mob: 0653135855
>>
>> E-mail:
>>
>> b.misovic at lumc.nl
>>
>> braniti at gmail.com
>>
>>
>>
>>
>


