[BioC] Annotation Tools package

Alexandre Kuhn kuhnam at mail.nih.gov
Wed Aug 12 15:43:44 CEST 2009


Hi Brad,

A couple of comments on your code (see below).

> -----Original Message-----
> 
> Affy Probeset -> Entrez:
> 
> annotationFile <- "HG-U133_Plus_2.na29.annot.csv"
> dataDirectory <- system.file("data", package = "annotationTools")
> annotation_HGU133Plus2 <- read.csv(paste(dataDirectory, annotationFile,
> + sep = "/"), colClasses = "character")

I assume here that you have the annotation file
"HG-U133_Plus_2.na29.annot.csv" in the 'data' subdirectory of the
annotationTools installation directory. I used this code in the vignette to
load an example annotation file (stored in 'data'). You can however save the
annotation file anywhere on your file system.
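
As a minimal sketch of pointing R at a file stored outside the package
directory: the folder below is only an example location, and the names
annotationDir and annotationPath are introduced here purely for illustration.

```r
# Example folder only; replace with wherever the annotation file was saved.
annotationDir  <- "C:/Annotations"
annotationFile <- "HG-U133_Plus_2.na29.annot.csv"

# file.path() joins the pieces with "/" on every platform.
annotationPath <- file.path(annotationDir, annotationFile)

# Warn early if the file is not where we think it is.
if (!file.exists(annotationPath)) {
  message("Annotation file not found at: ", annotationPath)
}
```

The resulting annotationPath can then be given to read.csv() in place of the
paste(dataDirectory, ...) construct used in the vignette.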

Second, I am not sure your annotation file loaded correctly, since Affymetrix
annotation files now begin with a header. To skip the header (lines starting
with the hash sign, '#'), please use the following command (assuming, for
instance, that you work under Windows and saved your annotation file under
"C:/Annotations"):

>annotation_HGU133Plus2 <-
read.csv("C:/Annotations/HG-U133_Plus_2.na29.annot.csv", colClasses =
"character", comment.char='#')

You can check the size of annotation_HGU133Plus2 with 

>dim(annotation_HGU133Plus2)

to make sure that you now have what you expected (that is, a data.frame of
54675 rows and 41 columns).
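
To see what comment.char does without needing the real file, here is a small
self-contained sketch. The header lines, column names and values are invented
for illustration and are not taken from an actual Affymetrix file.

```r
# Toy file contents: two '#' header lines, then a normal CSV table.
toy_lines <- c(
  "#%note=invented header line 1",
  "#%note=invented header line 2",
  "\"Probe Set ID\",\"Entrez Gene\"",
  "\"ps_1\",\"101\"",
  "\"ps_2\",\"202\""
)

# comment.char = "#" makes read.csv() skip every line starting with '#',
# so the first line it actually parses is the column-name row.
annot <- read.csv(text = paste(toy_lines, collapse = "\n"),
                  colClasses = "character", comment.char = "#")

dim(annot)    # 2 rows and 2 columns for this toy table
names(annot)  # column names, made syntactic by read.csv()
```

If dim() on the real file reports far fewer columns than expected, the header
lines were most likely read as data.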

I changed the vignette accordingly some time ago and the change will be
incorporated in the next Bioconductor release.
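
For readers new to R, the kind of lookup requested by the getANNOTATION()
calls quoted below can be sketched in base R. This is only an illustration
under my own assumptions, not the code inside annotationTools: the helper
lookup_annotation() and the toy table are hypothetical, and only the argument
names mirror those used in the quoted calls.

```r
# Toy annotation table (all values invented for illustration).
annot <- data.frame(
  ProbeSetID = c("ps_1", "ps_2", "ps_3"),
  EntrezGene = c("101", "202 /// 303", "---"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT part of annotationTools.
lookup_annotation <- function(ids, table, identifierCol, annotationCol,
                              noAnnotationProvidedSymbol = "---",
                              sep = " /// ") {
  rows   <- match(ids, table[[identifierCol]])  # NA when an ID is absent
  values <- table[[annotationCol]][rows]
  # Treat the "no annotation provided" marker as missing.
  values[!is.na(values) & values == noAnnotationProvidedSymbol] <- NA
  # Split multi-valued entries; fixed = TRUE takes sep literally.
  result <- strsplit(values, sep, fixed = TRUE)
  names(result) <- ids
  result
}

allPS  <- annot[, 1]  # every probe set ID in the table
entrez <- lookup_annotation(allPS, annot, identifierCol = 1, annotationCol = 2)
```

Note that R names are case-sensitive, so an object created as allPS must be
referred to with exactly that capitalisation in later calls.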


Alexandre




> allPS<-annotation_HGU133Plus2[,1]
> getANNOTATION(allPS, annotation_HGU133Plus2, diagnose = FALSE,
> identifierCol = 1, annotationCol = 19, noAnnotationSymbol = NA,
> noAnnotationProvidedSymbol = "---", sep = " /// ")
> entrez <- getANNOTATION(allPS, annotation_HGU133Plus2, diagnose =
> FALSE, identifierCol = 1, annotationCol = 19)
> write.matrix(entrez, file = "humanentrez.csv", sep = " ")
> 
> Entrez -> illumina:
> 
> annotationFileIll <- "HumanRef-8_V3_0_R2_11282963_Ab.csv"
> dataDirectory <- system.file("data", package = "annotationTools")
> annotation_Illumina <- read.csv(paste(dataDirectory, annotationFileIll,
> + sep = "/"), colClasses = "character")
> getANNOTATION(entrez, annotation_Illumina, diagnose = FALSE,
> identifierCol = 9, annotationCol = 14, noAnnotationSymbol = NA,
> noAnnotationProvidedSymbol = "---", sep = " /// ")
> illuminaID <- getANNOTATION(entrez, annotation_Illumina, diagnose =
> FALSE, identifierCol = 9, annotationCol = 14)
> write.matrix(illuminaID, file = "illuminaID.csv", sep = " ")
> 
> 
> It may not have been the most perfect use of the code but it seems to
> work (we are still learning).
> 
> Thanks for your help.  If there are any suggestions you feel are
> important, please let us know.
> 
> Kind regards,
> Brad
> 
> 
> --
> Brad Ander, PhD
> M.I.N.D. Institute
> University of California at Davis
> Room 2434
> 2805 50th Street
> Sacramento, CA  95817
> 
> 
> 
> 2009/8/10 Alexandre Kuhn <alexandre.kuhn at epfl.ch>
> >
> > Hi Yingfang,
> > Once you have loaded your Affymetrix annotation into R (assume it is
> > contained in an R object named 'annot') you could for instance select all
> > probe sets by subsetting the data.frame so as to select the first column
> >
> > > allps<-annot[,1]
> >
> > I am not sure this answers your question though. Could you please send some
> > lines of code to help me understand what is going wrong?
> >
> > Best, Alexandre
> >
> > -----Original Message-----
> > From: bioconductor-bounces at stat.math.ethz.ch
> > [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Yingfang Tian
> > Sent: Thursday, 6 August 2009 19:23
> > To: bioconductor at stat.math.ethz.ch
> > Cc: Brad Ander
> > Subject: Re: [BioC] Annotation Tools package
> >
> > Dear Dr Kuhn:
> >
> > We are Yingfang Tian and Brad Ander from the University of California at
> > Davis.  We are working on the cross-platform analysis of 3 platforms:
> > Illumina Human Ref-8, Affymetrix U133 plus 2 array, and Affymetrix human
> > Exon array.
> >
> > We are trying to use your annotationTools package in R and are able to at
> > least translate across probes from U133 arrays to Illumina, similar to the
> > example you give in the BMC Bioinformatics paper.  We were wondering how to
> > import large or entire numbers of probesets into the “myPS” object?  It may
> > be a basic R command, but unfortunately we rely on commercial software for
> > the majority of our analyses and have limited experience with R (hopefully
> > that can change on both fronts).
> >
> > From the paper, it seems that we can generate a list of Refseq IDs from the
> > Affy Probesets and then use this list (set it as the “myPS” object) to pull
> > out the Illumina Probe ID using the refseq column as the identifier column.
> > We can export all these with a simple write command.  Right now, we are
> > thinking of bridging to/from the Exon arrays with the Unigene.  I guess we
> > will have to see how that works.
> >
> > Again, we are having success when dealing with a few probesets, but there
> > must be a way to get ALL probesets.  Can you please help us with this?
> > Possibly with the example command syntax?  In the paper you mention mapping
> > all the mouse probes across platforms, so you must have had to deal with
> > this.
> >
> > We are likely wanting to try the cross species analysis in the near future
> > as well, so learning how to get past the limit of entering each
> > probe/gene/etc manually will be a big help.
> >
> > Kind regards,
> >
> > Yingfang and Brad
> >
> > --
> > Yingfang Tian, PhD
> > M.I.N.D. Institute
> > University of California at Davis
> > 2805 50th Street,Room 2434
> > Sacramento, CA  95817
> > Tel:916-703-0384
> >


