[BioC] Annotation Tools package

Brad Ander brad.ander.ucdavis at gmail.com
Fri Aug 14 01:54:54 CEST 2009


Thank you for the suggestions, Alexandre.

We were indeed placing the annotation files into the data directory of
annotationTools; we now just use our working directory by leaving
out:

>dataDirectory <- system.file("data", package = "annotationTools")
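
A minimal sketch of reading the file straight from the working directory,
assuming a copy of the annotation file with its header already removed.
The file name is the one used in this thread; `annotationPath` and the
`file.exists()` guard are additions for illustration only.

```r
# File name taken from the thread; assumed to sit in the working directory.
annotationFile <- "HG-U133_Plus_2.na29.annot.csv"

# Full path, built only to make explicit where R will look for the file.
annotationPath <- file.path(getwd(), annotationFile)

# Guard added for illustration so the snippet does not fail if the file is absent.
if (file.exists(annotationPath)) {
  annotation_HGU133Plus2 <- read.csv(annotationPath, colClasses = "character")
} else {
  message("Not found: ", annotationPath)
}
```

Since a relative file name is resolved against the working directory,
passing `annotationFile` directly to `read.csv()` should behave the same way.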

You are also correct about the header.  We got around this by deleting
the header in the annotation file, but your suggestion to skip the
comments marked '#' will make it convenient to just use the files as
provided.  As we had it, all rows/columns were read properly.
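
To illustrate the effect of `comment.char`, here is a small sketch on a toy
file. The header lines, column names and values are made up for the example
and are not taken from a real Affymetrix file; only the `read.csv()`
arguments follow Alexandre's suggestion.

```r
# Toy file mimicking the layout: '#' comment lines, then a CSV table.
# All contents below are invented for illustration.
toyFile <- tempfile(fileext = ".csv")
writeLines(c(
  "#%example-header-line=1",
  "#%example-header-line=2",
  "\"Probe Set ID\",\"Entrez Gene\"",
  "\"ps_1\",\"111\"",
  "\"ps_2\",\"---\""
), toyFile)

# Lines starting with '#' are skipped; the first remaining line is the header.
toyAnnotation <- read.csv(toyFile, colClasses = "character",
                          comment.char = "#")

# Check the size, as suggested for the real annotation file.
dim(toyAnnotation)  # 2 rows, 2 columns

# Remove the temporary file.
unlink(toyFile)
```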

Thanks for the help and the great tool.

Brad

--
Brad Ander, PhD
M.I.N.D. Institute
University of California at Davis
Room 2434
2805 50th Street
Sacramento, CA  95817




2009/8/12 Alexandre Kuhn <kuhnam at mail.nih.gov>:
> Hi Brad,
>
> A couple of comments on your code (see below)
>
>> -----Original Message-----
>>
>> Affy Probeset -> Entrez:
>>
>> annotationFile <- "HG-U133_Plus_2.na29.annot.csv"
>> dataDirectory <- system.file("data", package = "annotationTools")
>> annotation_HGU133Plus2 <- read.csv(paste(dataDirectory, annotationFile,
>>   sep = "/"), colClasses = "character")
>
> I assume here that you have the annotation file
> "HG-U133_Plus_2.na29.annot.csv" in the 'data' subdirectory of the
> annotationTools installation directory. I used this code in the vignette to
> load an example annotation file (stored in 'data'). You can however save the
> annotation file anywhere on your file system.
>
> Second, I am not sure your annotation file loaded correctly, since Affymetrix
> now has a header in annotation files. To skip the header (lines preceded by
> the hash sign, '#'), please use the command (assuming for instance that you
> work under Windows and you saved your annotation file under "C:/Annotations")
>
>   annotation_HGU133Plus2 <-
>     read.csv("C:/Annotations/HG-U133_Plus_2.na29.annot.csv",
>              colClasses = "character", comment.char = "#")
>
> You can check the size of annotation_HGU133Plus2 with
>
>   dim(annotation_HGU133Plus2)
>
> to make sure that you now have what you expected (that is, a data.frame of
> 54675 rows and 41 columns).
>
> I changed the vignette accordingly some time ago and the change will be
> incorporated in the next Bioconductor release.
>
>
> Alexandre
>
>
>
>
>> allPS<-annotation_HGU133Plus2[,1]
>> getANNOTATION(allPS, annotation_HGU133Plus2, diagnose = FALSE,
>> identifierCol = 1, annotationCol = 19, noAnnotationSymbol = NA,
>> noAnnotationProvidedSymbol = "---", sep = " /// ")
>> entrez <- getANNOTATION(allPS, annotation_HGU133Plus2, diagnose =
>> FALSE, identifierCol = 1, annotationCol = 19)
>> write.matrix(entrez, file = "humanentrez.csv", sep = " ")
>>
>> Entrez -> illumina:
>>
>> annotationFileIll <- "HumanRef-8_V3_0_R2_11282963_Ab.csv"
>> dataDirectory <- system.file("data", package = "annotationTools")
>> annotation_Illumina <- read.csv(paste(dataDirectory, annotationFileIll,
>>   sep = "/"), colClasses = "character")
>> getANNOTATION(entrez, annotation_Illumina, diagnose = FALSE,
>> identifierCol = 9, annotationCol = 14, noAnnotationSymbol = NA,
>> noAnnotationProvidedSymbol = "---", sep = " /// ")
>> illuminaID <- getANNOTATION(entrez, annotation_Illumina, diagnose =
>> FALSE, identifierCol = 9, annotationCol = 14)
>> write.matrix(illuminaID, file = "illuminaID.csv", sep = " ")
>>
>>
>> It may not have been the most perfect use of the code but it seems to
>> work (we are still learning).
>>
>> Thanks for your help.  If there are any suggestions you feel are
>> important, please let us know.
>>
>> Kind regards,
>> Brad
>>
>>
>> --
>> Brad Ander, PhD
>> M.I.N.D. Institute
>> University of California at Davis
>> Room 2434
>> 2805 50th Street
>> Sacramento, CA  95817
>>
>>
>>
>> 2009/8/10 Alexandre Kuhn <alexandre.kuhn at epfl.ch>
>> >
>> > Hi Yingfang,
>> > Once you have loaded your Affymetrix annotation into R (assume it is
>> > contained in an R object named 'annot') you could for instance select all
>> > probe sets by subsetting the data.frame so as to select the first column
>> >
>> > > allps<-annot[,1]
>> >
>> > I am not sure this answers your question though. Could you please send
>> > some lines of code to help me understand what is going wrong?
>> >
>> > Best, Alexandre
>> >
>> > -----Original Message-----
>> > From: bioconductor-bounces at stat.math.ethz.ch
>> > [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Yingfang Tian
>> > Sent: Thursday, 6 August 2009 19:23
>> > To: bioconductor at stat.math.ethz.ch
>> > Cc: Brad Ander
>> > Subject: Re: [BioC] Annotation Tools package
>> >
>> > Dear Dr Kuhn:
>> >
>> > We are Yingfang Tian and Brad Ander from the University of California at
>> > Davis.  We are working on the cross-platform analysis of 3 platforms:
>> > Illumina Human Ref-8, Affymetrix U133 plus 2 array, and Affymetrix human
>> > Exon array.
>> >
>> > We are trying to use your annotationTools package in R and are able to at
>> > least translate across probes from U133 arrays to Illumina, similar to the
>> > example you give in the BMC Bioinformatics paper.  We were wondering how
>> > to import large numbers of probesets, or all of them, into the “myPS”
>> > object.  It may be a basic R command, but unfortunately we rely on
>> > commercial software for the majority of our analyses and have limited
>> > experience with R (hopefully that can change on both fronts).
>> >
>> > From the paper, it seems that we can generate a list of Refseq IDs from
>> > the Affy Probesets and then use this list (set it as the “myPS” object)
>> > to pull out the Illumina Probe ID using the refseq column as the
>> > identifier column.  We can export all these with a simple write command.
>> > Right now, we are thinking of bridging to/from the Exon arrays with the
>> > Unigene.  I guess we will have to see how that works.
>> >
>> > Again, we are having success when dealing with a few probesets, but there
>> > must be a way to get ALL probesets.  Can you please help us with this,
>> > possibly with the example command syntax?  In the paper you mention
>> > mapping all the mouse probes across platforms, so you must have had to
>> > deal with this.
>> >
>> > We will likely want to try the cross-species analysis in the near future
>> > as well, so learning how to get past the limit of entering each
>> > probe/gene/etc. manually will be a big help.
>> >
>> > Kind regards,
>> >
>> > Yingfang and Brad
>> >
>> > --
>> > Yingfang Tian, PhD
>> > M.I.N.D. Institute
>> > University of California at Davis
>> > 2805 50th Street,Room 2434
>> > Sacramento, CA  95817
>> > Tel:916-703-0384
>> >
>> >
>> >
>
>


