[BioC] problem about hgu133plus2 annotation

James W. MacDonald jmacdon at med.umich.edu
Thu Jul 22 18:41:42 CEST 2010


Hi Gina,

On 7/22/2010 5:11 AM, Gina Liao wrote:
>
> Dear All,
> I have 20 chips, and I used R to standardize the CEL files. Then I got the expression values for all chips. I also downloaded the annotation in CSV format from NetAffx (HG-U133_Plus_2 Annotations, CSV format, Release 30 (22 MB, 11/15/09)).
> Here's my code.
> ########
> test = justRMA()
> eset.st = standardise(test)
> exprs.st = exprs(eset.st)
> e.out = exprs.st
> dim(e.out) #* 54675 20
> ########
> However, I found that the order of rownames(e.out) differs slightly from the order of the row names in hgu133plus2.csv. Rows 54630 to 54640 are not in the same order in the two.
> They should be the same, right? Is "hgu133plus2cdf" the problem? How could I solve it?

I would recommend you use the annotation packages that are available 
from Bioconductor rather than downloading the annotation files from 
Affymetrix. The BioC annotation packages contain the same information, 
but are designed to be easily used from within R, and you will find the 
.csv files you can get from Affy are not as user-friendly.

You can get the annotation package using biocLite():

biocLite("hgu133plus2.db")

Note that there is no reason to expect that the order of annotation data 
will be the same as the order of expression data. Re-ordering things is 
exceedingly simple in R, so this point is irrelevant.
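
As a minimal sketch of that re-ordering, match() will line up an annotation 
table with the rows of an expression matrix. The data below are toy values, 
and the column names ProbeSetID and Symbol are made up for illustration; 
they are not the headers used in the Affymetrix file.

```r
# Toy stand-ins (hypothetical data, not the real chip): an expression matrix
# and an annotation table whose rows are in a different order.
e.out <- matrix(c(7.1, 8.3, 5.2, 6.4, 9.0, 4.8), nrow = 3, byrow = TRUE,
                dimnames = list(c("1007_s_at", "1053_at", "117_at"),
                                c("chip1", "chip2")))
annot <- data.frame(ProbeSetID = c("117_at", "1007_s_at", "1053_at"),
                    Symbol = c("HSPA6", "DDR1", "RFC2"),
                    stringsAsFactors = FALSE)

# For each row name of e.out, find its row position in annot
idx <- match(rownames(e.out), annot$ProbeSetID)

# Reorder the annotation so that row i describes row i of e.out
annot.ordered <- annot[idx, ]

# Sanity check: the IDs now line up (this fails if a probe set is missing
# from annot, because match() returns NA for it)
stopifnot(identical(rownames(e.out), annot.ordered$ProbeSetID))
```

The same two lines (match() followed by indexing) should work for the full 
54675-row data, whatever order the annotation happens to be in.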

Using the annotation packages will take some reading on your part, but 
once you get the hang of things, I think you will like how they work. 
You might start with

library(hgu133plus2.db)
?hgu133plus2.db

as well as

openVignette() and choose the AnnotationDbi vignette.

If you are interested in annotating the set of interesting genes from 
your experiment, you will want to look at the annaffy package, which 
will allow you to output both HTML and text files with your results and 
annotations for each gene.

In addition, you might want to look at the affycoretools package, which 
helps automate some of the steps required to annotate results. This 
package is also integrated with limma, so you can go straight from your 
linear model fits to output in one function call.

Best,

Jim



> Thanks!!!!!
> Best, Gina

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 


