[BioC] problem with rat database

Alberto Goldoni alberto.goldoni1975 at gmail.com
Tue May 10 14:34:26 CEST 2011


@Davis

You are right! But I have tried this kind of lookup:

library("rgug4130a.db")
x <- rgug4130aENSEMBL
mapped_genes <- mappedkeys(x)
xx <- as.list(x[mapped_genes])

or this approach:

x <- rgug4130aGENENAME
mapped_probes <- mappedkeys(x)
xx <- as.list(x[mapped_probes])

but the result is the same: for some genes there is still only "unknown function".
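
One way to put such a lookup next to the topTable output is to match on the ProbeName column rather than on GeneName. What follows is only a minimal sketch with toy data: `probe_map` stands in for a list like `xx` above (assumed here to be keyed by Agilent probe ID), `attach_annotation` is a hypothetical helper, and the probe "A_00_P00000" and the strings "nameA"/"nameB" are placeholders, not real database content.

```r
# Hypothetical helper: attach a probe -> annotation list (such as `xx`)
# to a topTable-style data frame, matching on its ProbeName column.
attach_annotation <- function(tt, probe_map, column = "Annotation") {
  # Look up each probe; probes absent from the map come back as NULL
  hits <- probe_map[as.character(tt$ProbeName)]
  # Collapse multiple values per probe into one string; missing -> NA
  tt[[column]] <- vapply(hits, function(v) {
    if (is.null(v) || all(is.na(v))) NA_character_
    else paste(unique(v), collapse = "; ")
  }, character(1), USE.NAMES = FALSE)
  tt
}

# Toy stand-ins (placeholders, not real annotation results)
tt <- data.frame(
  ProbeName   = c("A_43_P10328", "A_42_P552092", "A_00_P00000"),
  Description = c("unknown function", "Rat c-fos mRNA.", "unknown function"),
  stringsAsFactors = FALSE
)
probe_map <- list(A_43_P10328 = c("nameA", "nameA"), A_42_P552092 = "nameB")

annotated <- attach_annotation(tt, probe_map)

# Probes that still have no annotation after the merge
still_unknown <- annotated[is.na(annotated$Annotation), "ProbeName"]
```

With the real package, something like `mget(as.character(geni500$ProbeName), rgug4130aGENENAME, ifnotfound = NA)` should yield a suitable `probe_map`, again assuming the package is keyed by ProbeName.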

I would like to know if there is a way to perform the search using
another database, or directly against the Rat Genome Database, or
using biomaRt... but I don't know how.
I have about 100 genes with an "unknown function", and it would be
very useful to have a script or function that performs the search
automatically instead of searching the genes one by one.
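
For the biomaRt route, a rough sketch of a region-based lookup might look like the following. It assumes the genomic location of each unknown probe is already known (as with chr7:16261621-16262210 for CB606456), and that the Ensembl mart exposes the attribute and filter names used here; `parse_region` and `genes_in_region` are hypothetical helper names. Coordinates depend on the genome assembly, so a location taken from one Ensembl release may not land on the same gene in another.

```r
# Split a "chr7:16261621-16262210" style string into its parts
parse_region <- function(region) {
  region <- sub("^chr *", "", region)          # drop optional "chr" prefix
  parts <- strsplit(region, "[-:]")[[1]]
  if (length(parts) != 3) stop("expected 'chr:start-end', got: ", region)
  list(chr = parts[1],
       start = as.integer(parts[2]),
       end = as.integer(parts[3]))
}

# Ask Ensembl (via biomaRt) which genes overlap a region.
# Attribute names are assumptions and may differ between Ensembl releases.
genes_in_region <- function(mart, region) {
  r <- parse_region(region)
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name", "description",
                   "chromosome_name", "start_position", "end_position"),
    filters    = c("chromosome_name", "start", "end"),
    values     = list(r$chr, r$start, r$end),
    mart       = mart
  )
}

# Only runs in an interactive session with biomaRt installed and network access
if (interactive() && requireNamespace("biomaRt", quietly = TRUE)) {
  mart <- biomaRt::useMart("ensembl", dataset = "rnorvegicus_gene_ensembl")
  regions <- c(CB606456 = "chr7:16261621-16262210")
  hits <- lapply(regions, function(r) genes_in_region(mart, r))
}
```

Looping over a named vector of regions keeps each result tied to the probe or GeneName it came from, so the hits can be merged back into the topTable afterwards.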


Best regards.

2011/5/10 Sean Davis <sdavis2 at mail.nih.gov>:
>
>
> On Tue, May 10, 2011 at 8:17 AM, Alberto Goldoni
> <alberto.goldoni1975 at gmail.com> wrote:
>>
>> @Vincent
>>
>> The chip used is the "rgug4130a", so I have to use the "rgug4130a.db"
>> database.
>>
>> To obtain the topTable, this is my history:
>>
>> library(limma)
>> library(vsn)
>> targets <- readTargets("targets.txt")
>> RG <- read.maimages(targets$FileName, source="agilent")
>> MA <- normalizeBetweenArrays(RG, method="Aquantile")
>> contrast.matrix <- cbind("(hda+str)-(ref)"=c(1,0),
>>                          "(ref+str)-(ref)"=c(0,1),
>>                          "(hda+str)-(ref+str)"=c(1,-1))
>> rownames(contrast.matrix) <- colnames(design)
>> fit <- lmFit(MA, design)
>> fit2 <- contrasts.fit(fit, contrast.matrix)
>> fit2 <- eBayes(fit2)
>> geni500<-topTable(fit2,number=500,adjust="BH")
>>
>
> Hi, Alberto.
> The data in your topTable result are taken from the feature extraction
> result file.  In other words, rgug4130a.db is not used in what you show
> above.  You could add to your annotation using either rgug4130a.db or
> biomaRt, but you will need to perform these steps yourself.  As to why some
> of your probes do not appear to have annotation, you would probably need to
> contact Agilent as they are the source of your current annotation.
> Hope that helps,
> Sean
>
>>
>> > sessionInfo()
>> R version 2.12.1 (2010-12-16)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
>> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_United Kingdom.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] AnnotationDbi_1.12.0 Biobase_2.10.0       limma_3.6.9
>>
>> loaded via a namespace (and not attached):
>> [1] DBI_0.2-5     RSQLite_0.9-4 tools_2.12.1
>>
>>
>>
>> 2011/5/10 Vincent Carey <stvjc at channing.harvard.edu>:
>> > 1) you did not provide sessionInfo(), which is critical for helping
>> > you to diagnose an issue that may pertain to software version --
>> > revisions to annotation packages can have all sorts of consequences
>> >
>> > 2) i am not sure rgug4130a.db has anything to do with this.
>> >
>> >> get("CB606456", revmap(rgug4130aSYMBOL))
>> > Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>> >  value for "CB606456" not found
>> >
>> >
>> > and so on.  look at the featureData component of the object passed to
>> > lmFit -- the annotation may be in there.  if this does not give
>> > clarification please give a very explicit indication of how the
>> > topTable was generated, going back to the structure of the object
>> > passed to lmFit
>> >
>> > On Tue, May 10, 2011 at 5:30 AM, Alberto Goldoni
>> > <alberto.goldoni1975 at gmail.com> wrote:
>> >> Dear All,
>> >> I'm analyzing Agilent microarrays with the "rgug4130a.db" database, and
>> >> using the function "topTable(fit2,number=500,adjust="BH")" I have
>> >> obtained 500 genes like these:
>> >>
>> >> (row name)             16096             8109
>> >> Row                    79                40
>> >> Col                    38                109
>> >> ProbeUID               15309             7609
>> >> ControlType            0                 0
>> >> ProbeName              A_43_P10328       A_42_P552092
>> >> GeneName               CB606456          203358_Rn
>> >> SystematicName         CB606456          203358_Rn
>> >> Description            unknown function  Rat c-fos mRNA.
>> >> X.hda.str...ref.       3.988290607       5.670956889
>> >> X.ref.str...ref.       -0.951656306      4.413365374
>> >> X.hda.str...ref.str.   4.939946913       1.257591514
>> >> AveExpr                10.29735936       13.47699544
>> >> F                      36.77263264       33.20342601
>> >> P.Value                0.000212298       0.000292278
>> >> adj.P.Val              0.641094595       0.641094595
>> >>
>> >> but as you can see, for many genes, like the first one (CB606456), the
>> >> Description column only says "unknown function".
>> >>
>> >> So I performed a very simple search.
>> >> 1) First, in Ensembl, searching for the GeneName "CB606456" under
>> >> "Locations of DnaAlignFeature" gives me the genomic location
>> >> (strand): chr 7:16261621-16262210
>> >> 2) Then, in the Rat Genome Database
>> >> (http://rgd.mcw.edu/tools/genes/genes_view.cgi?id=735058), I found
>> >> that there is one gene at this position:
>> >>
>> >> 735058  GENE    Angptl4 angiopoietin-like 4     7       16261623  16267852
>> >>
>> >> So the question is: why does the "rgug4130a.db" database give me
>> >> "unknown function" in R, when the genomic location from Ensembl,
>> >> looked up in RGD, gives me the Angptl4 gene?
>> >>
>> >> And is there a function to make R perform this kind of search
>> >> automatically? (Of my 500 genes, 100 are "unknown function" genes,
>> >> so it would be useful to have a function that performs this kind of
>> >> search automatically.)
>> >>
>> >>
>> >> Best regards to all, and to whoever answers me.
>> >>
>> >> --
>> >> -----------------------------------------------------
>> >> Dr. Alberto Goldoni
>> >> Parma, Italy
>> >>
>> >> _______________________________________________
>> >> Bioconductor mailing list
>> >> Bioconductor at r-project.org
>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >>
>> >
>>
>
>



-- 
-----------------------------------------------------
Dr. Alberto Goldoni
Parma, Italy


