[BioC] Saureus.db Files

Vincent Carey stvjc at channing.harvard.edu
Wed Jan 25 17:34:16 CET 2012


Marcos, if I understand your concern, you have used an Affymetrix chip
for S. aureus and have identified a list of probe names of some
interest.  You do not know how to resolve the probe names to more
biologically meaningful identifiers.  There is nothing currently in
Bioconductor that solves this problem -- the chip is not widely enough
used, it seems.  However, from the NetAffx resources for this chip, you
can obtain a CSV file with annotation.  It may not be necessary to
build a new annotation package to solve your problem.  Simple
programming with the annotation table may suffice.
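
As a rough sketch of what such programming could look like, the snippet
below attaches annotation columns to a table of top-ranked probe sets by
matching on the probe set ID.  Both data frames are toy stand-ins: the
probe set IDs ("ps_1" etc.), the gene titles, the public IDs and the
top-table values are made up for illustration, and it is only an
assumption that your top table carries the probe set IDs as row names.

```r
# Toy stand-in for the NetAffx annotation table (hypothetical values).
# The real table would be read from the downloaded CSV file.
annot <- data.frame(
  Probe.Set.ID = c("ps_1", "ps_2", "ps_3"),
  Gene.Title = c("hypothetical protein", "---", "DNA gyrase subunit A"),
  Representative.Public.ID = c("ID_A", "ID_B", "ID_C"),
  stringsAsFactors = FALSE
)

# Toy stand-in for a top table, with probe set IDs as row names
# (assumption about how your results are laid out).
tt <- data.frame(
  logFC = c(2.1, -1.4),
  adj.P.Val = c(0.001, 0.03),
  row.names = c("ps_3", "ps_1")
)

# match() returns, for each top-table row, the position of its probe set
# in the annotation table (NA if the ID is absent), so the order of the
# top table is preserved.
idx <- match(rownames(tt), annot$Probe.Set.ID)
annotated <- cbind(tt, annot[idx, c("Gene.Title", "Representative.Public.ID")])
print(annotated)
```

Using match() rather than merge() keeps the rows in the ranking order of
the top table, which is usually what one wants when reading the results.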

Here are some excerpts from my exploration of the CSV file contents.  I read the file with

> dd = read.csv("S_aureus.na32.annot.csv", skip=14, header=TRUE)
> dim(dd)
[1] 7775   41
> names(dd)
 [1] "Probe.Set.ID"                     "GeneChip.Array"
 [3] "Species.Scientific.Name"          "Annotation.Date"
 [5] "Sequence.Type"                    "Sequence.Source"
 [7] "Transcript.ID.Array.Design."      "Target.Description"
 [9] "Representative.Public.ID"         "Archival.UniGene.Cluster"
[11] "UniGene.ID"                       "Genome.Version"
[13] "Alignments"                       "Gene.Title"
[15] "Gene.Symbol"                      "Chromosomal.Location"
[17] "Unigene.Cluster.Type"             "Ensembl"
[19] "Entrez.Gene"                      "SwissProt"
[21] "EC"                               "OMIM"
[23] "RefSeq.Protein.ID"                "RefSeq.Transcript.ID"
[25] "FlyBase"                          "AGI"
[27] "WormBase"                         "MGI.Name"
[29] "RGD.Name"                         "SGD.accession.number"
[31] "Gene.Ontology.Biological.Process" "Gene.Ontology.Cellular.Component"
[33] "Gene.Ontology.Molecular.Function" "Pathway"
[35] "InterPro"                         "Trans.Membrane"
[37] "QTL"                              "Annotation.Description"
[39] "Annotation.Transcript.Cluster"    "Transcript.Assignments"
[41] "Annotation.Notes"
> table(dd$Gene.Symbol)

  ---  Actb  ACTB Gapdh GAPDH   Hk1   Pcx STAT1  Tfrc
 7746     3     3     6     3     3     4     4     3
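
The same kind of check can be run over every column at once.  The
sketch below computes, per column, the fraction of rows holding
something other than the "---" placeholder seen in the table above.  It
runs on a small toy data frame (dd_toy, with made-up values) so that it
is self-contained; the idea would be to apply the same sapply() call to
the real table.

```r
# Toy stand-in for the annotation table; "---" marks an empty field,
# as in the Gene.Symbol tabulation above.  All values are hypothetical.
dd_toy <- data.frame(
  Probe.Set.ID = c("ps_1", "ps_2", "ps_3", "ps_4"),
  Gene.Symbol  = c("---", "---", "---", "GAPDH"),
  Gene.Title   = c("---", "hypothetical protein", "---", "---"),
  stringsAsFactors = FALSE
)

# For each column, the fraction of rows that are neither NA nor "---".
frac_annotated <- sapply(dd_toy, function(col) mean(!is.na(col) & col != "---"))

# Show the best-populated columns first.
print(round(sort(frac_annotated, decreasing = TRUE), 2))
```

Columns near the top of that listing are the ones most likely to be
useful as alternative identifiers.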

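It seems to me that the most intuitively interesting fields of the table are
at best sparsely annotated: only 29 of the 7775 probe sets carry a gene symbol,
and those symbols look like mammalian housekeeping genes (presumably control
probe sets) rather than S. aureus genes.  You can get the probe sequences from
a different Bioconductor resource (the saureusprobe package) to aid in your own
explorations of probe content and context.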
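
A minimal sketch of pulling out the probe sequences for a set of probe
set IDs follows.  It assumes the probe table has the column layout that
Bioconductor probe packages usually have (sequence, x, y,
Probe.Set.Name, ...); that layout is an assumption here, and the toy
table, IDs and sequences below are made up so the example runs without
the package installed.

```r
# With the package installed, the real table would presumably be loaded as:
#   library(saureusprobe); probes <- saureusprobe
# Toy stand-in with hypothetical IDs and sequences; the column names are
# an assumption about the package's layout.
probes <- data.frame(
  sequence = c("ACGTACGTACGTACGTACGTACGTA",
               "TTGACCGTAAGCTTAGGCATCGATC",
               "GGCATTACGATCGATTACGCTAGCA"),
  x = c(10L, 11L, 12L),
  y = c(20L, 21L, 22L),
  Probe.Set.Name = c("ps_1", "ps_1", "ps_2"),
  stringsAsFactors = FALSE
)

# Probe sets of interest, e.g. the row names of a top table.
ids <- c("ps_1")

# Keep only the probes belonging to those probe sets.
hits <- probes[probes$Probe.Set.Name %in% ids, ]

# Group the 25-mer sequences by probe set for further inspection.
seqs_by_set <- split(hits$sequence, hits$Probe.Set.Name)
print(seqs_by_set)
```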




On Wed, Jan 25, 2012 at 9:44 AM, Marcos Pinho
<pinho.microarray at gmail.com> wrote:
> Dear List,
>
> I am still having problems creating a db file for the saureus affy chips.
> After reading the material and trying, I could not create such files.
> Can anybody please help me? I am a biologist without much R experience. I
> have my toptable comparisons but only with the affy probe IDs. Any help
> would be greatly appreciated.
>
> Regards,
>
>
> --
> Marcos B. Pinho
> Instituto Nacional de Câncer - INCA
> Rio de Janeiro - Brasil