[BioC] Re: Defining your own chromosome annotations

Palmer, Lance palmer at cshl.edu
Tue May 3 16:27:56 CEST 2005


Sean, 
 
I am using a Yersinia pestis microarray from TIGR.  The chip is built from genes of two different strains of Y. pestis, KIM and CO92.  It is mainly CO92 (3885 ORFs) but has 944 ORFs from KIM that are not found in CO92 (some genes are on the main chromosome, and some are on virulence plasmids).  The GAL file does not contain any GenBank IDs, but it does contain locus names.  The GPR files I have do not contain the locus name; instead they have a name that I believe is used only on this chip.  The GenBank files for KIM and CO92 contain the locus names and, of course, the GenBank IDs.  (For example, in the GPR file a gene is named NTORF3478.  In the GAL file, NTORF3478 has the locus name y3526.  The GenBank file then contains the y3526 gene with the GID 1148473.)
 
 
I am not sure if all features in the chip are present in the genbank files.  If I wanted to use the KIM chromosome as the reference, however, the genes from CO92 would not be mapped properly.
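
The name chain described above (GPR name, then GAL locus name, then GenBank ID) could be followed with two small lookup tables and merge(). This is only a sketch in base R: the NTORF3478 / y3526 / 1148473 row comes from the example above, while the second row (NTORF9999, yXXXX) and all column names are made up to show how a feature missing from the GenBank file would surface.

```r
# GAL-derived table: chip-specific name -> locus name.
# The second row is hypothetical and has no GenBank match.
gal <- data.frame(
  Name  = c("NTORF3478", "NTORF9999"),
  Locus = c("y3526", "yXXXX"),
  stringsAsFactors = FALSE
)

# GenBank-derived table: locus name -> GID.
genbank <- data.frame(
  Locus = "y3526",
  GID   = "1148473",
  stringsAsFactors = FALSE
)

# Keep every chip feature (all.x = TRUE); unmatched ones get GID = NA.
mapped <- merge(gal, genbank, by = "Locus", all.x = TRUE)

# Chip names that could not be found in the GenBank table.
unmapped <- mapped$Name[is.na(mapped$GID)]
```

Checking `unmapped` would show directly whether all features on the chip are present in the GenBank files.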
 
I was just hoping there would be an easy way of having a file that would be something like
Name on chip, Contig Name, Start, End 
Load that into an object, then after running limma, view expression along the chromosome.
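
A rough sketch of what such a workflow might look like in base R follows. The annotation rows, the contig names, and the `results` data frame (a stand-in for per-gene log fold changes taken from a limma fit) are all invented for illustration; a real table would be read from a file with read.csv or read.table.

```r
# Hypothetical position table: Name on chip, Contig Name, Start, End.
annot <- data.frame(
  Name   = c("NTORF0001", "NTORF0002", "NTORF0003", "NTORF0004"),
  Contig = c("CO92_chr", "CO92_chr", "KIM_chr", "pCD1"),
  Start  = c(5400, 1200, 300, 800),
  End    = c(6300, 2100, 1500, 1700),
  stringsAsFactors = FALSE
)

# Stand-in for limma results: one log fold change per chip name.
set.seed(1)
results <- data.frame(
  Name  = rev(annot$Name),
  logFC = rnorm(4),
  stringsAsFactors = FALSE
)

# Attach positions to the results and sort along each contig.
expr_by_pos <- merge(annot, results, by = "Name")
expr_by_pos$Mid <- (expr_by_pos$Start + expr_by_pos$End) / 2
expr_by_pos <- expr_by_pos[order(expr_by_pos$Contig, expr_by_pos$Mid), ]

# Draw one vertical bar per gene at its midpoint on the chosen contig.
plot_contig <- function(d, contig) {
  sub <- d[d$Contig == contig, ]
  plot(sub$Mid, sub$logFC, type = "h",
       xlab = paste("Position on", contig), ylab = "log fold change")
  abline(h = 0, lty = 2)
}

if (interactive()) plot_contig(expr_by_pos, "CO92_chr")
```

Because each gene carries its own contig name, genes from CO92, KIM and the plasmids can sit in the same table without forcing one strain to be the reference.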
 
-Lance
 
-------------------------------
Lance,

You probably want to look at the AnnBuilder package, but I don't think it
supports bacterial genomes (? for Jianhua).  However, what is the annotation
that you have for each "gene"?  Genbank accession?  Refseq?  Do you have the
chromosome locations?  What arrays are you using?

Sean

----- Original Message -----
From: "Palmer, Lance" <palmer at cshl.edu>
To: <bioconductor at stat.math.ethz.ch>
Sent: Monday, May 02, 2005 11:09 AM
Subject: [BioC] Defining your own chromosome annotations


>I am working with a number of bacterial genomes.  I would like to define my
>own chromosome and annotations along the chromosomes, then view gene
>expression with regards to these genes.  geneplotter and annotate seem to
>use already available data structures.  Is there a way for a user to design
>their own?
>
> Thanks
> Lance Palmer
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>



------------------------------

Message: 13
Date: Tue, 03 May 2005 11:13:51 +1200
From: "Marcus Davy" <MDavy at hortresearch.co.nz>
Subject: Re: [BioC] LIMMA ignoring background
To: <bioconductor at stat.math.ethz.ch>, <guoneng.zhong at yale.edu>
Message-ID: <s2775d0c.081 at hra2.marc.hort.cri.nz>
Content-Type: text/plain;       charset=US-ASCII


There are several options.
You could modify read.maimages so that it reads in only Rf and Gf. You could also read in the foreground columns a second time as the Rb and Gb information and remove them afterwards from the list elements of the RGList,
e.g. for GenePix, read.maimages(files[1], "genepix", columns = list(Rf = "F635 Mean", Gf = "F532 Mean", Rb = "F635 Mean", Gb = "F532 Mean")).
Or you could generate an RGList from scratch, populating it with your data loaded into R using something like scan or read.table.

e.g.
library(limma)

RG <- new("RGList")
# Two arrays of simulated (unrealistic) intensities:
# 8 x 4 print-tip grids of 20 x 20 spots each
RG$R <- matrix(2^rnorm(8*4*20*20*2), ncol=2)
RG$G <- matrix(2^rnorm(8*4*20*20*2), ncol=2)

# Print layout, needed for print-tip loess normalization
RG$printer <- list(ngrid.r=8, ngrid.c=4, nspot.r=20, nspot.c=20)
RG$printer <- structure(RG$printer, class = "PrintLayout")

MA <- normalizeWithinArrays(RG,  method = "printtiploess", bc.method="none")

design <- rep(1,2)
fit <- lmFit(MA, design)
fit <- eBayes(fit)
topTable(fit, adjust.method="fdr")

You can populate exprSets and marrayRaw Class objects the same way using appropriate accessor methods.

Marcus

Marcus Davy
Bioinformatics
>>> Guoneng Zhong <guoneng.zhong at yale.edu> 05/03/05 8:39 AM >>>

Hi,

I have two-channel image result files that don't have background information,
just median and mean intensity readings.  But read.maimages requires that I
provide Rb, Rf, Gb, Gf values, and I don't have the Rb and Gb values.  How do I
make the analysis ignore those columns?  I am doing a simple lmFit and topTable.

Thanks!
G

--

Systems Programmer
Yale Center for Medical Informatics
fax: 203-737-5708




------------------------------

Message: 14
Date: Tue, 3 May 2005 01:46:22 +0200
From: "Gorjanc Gregor" <Gregor.Gorjanc at bfro.uni-lj.si>
Subject: [BioC] "Special" characters in URI
To: <r-help at stat.math.ethz.ch>
Cc: bioconductor at stat.math.ethz.ch
Message-ID:
        <7FFEE688B57D7346BC6241C55900E730B700C2 at pollux.bfro.uni-lj.si>
Content-Type: text/plain;       charset="iso-8859-2"

Hello!

I am crossposting this to R-help and BioC, since it is relevant to both
groups.

I wrote a wrapper for the Entrez search utility (the link is provided below),
which can add some new search functionality to the existing code in Bioconductor's
package 'annotate'*.

http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html

The Entrez search utility returns an XML document, but I have a problem using
a URI to retrieve that file, since the URI can contain characters
that should not be there according to

http://www.faqs.org/rfcs/rfc2396.html

I encountered problems with "[" and "]" as well as with space characters.
However there might also be a problem with others i.e. reserved characters
in URI syntax.

My R example is:

R> library("annotate")
Loading required package: Biobase
Loading required package: tools
Welcome to Bioconductor
         Vignettes contain introductory material.  To view,
         simply type: openVignette()
         For details on reading vignettes, see
         the openVignette help page.
R> library(XML)
R> tmp$term <- "gorjanc g[au]"
R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]"
R> tmp
$term
[1] "gorjanc g[au]"

$URL
[1] "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]"
R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
Error in xmlTreeParse(tmp$URL, isURL = TRUE, handlers = NULL, asTree = TRUE) :
        error in creating parser for http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]

# so I have a problem with space and [ and ]
# let's reduce the problem to just space or [] to be sure
R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g"
R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
Error in xmlTreeParse(tmp$URL, isURL = TRUE, handlers = NULL, asTree = TRUE) :
        error in creating parser for http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g
R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc[au]"
R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
Error in xmlTreeParse(tmp$URL, isURL = TRUE, handlers = NULL, asTree = TRUE) :
        error in creating parser for http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc[au]

# now show that it works fine without special chars
R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc"
R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
$doc
$file
[1] "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc"

$version
[1] "1.0"

$children
...

# now show a workaround for space
R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc%20g"
R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE)
$doc
$file
[1] "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc%20g"

$version
[1] "1.0"

$children
...

As can be seen above, it is possible to handle these special
characters, and I wonder whether this has already been done somewhere? If not,
I am thinking of a function fixURLchar, which would replace reserved characters
with their escaped sequences. Any comments, pointers, ... ?

from = c(" ", "\"", ",", "#"),
to = c("%20", "%22", "%2c", "%23"))
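
One possible shape for such a function is sketched below. The name fixURLchar and the from/to vectors are taken from the text above; the function body and the argument name URL are assumptions. The sketch also shows URLencode from the base 'utils' package, which may already cover the need when applied to the search term rather than to the whole URL.

```r
# Replace each character in 'from' with the escape sequence at the same
# position in 'to'. fixed = TRUE makes gsub treat 'from' literally.
fixURLchar <- function(URL,
                       from = c(" ", "\"", ",", "#"),
                       to   = c("%20", "%22", "%2c", "%23")) {
  stopifnot(length(from) == length(to))
  for (i in seq_along(from)) {
    URL <- gsub(from[i], to[i], URL, fixed = TRUE)
  }
  URL
}

base <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term="

# Hand-rolled replacement: handles the space, but not "[" and "]",
# because those are not in the default 'from' vector.
url1 <- fixURLchar(paste0(base, "gorjanc g"))

# Alternative: escape only the query term with utils::URLencode.
# reserved = TRUE also escapes reserved characters such as "[" and "]".
term <- URLencode("gorjanc g[au]", reserved = TRUE)
url2 <- paste0(base, term)
```

Escaping only the term keeps the "?" and "=" of the query string intact, which would otherwise be escaped too if the whole URL were passed with reserved = TRUE.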

*When I solve the problem, I will send my code to the 'annotate' maintainer
so that he can include it in the package if he wishes.

Lep pozdrav / With regards,
    Gregor Gorjanc

----------------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty        URI: http://www.bfro.uni-lj.si/MR/ggorjan
Zootechnical Department     mail: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3                   tel: +386 (0)1 72 17 861
SI-1230 Domzale             fax: +386 (0)1 72 17 888
Slovenia, Europe
----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it,
 you have no certainty until you try." Sophocles ~ 450 B.C.



------------------------------



End of Bioconductor Digest, Vol 27, Issue 3
*******************************************


