[BioC] pathview puzzle

Luo Weijun luo_weijun at yahoo.com
Thu Sep 5 21:42:34 CEST 2013


Pj,
This sounds like an interesting expansion; I may work in that direction in future development. I'm not sure exactly when this can be done, so stay tuned.
Weijun

--------------------------------------------
On Tue, 9/3/13, Pj Dias <pjdias87 at gmail.com> wrote:

 Subject: Re: [BioC] pathview puzzle
 To: "Oleg Moskvin" <moskvin at wisc.edu>

 <Bioconductor at r-project.org>
 Date: Tuesday, September 3, 2013, 1:50 PM

 Hi Weijun,
 pathview is an interesting tool. Are there any
 plans to extend its I/O capabilities to handle SBML
 files?
 People in metabolic modelling would
 probably benefit a lot from that extension, and SBML is more
 or less the reference input format in pathway/metabolic
 analysis.

 Regards, 
 Pj

 2013/9/3 Oleg Moskvin <moskvin at wisc.edu>

 Hi Weijun,



 This works perfectly as expected! Thank you for the fast
 update. This option is indeed destined to be very useful for
 many researchers.



 Best,



 Oleg



 On 08/30/13, Luo Weijun  wrote:

 > The updated pathview (version 1.1.5) is now available
 > through the BioC devel version:
 > http://bioconductor.org/packages/2.13/bioc/html/pathview.html
 > The R-forge version failed to build because they haven’t
 > installed some dependency packages with their new R 3.0.1.
 > I’ve contacted their admin, but I'm not sure when this can be
 > solved.

 > Weijun

 >

 > --------------------------------------------

 > On Wed, 8/28/13, Luo

 wrote:

 >

 > Subject: Re: [BioC] pathview puzzle

 > To: "Oleg Moskvin" <moskvin at wisc.edu>

 > Cc: Bioconductor at r-project.org

 > Date: Wednesday, August 28, 2013, 2:44 PM

 >

 > Hi Oleg,
 > I just updated the pathview package so it can process and analyze
 > data labeled with KEGG gene IDs other than Entrez Gene. It
 > turns out that this issue affects many other species too. So
 > with this update, you can work with data for literally all ~2300
 > (and more forthcoming) KEGG species with pathview now.
 > I’ve also added new content with working examples on KEGG
 > species and gene ID usage on pages 14-16 of the vignette.
 > Notice that you need to specify gene.idtype="KEGG" when
 > calling pathview.
 > I’ve posted the new package to R-forge. You should be able
 > to access it in the next few hours at http://r-forge.r-project.org/R/?group_id=1619.
 > Just install it following the instructions there. The Bioc version
 > will also be updated in the next 1-2 days: http://bioconductor.org/packages/devel/bioc/html/pathview.html.
 > Let me know how that works or if you have questions. HTH.

 > Weijun
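
A minimal sketch of the call described in the message above, for data whose names are already KEGG gene IDs. The pathway ID "02010", species "eco", and gene.idtype="KEGG" come from this thread; the expression values and the IDs other than b1513 are made up for illustration. The pathview call only runs if the package is installed, and it needs network access to fetch the pathway files from KEGG.

```r
# Hypothetical log-ratios named by KEGG gene IDs (E. coli b-numbers).
# "b1513" appears in this thread; the other IDs and all values are made up.
gene.data <- c(b1513 = 1.8, b1514 = 1.2, b1515 = -0.6)

if (requireNamespace("pathview", quietly = TRUE)) {
  library(pathview)
  # gene.idtype = "KEGG" tells pathview that the names of gene.data are
  # already KEGG gene IDs, so they should not be treated as Entrez Gene IDs.
  # try() keeps the script going if the KEGG download fails.
  pv.out <- try(
    pathview(gene.data = gene.data, pathway.id = "02010", species = "eco",
             gene.idtype = "KEGG", out.suffix = "kegg.ids",
             kegg.native = TRUE),
    silent = TRUE
  )
}
```

With real data, gene.data would be a named numeric vector (or a matrix with gene IDs as row names) rather than the three toy values used here.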

 >

 > --------------------------------------------



 > wrote:

 >

 > Subject: Re: [BioC] pathview puzzle

 > To: Bioconductor at r-project.org,

 > "Oleg Moskvin" <moskvin at wisc.edu>

 > Date: Friday, August 23, 2013, 9:53 PM

 >

 > Hi Oleg,
 > Thanks for the note. This is indeed a problem I didn’t
 > realize previously! KEGG uses Entrez Gene IDs for all other
 > model organisms I’ve checked.
 > I am working on a generic fix (not only for E. coli but also for other
 > species in a similar situation) and will incorporate that
 > into the development version of pathview soon. Will keep you
 > posted.
 > Thanks for pointing this out.

 > Weijun

 >

 >

 > --------------------------------------------

 > On Fri, 8/23/13, Oleg Moskvin <moskvin at wisc.edu>

 > wrote:

 >

 > Subject: Re: [BioC] pathview puzzle

 > To: Bioconductor at r-project.org,



 > Date: Friday, August 23, 2013, 12:19 PM

 >

 > Hi Weijun,
 >
 > Thank you for the response.
 >
 > The problem seems to be deeper than that and is connected to
 > the special handling of a particular species, E. coli, by KEGG.
 >
 > I looked into the pathview() code and here is what I see:
 >
 > 1) gene.data is remapped internally via mol.sum() to have
 > ENTREZ IDs;
 > 2) the remapped gene.data is used by node.map() to map onto KEGG
 > nodes using node.data;
 > 3) the node.data used in (2) was originally extracted from
 > the KEGG XML by node.info().
 >
 > The above route implies that the "name" entries in the KEGG
 > XML of type="gene" have the "speciesID:ENTREZ" format...

 >

[[elided Yahoo spam]]
 > See the
 > examples of XML entries for H. sapiens and E. coli from my
 > message yesterday (below).
 >
 > In fact, in the KEGG XML for E. coli, the "gene" records use
 > b-numbers as IDs!

 >

 > So, for cases like this, where KEGG is not
 > consistent in the supplied XML structure, one could suggest
 > introducing an "id.bypass" option to pathview() that would
 > take gene.data as is (with user-supplied IDs that
 > match the KEGG XML IDs; for example, b-numbers) and pass
 > it directly to the node-matching step, node.map() in (2).

 >

 > Thanks!

 >

 > Oleg
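
The bypass idea proposed above can be sketched in base R as a direct match between user-supplied IDs and KEGG node IDs, with no remapping to Entrez Gene IDs in between. This is only an illustration of the idea, not pathview's internal code: the variable names, the expression values, and every ID except b1513 are hypothetical.

```r
# Hypothetical expression values labeled with b-numbers (made-up numbers).
gene.data <- c(b1513 = 1.8, b0001 = -0.4, b9999 = 0.7)

# KEGG node IDs as they would appear in the E. coli XML, with the "eco:"
# prefix removed. "b1513" is from this thread; "b1514" is a hypothetical
# second node with no matching entry in gene.data.
node.ids <- c("b1513", "b1514")

# Direct matching: look up each node ID among the names of gene.data.
hit <- match(node.ids, names(gene.data))

# Nodes without data get NA; names are reset to the node IDs.
node.values <- setNames(gene.data[hit], node.ids)
node.values
```

Nodes that receive NA would simply stay uncolored in such a scheme, while matched nodes carry the user's values unchanged.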

 >

 >

 >

 > On 08/22/13, Luo Weijun wrote:

 > > Hi Oleg,
 > > You are right, the problem is due to ID type inconsistency.
 > > You have to specify gene.idtype when calling the pathview
 > > function if your gene ID type is not Entrez Gene. I don’t
 > > think b-numbers are recognized. For your gene name
 > > example, if you mean official gene symbols by “gene
 > > name”, you should specify gene.idtype="SYMBOL" (lower case
 > > is fine):
 > > eco2.out <- pathview(gene.data = T2.CEBF095.crt115.ASCH.DROP3.rel.gn,
 > >     pathway.id = "02010", gene.idtype = "SYMBOL",
 > >     out.suffix = "T2ACSH", species = "eco", kegg.native = TRUE)

 >

 >

 > On 08/22/13, Oleg Moskvin wrote:

 >

 > >

 > > <entry id="2" name="hsa:51343" type="gene"
 > >     link="http://www.kegg.jp/dbget-bin/www_bget?hsa:51343">
 > >   <graphics name="FZR1, CDC20C, CDH1, FZR, FZR2, HCDH, HCDH1"
 > >       fgcolor="#000000" bgcolor="#BFFFBF" type="rectangle"
 > >       x="919" y="536" width="46" height="17"/>
 > > </entry>
 > >
 > > <entry id="4" name="eco:b1513" type="gene"
 > >     link="http://www.kegg.jp/dbget-bin/www_bget?eco:b1513">
 > >   <graphics name="lsrA" fgcolor="#000000" bgcolor="#BFFFBF"
 > >       type="rectangle" x="339" y="1882" width="46" height="17"/>
 > > </entry>
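
The difference between the two entries quoted above can be checked with a few lines of base R. This sketch takes the two "name" attribute values from those entries, splits off the species prefix, and flags which gene IDs look like Entrez Gene IDs. Treating "all digits" as Entrez-like is a heuristic assumed here for illustration, not a rule stated by KEGG or pathview.

```r
# 'name' attributes copied from the two KEGG XML entries quoted above.
node.names <- c("hsa:51343", "eco:b1513")

# Split "species:ID" into its two parts.
species <- sub(":.*$", "", node.names)
gene.id <- sub("^[^:]+:", "", node.names)

# Heuristic: Entrez Gene IDs are purely numeric, so anything else is
# treated as a KEGG-specific ID (here, an E. coli b-number).
is.entrez.like <- grepl("^[0-9]+$", gene.id)

data.frame(species, gene.id, is.entrez.like)
```

The human entry is flagged as Entrez-like and the E. coli entry is not, which is the inconsistency discussed in this thread.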



 --

 ---------------------------------------------------------

 Oleg Moskvin, PhD 

 Associate Scientist (Computational Biology)

 Great Lakes Bioenergy Research Center

 University of Wisconsin-Madison

 1552 University Ave Room 4241 

 Madison, Wisconsin 53706

 Phone: (608) 890-2361

 Email: moskvin at wisc.edu



 _______________________________________________

 Bioconductor mailing list

 Bioconductor at r-project.org

 https://stat.ethz.ch/mailman/listinfo/bioconductor

 Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


