[BioC] What are the best packages to compare multiple DE gene lists?

Jose Garcia garciamanteiga.josemanuel at hsr.it
Thu Aug 28 10:11:43 CEST 2014


Dear Stephane,
If I understood correctly what you need, you could use the RankProd package,
which uses the non-parametric rank product approach to assign a p-value to
genes across different studies based on the rank they achieve by log2FC. It
allows this kind of "meta-analysis", comparing lists of genes produced in
different analyses.
Jose

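To make the rank product idea concrete, here is a minimal base R sketch. It is
not RankProd's implementation or API: the log2FC values are made up, the
function name rank_product is hypothetical, and the p-value is a simple pooled
permutation estimate that only looks at up-regulation.

```r
# Toy log2 fold changes: rows are genes, columns are studies (made-up values).
lfc <- matrix(
  c( 2.1,  1.8,  2.5,
     0.3, -0.2,  0.1,
     1.5,  2.2,  1.9,
    -1.0, -1.4, -0.8,
     0.0,  0.4, -0.3),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("study", 1:3))
)

# Geometric mean of the per-study ranks; rank 1 = largest log2FC in a study.
rank_product <- function(x) {
  ranks <- apply(-x, 2, rank)   # rank genes within each study
  exp(rowMeans(log(ranks)))     # geometric mean across studies
}

obs <- rank_product(lfc)

# Permutation null: shuffle the values independently within each study.
set.seed(1)
n_perm <- 1000
null_rp <- replicate(n_perm, rank_product(apply(lfc, 2, sample)))

# Empirical p-value: fraction of permuted rank products <= the observed one.
pval <- sapply(obs, function(o) mean(null_rp <= o))

res <- data.frame(gene = names(obs), rank_product = obs, p_value = pval)
res <- res[order(res$rank_product), ]
res
```

Genes that rank near the top in every study get a small rank product and come
first in res. For down-regulated genes the same sketch could be run on -lfc.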

2014-08-28 9:03 GMT+02:00 Stephane Plaisance | VIB | <
stephane.plaisance at vib.be>:

> Dear Jim,
>
> Thanks very much for this straightforward approach. I will certainly try
> it. My aim is to also take into account the p-values and, if applicable,
> the related log-FC values attached to each gene, so that more than just
> ranking is used. I know of biotools (Endeavour) that rank lists of apples
> and pears using specific methods, but I have no idea where exactly to start.
>
> Thanks anyway for the help and code.
>
> So far I have found in the Bioc pages:
> matchBox
> OrderedList
> GeneSelector
> rankrank
>
> I have tried none so if anybody has preferences, I am all ears.
>
> Cheers
> Stephane Plaisance
> stephane.plaisance at vib.be
>
>
>
>
>
> On 27 Aug 2014, at 16:29, James W. MacDonald <jmacdon at uw.edu> wrote:
>
> > Hi Stephane,
> >
> > If I understand you correctly, you have already made comparisons and now
> > simply want to rank genes based on the number of comparisons in which they
> > were found significant. I don't know of a particular package for doing
> > this, but it would be really easy to do using functions in base R. All you
> > would need to do (assuming you have some consistent identifier like Entrez
> > Gene IDs for each comparison) would be to concatenate all the IDs into a
> > single vector, and then count occurrences:
> >
> > mybigvec <- c(<all the DE gene IDs go here>)
> > # one list element per unique ID; its length is the number of occurrences
> > mylst <- split(mybigvec, mybigvec)
> > df <- data.frame(ID=names(mylst), count=sapply(mylst, length))
> > df <- df[order(df$count, decreasing = TRUE),]
> >
> > You could also take things like gene symbols along for the ride by
> > starting with a data.frame:
> >
> > mybigdf <- data.frame(symbols = <concatenate symbols from all comps>,
> >                       geneid = <concatenate gene IDs from all comps>)
> > mylst <- split(mybigdf, mybigdf$geneid)
> > df <- data.frame(ID = names(mylst), count = sapply(mylst, nrow),
> >                  symbol = sapply(mylst, function(x) as.character(x$symbols[1])))
> > df <- df[order(df$count, decreasing = TRUE),]
> >
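The counting approach quoted above can be tried end to end on toy data. This is
only a sketch: the three comparison tables (comp1, comp2, comp3) are made up,
it uses table() and match() as one alternative to split(), and it assumes each
gene ID appears at most once per comparison.

```r
# Hypothetical DE results from three comparisons (made-up example tables).
comp1 <- data.frame(geneid = c("7157", "672", "1956"),
                    symbol = c("TP53", "BRCA1", "EGFR"))
comp2 <- data.frame(geneid = c("7157", "1956"), symbol = c("TP53", "EGFR"))
comp3 <- data.frame(geneid = c("7157", "4609"), symbol = c("TP53", "MYC"))

# Stack all hits into one table, one row per (comparison, gene) pair.
all_hits <- rbind(comp1, comp2, comp3)

# table() counts how many comparisons each ID appears in.
counts <- as.data.frame(table(geneid = all_hits$geneid),
                        responseName = "count", stringsAsFactors = FALSE)

# Attach one symbol per ID, then sort so that recurrent hits come first.
counts$symbol <- as.character(all_hits$symbol[match(counts$geneid, all_hits$geneid)])
counts <- counts[order(counts$count, decreasing = TRUE), ]
counts
```

Genes found in all three comparisons end up at the top, and hits unique to one
comparison at the bottom.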
> > Best,
> >
> > Jim
> >
> >
> >
> >
> > On Wed, Aug 27, 2014 at 6:48 AM, Stephane Plaisance | VIB | <
> stephane.plaisance at vib.be> wrote:
> > I have full genome/exome lists of DE genes resulting from microarray and/or
> > RNA-Seq analyses using multiple methods (likely showing different genes even
> > from the same samples due to technology biases). I would like to rank these
> > lists to create a general list where redundant DE targets are pushed up and
> > unique hits are ranked lower.
> >
> > What method/package should I start with?
> >
> > Thanks
> >
> > Stephane Plaisance
> > stephane.plaisance at vib.be
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
> >
> >
> > --
> > James W. MacDonald, M.S.
> > Biostatistician
> > University of Washington
> > Environmental and Occupational Health Sciences
> > 4225 Roosevelt Way NE, # 100
> > Seattle WA 98105-6099
>



-- 
Jose M. Garcia Manteiga PhD
Data Analysis in Functional Genomics
Center for Translational Genomics and BioInformatics
Dibit2-Basilica, 4A3
San Raffaele Scientific Institute
Via Olgettina 58, 20132 Milano (MI), Italy

Tel: +39-02-2643-9144
e-mail: garciamanteiga.josemanuel at hsr.it
