[BioC] merging two sets of genes

Thu Dec 29 12:56:16 CET 2005

Hi, sorry, I think I found the answer to my previous email.

setdiff() will do the trick, right?
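
A minimal sketch of how the set functions relate, using made-up probe IDs. The vectors idsS and idsC are hypothetical stand-ins for geneNames(PinS) and geneNames(PinC); only base R is assumed.

```r
# Hypothetical probe IDs standing in for geneNames(PinS) and geneNames(PinC)
idsS <- c("p1", "p2", "p3", "p4")
idsC <- c("p3", "p4", "p5")

onlyC  <- setdiff(idsC, idsS)     # probes selected only by the second approach
both   <- intersect(idsS, idsC)   # probes selected by both approaches
allIds <- union(idsS, idsC)       # every probe selected by either approach

# For vectors without duplicates, appending the setdiff gives the same union
sameUnion <- identical(allIds, c(idsS, onlyC))
```

So for ID vectors without duplicates, appending the setdiff() result to the first vector should be equivalent to calling union() directly.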

I also found the help for %in%, it was under ?'%in%'
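
As a rough illustration of %in% and match for this case, assuming the expression values of the full eset are available as a matrix with probe IDs as row names. The matrix full and the ID vectors are invented for the example and are not taken from the real data.

```r
# Hypothetical 6-probe x 2-sample matrix standing in for exprs() of the full eset
full <- matrix(1:12, nrow = 6,
               dimnames = list(paste0("p", 1:6), c("s1", "s2")))
idsS <- c("p1", "p2", "p3", "p4")   # stand-in for geneNames(PinS)
idsC <- c("p3", "p4", "p5")         # stand-in for geneNames(PinC)

# %in% returns one logical per row: TRUE if the probe is in either list
keep <- rownames(full) %in% union(idsS, idsC)
combined <- full[keep, , drop = FALSE]   # rows stay in their original order

# match() returns row positions instead, in the order of the query vector
idx <- match(idsC, rownames(full))       # NA for any ID that is not found
```

The combined matrix is the kind of object that would then go into the call to new that Robert describes below; the Biobase vignettes cover that step.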

best,

David

> Hi,
>   thanks for the clarification. Then it depends on whether you want to
> use the union or the intersection of the probes you selected in the two
> different ways.
> union and intersect, applied to geneNames(PinS) and geneNames of PinC,
> should get you somewhere close; you might also want to consider match
> and %in%, depending on just how you want to select.
> After that, you will need to create a matrix with the combined
> expressions and use that as input in a call to new. The vignettes for
> Biobase should demonstrate how to make an exprSet from a matrix, but
> please ask if anything is not clear.
> 
> best wishes
>    Robert
> 
> kfbargad at ehu.es wrote:
> > Dear Seth and Robert,
> > 
> > I apologise, but I didn't make myself clear.
> > 
> > PinS and PinC come from the same experiment, i.e. the same eset. It is
> > just that I followed two different approaches to the analysis and now
> > I want to continue working with the union of these two lists. So I am
> > not intending to match across different arrays.
> > 
> > Hope this explains my question
> > 
> > David
> > 
> > 
> >>Hi,
> >>  I think that the problem is that the arrays are not the same - and
> >>then life is much harder. There are some papers on it (G. Parmigiani
> >>et al have produced MergeMaid, as one option). I have done some work on
> >>this problem, with Wolfgang Huber and Markus Rauschaupt (you can find
> >>the technical report under the Bioconductor publications link - I hope).
> > 
> >>  It is not so simple to match across different arrays, where different
> >>probes were used (you can take the expedient of mapping to some common
> >>set of IDs and matching on those, some code in packages GeneMeta and
> >>GeneMetaEx, if I recall correctly), but just because they map to the
> >>same Entrez gene id (for example) does not mean that the same thing was
> >>measured - whence MergeMaid and similar tools.
> >>
> >>  And if this is correct, then combining them is contra-indicated and
> >>some of the tools for synthesizing experiments, such as meta-analysis or
> >>the more general random effects models, will be needed. Just because you
> >>can jam either the raw data or the processed data together does not
> >>mean that it is sensible to do so.
> >>
> >>And finally, even if the arrays are identical, unless they were all
> >>essentially done at the same time under very similar conditions I would
> >>still take the approach in the paragraph above and use a random effects
> >>model.
> >>
> >>  best wishes
> >>    Robert
> >>
> >>
> >>Seth Falcon wrote:
> >>
> >>>On 26 Dec 2005, kfbargad at ehu.es wrote:
> >>>
> >>>
> >>>
> >>>>Dear list,
> >>>>
> >>>>I have two sets of genes from the same experiment,
> >>>>
> >>>>
> >>>>
> >>>>>PinC
> >>>>
> >>>>Expression Set (exprSet) with 
> >>>>1310 genes
> >>>>8 samples
> >>>>phenoData object with 2 variables and 8 cases
> >>>>varLabels
> >>>>FileName: read from file
> >>>>Target: read from file
> >>>>
> >>>>
> >>>>>PinS
> >>>>
> >>>>Expression Set (exprSet) with 
> >>>>2891 genes
> >>>>8 samples
> >>>>phenoData object with 2 variables and 8 cases
> >>>>varLabels
> >>>>FileName: read from file
> >>>>Target: read from file
> >>>>
> >>>>
> >>>>How can I merge these two sets? I tried union() on two vectors
> >>>>created from the probe IDs but failed. Any hints?
> >>>
> >>>
> >>>One approach would be to create a new exprSet object manually using
> >>>the data from PinC and PinS.  Basically, create a new phenoData object
> >>>with the data for all 16 cases, and a new expression matrix with 16
> >>>columns (assuming the two original exprSets represent disjoint sets of
> >>>samples).
> >>>
> >>>Thinking out loud, is this a common enough operation to warrant a
> >>>method for exprSets?  I could imagine c() being defined on exprSets
> >>>such that if the phenoData columns are the same and the "sample ids"
> >>>as given by the rownames of phenoData/colnames of exprs are disjoint,
> >>>then do the obvious thing, else error.
> >>>
> >>>+ seth
> >>>
> >>>_______________________________________________
> >>>Bioconductor mailing list
> >>>Bioconductor at stat.math.ethz.ch
> >>>https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>
> >>
> >>-- 
> >>Robert Gentleman, PhD
> >>Program in Computational Biology
> >>Division of Public Health Sciences
> >>Fred Hutchinson Cancer Research Center
> >>1100 Fairview Ave. N, M2-B876
> >>PO Box 19024
> >>Seattle, Washington 98109-1024
> >>206-667-7700
> >>rgentlem at fhcrc.org
> >>