[BioC] phenotypic information of ALLMLL data set

Sat Jun 5 10:32:46 CEST 2010

Dear Javier

Ben might be able to provide more insight into the phenoData of the 
data in the ALLMLL package, but note that 20 samples is a very small 
number for a study of patient samples, and given the biological 
variability between patients, the results might not have much power.

Since there have been quite a few experiments on pediatric blood cancers 
over the last decade, you could also have a look at the ArrayExpress or 
GEO databases for other datasets relevant to your question. The 
Bioconductor packages ArrayExpress and GEOquery help with downloading 
them directly into Bioconductor objects.
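
For the GEO route, a minimal sketch might look as follows. It assumes 
that the ArrayExpress accession E-GEOD-11877 mentioned further down 
corresponds to the GEO series GSE11877 (the E-GEOD prefix marks datasets 
imported from GEO), and fetch_geo_pheno is only an illustrative helper 
name, not a GEOquery function. I have not run this against the actual 
series.

```r
# Illustrative helper (not part of GEOquery): download a GEO series
# and return the sample annotation of its first platform.
fetch_geo_pheno <- function(accession) {
  if (!requireNamespace("GEOquery", quietly = TRUE) ||
      !requireNamespace("Biobase", quietly = TRUE)) {
    stop("please install the Bioconductor packages GEOquery and Biobase")
  }
  # With GSEMatrix = TRUE, getGEO() returns a list of ExpressionSets,
  # one per platform in the series.
  gse <- GEOquery::getGEO(accession, GSEMatrix = TRUE)
  # pData() gives one row per sample with the submitted annotation.
  Biobase::pData(gse[[1]])
}

# Needs network access, so the call is left commented out here:
# pheno <- fetch_geo_pheno("GSE11877")
# head(pheno)
```

Note that this route gives you the processed series matrix rather than 
the raw CEL files, so it is mainly useful for getting at the sample 
annotation.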

A query for "childhood leukemia" in ArrayExpress leads to 26 datasets. 
For instance:

library("ArrayExpress")
x = ArrayExpress("E-GEOD-11877")  ## may take a little while
x

#AffyBatch object
#size of arrays=1164x1164 features (488 kb)
#cdf=HG-U133_Plus_2 (54675 affyids)
#number of samples=207
#number of genes=54675
#annotation=hgu133plus2
#...

I have not delved deeper into this particular dataset, but it seems that 
you would then need to parse the phenoData column x$Description further 
in order to extract the phenotypic variables (how easy this is depends 
on the amount of care that the submitters, or the curators at GEO or 
ArrayExpress, have spent on the annotation).
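
As a rough sketch of that parsing step: the description strings below 
are made up for illustration, since I have not checked what 
E-GEOD-11877 actually contains, and parse_description is a hypothetical 
helper name. It assumes every sample carries the same "key: value" 
fields, separated by semicolons, in the same order.

```r
# Hypothetical free-text descriptions, one per sample; the real
# contents of x$Description may be laid out quite differently.
desc <- c(
  "age: 4; sex: F; subtype: MLL",
  "age: 9; sex: M; subtype: T-ALL"
)

# Split each string into "key: value" fields and return a data frame
# with one row per sample and one column per key.
parse_description <- function(d) {
  fields <- strsplit(d, ";\\s*")
  rows <- lapply(fields, function(f) {
    kv <- strsplit(f, ":\\s*")
    vals <- vapply(kv, function(p) p[2], character(1))
    names(vals) <- vapply(kv, function(p) p[1], character(1))
    vals
  })
  as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
}

pheno <- parse_description(desc)
# A factor like this could serve as the grouping variable.
pheno$subtype <- factor(pheno$subtype)
```

With the real data you would pass x$Description instead of desc, after 
adapting the separators to whatever format the submitters used.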

Best wishes
  Wolfgang

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber

On 02/06/10 09:07, Javier Pérez Florido wrote:
> Dear list,
> I'm using the ALLMLL data set (from the ALLMLL Bioconductor package). This
> package provides probe-level data for 20 HGU133A (MLL.A) and 20 HGU133B
> (MLL.B) arrays which are a subset of arrays from a large ALL study.
>
> I would like to know the phenotypic information about these data sets to
> run a differential expression analysis: I need the phenotypic info to
> group the samples by conditions. I had a look at the supplementary
> information of the paper related to this data set, but I cannot work out
> the relationship between the sample names and the conditions.
>
> Any suggestions?
>
> Thanks,
> Javier