[BioC] reading data into limma

Gordon K Smyth smyth at wehi.EDU.AU
Thu Jul 31 16:37:09 MEST 2003


Dear Juerg,

> Hi Gordon,
>
> The list colList has the value:
> colList<-list(Gf="CH1I_MEAN", Gb="CH1B_MEAN", Rf="CH2I_MEAN",
> Rg="CH2B_MEAN")
>
> The first 50 lines of the first data file are attached.

Thanks for sending this information.  Your colList contains a typo: the
last entry should be named Rb= rather than Rg=.  If you correct that, the
following command should work for you:

  RG <- read.maimages(files, columns=colList, check.names=FALSE, fill=TRUE)
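
For reference, a minimal sketch of the corrected list, using the column
names from your message.  The check at the end is only an illustration
(it is not something limma asks you to run), and the set of expected
names is taken from this thread rather than from the limma documentation:

```r
# Corrected column list: the red background entry is named Rb, not Rg.
colList <- list(Gf = "CH1I_MEAN", Gb = "CH1B_MEAN",
                Rf = "CH2I_MEAN", Rb = "CH2B_MEAN")

# Illustrative sanity check (an assumption, not part of limma):
# confirm that all four foreground/background entries are present.
expected <- c("Rf", "Gf", "Rb", "Gb")
missingNames <- setdiff(expected, names(colList))
if (length(missingNames) > 0)
  stop("colList is missing: ", paste(missingNames, collapse = ", "))
```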

The main problem you had is that the SMD data file has column names
containing underscore characters "_", which R normally converts into "."
when the file is read into a data.frame.  As a result, read.maimages
cannot find the columns specified by colList.  The column name conversion
can be prevented by passing the argument check.names=FALSE to
read.maimages.
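
The effect can be sketched with plain read.table on a small made-up
table (the data values are invented for illustration).  One assumption
to flag: whether "_" is actually converted depends on the R version, so
the sketch calls make.names with allow_=FALSE to reproduce the
conversion explicitly rather than relying on the default behaviour:

```r
# Invented two-column table with SMD-style headings.
txt <- "CH1I_MEAN\tCH1B_MEAN\n100\t10\n200\t20\n"

# check.names=FALSE keeps the headings exactly as they appear in the file.
raw <- read.table(text = txt, header = TRUE, sep = "\t",
                  check.names = FALSE)
print(names(raw))

# make.names() is the function applied to headings when check.names=TRUE.
# allow_=FALSE mimics R versions that treat "_" as invalid in names.
converted <- make.names(names(raw), allow_ = FALSE)
print(converted)

# A lookup by the original heading fails against the converted names.
print("CH1I_MEAN" %in% converted)
```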

Another problem, which you had already recognized, is that the last column
of the data file consists entirely of missing values, so fill=TRUE is
required to make sure that the number of data columns agrees with the
number of headings.
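
A minimal sketch of what fill=TRUE does, again with plain read.table and
an invented table rather than your SMD file.  The heading names ID and
FLAG are hypothetical:

```r
# Invented table: three headings, but each data row has only two fields
# because the last column is entirely missing.
txt <- "ID\tCH1I_MEAN\tFLAG\nspot1\t100\nspot2\t200\n"

# Without fill=TRUE the short rows cause an error, captured here by try().
strict <- try(read.table(text = txt, header = TRUE, sep = "\t"),
              silent = TRUE)
print(inherits(strict, "try-error"))

# With fill=TRUE the short rows are padded, so the last column is all NA.
filled <- read.table(text = txt, header = TRUE, sep = "\t", fill = TRUE)
print(dim(filled))
print(all(is.na(filled$FLAG)))
```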

Best wishes
Gordon

> Thank you very much for your help.
>
> Juerg Straubhaar, PhD
> University of Massachusetts Medical School
>
>
> -----Original Message-----
> From: Gordon Smyth [mailto:smyth at wehi.edu.au]
> Sent: Wednesday, July 30, 2003 6:33 AM
> To: Straubhaar, Juerg
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] reading data into limma
>
> At 03:38 AM 30/07/2003, Straubhaar, Juerg wrote:
>>Hi,
>>
>>I am using two-colour spotted array data which I downloaded from the
>> Stanford Microarray Database generated with GenePix 4000B. To read the
>> data I use:
>>
>>  RG<-read.maimages(files, source="genepix", path=".", columns=colList,
>>fill=TRUE)
>
> There is no reason why this command should not work if your inputs
> 'files' and 'colList' are specified correctly. Note however that
> source="genepix" has no meaning here. Data from the Stanford Microarray
> Database are in SMD rather than Genepix format, and 'source' is ignored
> anyway when 'columns' is specified.
>
> I would be prepared to give more help if you give the value of 'colList'
> and the first 50 lines of the first data file.
>
> Gordon
>
>>colList is a list of name=value pairs, specifying the foreground and
>> background intensity columns for the two channels.
>>
>>This generates an error:
>>
>>Error in "[<-"(*tmp*, , i, value = NULL) :
>>         number of items to replace is not a multiple of replacement length.
>>
>>Thank you,
>>
>>Juerg Straubhaar
>>
>
> ------------------------------------------------------------------------
> Dr Gordon K Smyth, Senior Research Scientist, Bioinformatics,
> Walter and Eliza Hall Institute of Medical Research,
> 1G Royal Parade, Parkville, Vic 3050, Australia
> Tel: (03) 9345 2326, Fax (03) 9347 0852,
> Email: smyth at wehi.edu.au, www: http://www.statsci.org


