[BioC] question about limma fit differences?

Gordon Smyth smyth at wehi.edu.au
Sat Feb 7 02:29:48 MET 2004


At 05:48 PM 6/02/2004, Simon Melov wrote:
>in the limma guide, there are several different examples which are well 
>described. However, there are two different functions for doing the fit - 
>lm.series, and lmFit.
>I'm not clear as to why one would use either one or the other. It's not 
>stated as to why one uses lmFit in the Swirl example, but uses lm.series 
>in the ApoAI example. From their respective help pages, I can't tell the 
>difference except that lmFit seems to call the least squares regression by 
>default, whereas lm.series calls the lm.fit function for the regression.

There are actually four linear model functions in limma, as explained in 
the help page on "LinearModels". I have extracted a part of this help page 
and appended it to the end of this email. lmFit() is a wrapper function, if 
you want to use that term, which calls the lower-level functions lm.series, 
gls.series or rlm.series as appropriate. There is therefore no difference 
whatever between lmFit and lm.series in your case except in the user 
interface: lmFit simply calls lm.series.
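
As a minimal sketch of the point above, the following fits the same simulated data through the high-level and the low-level entry points. The matrix M, the design and the coefficient name "avg" are made up for illustration. The sketch assumes that limma is installed and that lm.series is still exported by the installed version.

```r
# Assumption: the limma package is installed. All data here are simulated.
library(limma)

set.seed(1)
# Hypothetical log-ratio matrix: 100 genes (rows) by 4 replicate arrays (columns)
M <- matrix(rnorm(100 * 4), nrow = 100, ncol = 4)
# Design with a single coefficient: the average log-ratio across the arrays
design <- matrix(1, nrow = 4, ncol = 1, dimnames = list(NULL, "avg"))

# High-level entry point: returns a classed 'MArrayLM' object
fit_hi <- lmFit(M, design)
print(class(fit_hi))

# Low-level function (assumed still available): returns an unclassed list
fit_lo <- lm.series(M, design)
print(class(fit_lo))

# Compare the estimated coefficients from the two calls
print(all.equal(fit_hi$coefficients, fit_lo$coefficients))
```

With no duplicate spots, no correlation and the default fitting method, lmFit should take the ordinary least squares route, which is the case discussed here.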

I agree that it is potentially a bit confusing that both lm.series and 
lmFit are used in the User's Guide. Ideally only lmFit would be there, but 
I haven't had time to update all the examples yet. This will be done in due 
course. The ApoAI example is still correct although it uses the older 
function call.

You say that you can't tell the difference between the functions from their 
help pages. Now I know that documentation which is clear to one person is 
not necessarily clear to another, but on my reading of the help pages the 
relationship between the functions seems to be well-described. The help 
page for lmFit says "A linear model is fitted for each gene by calling one 
of 'lm.series', 'gls.series' or 'rlm.series'." The reader is then referred 
to the help page on 'LinearModels' which gives "an overview of linear model 
functions in limma". Then one can read the extract quoted below.


>Are there some general guidelines as to which fit function to use in 
>particular experimental contexts?
>Any help would be much appreciated

Quote from the help page "5.LinearModels":

      There are four functions in the package which fit linear models:

       'lmFit'  This is a high level function which accepts objects and
           provides an entry point to the following three functions.

       'lm.series'  Straightforward least squares fitting of a linear
           model for each gene.

       'rlm.series'  An alternative to 'lm.series' using robust
           regression as implemented by the 'rlm' function in the MASS
           package.

       'gls.series'  Generalized least squares taking into account
           correlations between duplicate spots (i.e., replicate spots
           on the same array). The functions 'duplicateCorrelation' or
           'dupcor.series' are used to estimate the inter-duplicate
           correlation before using 'gls.series'.

      Each of these functions accepts essentially the same argument list
      and produces a fitted model object of the same form. The first
      function 'lmFit' formally produces an object of class 'MArrayLM'.
      The other three functions are lower level functions which produce
      similar output but in unclassed lists.
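
To illustrate what "least squares fitting of a linear model for each gene" means, here is a rough sketch in base R only. It is not limma's actual implementation. The object names (M, design, coefs) and the simulated data are hypothetical.

```r
# Base R only. Illustrative sketch, not the code used inside limma.
set.seed(1)
# Hypothetical log-ratio matrix: 5 genes (rows) by 4 arrays (columns)
M <- matrix(rnorm(5 * 4), nrow = 5, ncol = 4)
# One-column design: the single coefficient is the mean log-ratio per gene
design <- cbind(avg = rep(1, 4))

# Fit the same design matrix to every gene (row) with stats::lm.fit
coefs <- matrix(NA_real_, nrow = nrow(M), ncol = ncol(design),
                dimnames = list(NULL, colnames(design)))
for (i in seq_len(nrow(M))) {
  coefs[i, ] <- lm.fit(design, M[i, ])$coefficients
}

# For this intercept-only design the estimate is simply each gene's mean
print(cbind(coefs, rowMeans = rowMeans(M)))
```

The same design matrix is reused for every gene, so only the response vector changes from one fit to the next.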
