Strange Results of summary()

Martin Maechler
Fri, 20 Mar 1998 09:34:17 +0100

>>>>> "Hubert" == Hubert Palme writes:

    Hubert> Martin Maechler:
    >> >> berufl
    >> >> Bureaukraft         :15
    >> >> Guetererzeugung     : 9
    >> >> sonstige            : 4
    >> >> Handel              : 3
    >> >> wissensch.-technisch: 3
    >> >> (Other)             : 3
    >> >> NA's                :43
    >> .....
    >> >> > table(berufl)
    >> >>           wissensch.-technisch Leiter Oeff. Dienst/Wirtschaft
    >> >>                              3                              0
    >> >>                    Bureaukraft                         Handel
    >> >>                             15                              3
    >> >>  Dienstleistungsgewerbe/Soldat                Gaertner/Jaeger
    >> >>                              2                              1
    >> >>                Guetererzeugung                       sonstige
    >> >>                              9                              4
    >> What's the problem?  '(Other)' gives all the levels having (in your
    >> case) 0,1,2 observations, which sum to 3 observations.

    Hubert> Do I understand you right, that the variables with low
    Hubert> frequency are put together in (Other)? This should be explained
    Hubert> to a newbie!!

    Hubert> - What criterion decides which variables are put into (Other)?
    Hubert> - What kind of order do the values have? Frequency?

    Hubert> This is very irritating! Where can I get information about all
    Hubert> this?

Read the online help ?   Read the R-notes,  read books about S / S-plus...

More seriously:
1) In situations like these, I just look at the R code;
   in this case, you'll find  summary -> summary.data.frame -> summary.factor
   and you'll see that  summary.data.frame  has the default  maxsum = 7,
   where ``maxsum'' is the argument you may want to set differently.

2) The online help for summary  has been lacking.
   R 0.62  will have an improved help page, whose ASCII version I append at
   the end. 
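
A minimal sketch of "just looking at the R code", assuming a current R
session (the method names and defaults below are those of today's base R,
not necessarily of the version discussed in this thread):

```r
# Which methods can the generic summary() dispatch to?
print(methods(summary))

# Argument lists of the two methods involved for a factor column
# of a data frame.
print(args(summary.data.frame))
print(args(summary.factor))

# The defaults of 'maxsum' can be read off the formals directly.
df_maxsum  <- formals(summary.data.frame)$maxsum
fac_maxsum <- formals(summary.factor)$maxsum
print(c(data.frame = df_maxsum, factor = fac_maxsum))
```

Typing the name of a method without parentheses prints its full source,
which is the quickest way to see where ``(Other)'' comes from.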

    >> table() is more detailed (but doesn't report the NA's), which is the
    >> only thing to criticize here:

    Hubert> I agree.

Should be in the 0.62 version...
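
For comparison, a small sketch of how NA's can be counted with table() in a
current R. The `useNA' argument is part of today's table(); whether 0.62
reports NA's in this way is not stated here, and the vector `x' is made up
for illustration:

```r
# Hypothetical toy factor with three missing values.
x <- factor(c("a", "b", "b", NA, NA, NA))

t_default <- table(x)                   # NA's are dropped from the counts
t_with_na <- table(x, useNA = "ifany")  # adds an <NA> cell when NA's occur

print(t_default)
print(t_with_na)
```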

    Hubert> (Hmm... R is a very interesting and powerful tool, but its
    Hubert> philosophy and terminology take a lot of getting used to for
    Hubert> someone familiar with SPSS & Co.)

I agree;  implicitly we have often assumed that R users
	- either know  S / Splus
	- or	 are good programmers
The lack of documentation ``proves'' the above.
And yes, we welcome all collaboration in improving documentation!

Here is the new help page  [ ?summary or ?summary.factor or ....] :

    Object Summaries

	 summary(object, ...)

	 summary.default   (object, ..., digits = max(3, .Options$digits -3))
	 summary.data.frame(object, maxsum = 7, ...)
	 summary.factor    (object, maxsum = 100, ...)
	 summary.matrix    (object, ...)


      object: an object for which a summary is desired.

      maxsum: integer, indicating how many levels should be
	      shown for `factor's.

	 ...: additional arguments affecting the summary produced.


	 `summary' is a generic function used to produce result
	 summaries of the results of various model fitting func-
	 tions.  The function invokes particular `method's which
	 depend on the `class' of the first argument.

	 For `factor's, the frequency of the first `maxsum - 1'
	 most frequent levels is shown, where the less frequent
	 levels are summarized in `"(Other)"' (resulting in
	 `maxsum' frequencies).

	 The functions `summary.lm' and `summary.glm' are exam-
	 ples of particular methods which summarise the results
	 produced by `lm' and `glm'.


	 The form of the value returned by `summary' depends on
	 the class of its argument.  See the documentation of
	 the particular methods for details of what is produced
	 by that method.

    See Also:

	 `anova', `summary.glm', `summary.lm'.


	 summary(attenu) #-> summary.data.frame(...)
	 summary(attenu $ station, maxsum = 20) #-> summary.factor(...)
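
The `maxsum' rule can also be checked against the counts from this thread.
The following is only a sketch: the factor is rebuilt from the quoted
table() counts and the 43 NA's, it is not the original data, and it assumes
the behaviour of summary.factor in a current R (where the NA's entry counts
towards `maxsum'). The order of tied levels may differ from the output
quoted above.

```r
# Counts copied from the table(berufl) output quoted earlier in this thread;
# the 43 NA's are taken from the summary() output. The vector is a
# reconstruction for illustration, not the poster's actual data.
counts <- c("wissensch.-technisch"           = 3,
            "Leiter Oeff. Dienst/Wirtschaft" = 0,
            "Bureaukraft"                    = 15,
            "Handel"                         = 3,
            "Dienstleistungsgewerbe/Soldat"  = 2,
            "Gaertner/Jaeger"                = 1,
            "Guetererzeugung"                = 9,
            "sonstige"                       = 4)
berufl <- factor(c(rep(names(counts), times = counts), rep(NA, 43)),
                 levels = names(counts))

# maxsum = 7 is the default used for data frame columns: the least
# frequent levels (0 + 1 + 2 observations) collapse into "(Other)".
s7 <- summary(berufl, maxsum = 7)
print(s7)

# A larger maxsum lists every level separately, followed by the NA's.
s20 <- summary(berufl, maxsum = 20)
print(s20)
```

With maxsum = 20 the result carries the same per-level counts as
table(berufl), plus the NA's.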
