[Rd] incorrect handling of NAs by na.action with lmList (package nlme) (PR#13658)

charles.beaudette at usherbrooke.ca charles.beaudette at usherbrooke.ca
Thu Apr 16 03:10:12 CEST 2009


Greetings,

I just found a bug in the function lmList of the package nlme with R 
2.8.1 running under Windows XP (32-bit). I have a data table with several 
columns of continuous variables as well as treatment variables, measured 
over several years at several sites. Here is an example:

Id    year  treatment A  treatment B  variable1  variable2  variable3
1     1     0            0            6.47       3.76       NA
2     1     0            1            3.23       2.15       NA
3     1     1            0            NA         8.43       NA
4     1     1            1            2.15       9.34       NA
...   ...   ...          ...          ...        ...        ...
55    2     0            0            5.93       5.35       6.34
56    2     0            1            6.64       2.87       4.23
57    2     1            0            4.23       NA         8.45
58    2     1            1            3.67       8.54       7.45
...   ...   ...          ...          ...        ...        ...
105   3     0            0            7.45       8.34       7.65
106   3     0            1            7.98       9.45       9.23
107   3     1            0            4.56       8.23       8.34
108   3     1            1            6.34       9.34       5.98

As you can see, variable3 is missing for the whole first year, but 
otherwise little data is missing. I then call lmList:

lmgroup.data <- lmList(variable1 ~ treatmentB | year/treatmentA,
                       data = data, na.action = na.omit)

When I print the object, I see:

Call:
  Model: variable1 ~ treatmentB | year/treatmentA
   Data: data

Coefficients:
                                  (Intercept)                treatment B
year2/treatmentA0        44.08387                81.11284
year2/treatmentA1        66.61333                155.62163
year3/treatmentA0        60.55125                72.83121 
year3/treatmentA1        63.62340                161.92080

Degrees of freedom: 188 total; 176 residual
Residual standard error: 24.09452

No estimates are reported for year 1, but variable3 is not in my model, so 
they should be there. When I create another data.frame that leaves out 
variable3, the estimates for year 1 appear as they should. In my opinion, 
the na.action argument is clearly mishandled here: it should drop rows 
with NAs only in the columns used by the model, not rows with an NA in any 
column of the whole data.frame. I hope my explanation was clear and brief 
enough. Thank you for a terrific job; R is a jewel of the open-source 
community and I enjoy working with it every day!
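
For illustration, here is a minimal sketch of the effect and of the 
workaround described above. The data are made up (a hypothetical stand-in 
for my real table, with only variable1 and variable3 as measured columns), 
and the lmList call at the end assumes nlme is installed; the results may 
differ in other versions of nlme.

```r
# Hypothetical stand-in data: 3 years x 2 x 2 treatments x 4 replicates
set.seed(1)
data <- expand.grid(rep = 1:4, treatmentB = 0:1, treatmentA = 0:1, year = 1:3)
data$variable1 <- rnorm(nrow(data), mean = 5)
# variable3 is missing for all of year 1, as in the real table
data$variable3 <- ifelse(data$year == 1, NA, rnorm(nrow(data), mean = 7))

# na.omit() on the whole data.frame drops every row with an NA in ANY
# column, so all of year 1 disappears even though variable3 is unused
years.all <- sort(unique(na.omit(data)$year))

# Workaround: keep only the columns that appear in the model formula
model.vars <- c("variable1", "treatmentB", "year", "treatmentA")
model.data <- data[, model.vars]
years.model <- sort(unique(na.omit(model.data)$year))

print(years.all)    # years 2 and 3 only
print(years.model)  # years 1, 2 and 3

# Fit on the reduced data.frame (skipped if nlme is not available)
if (require("nlme", quietly = TRUE)) {
  fit <- lmList(variable1 ~ treatmentB | year/treatmentA,
                data = model.data, na.action = na.omit)
  print(coef(fit))
}
```

With the reduced data.frame, na.omit only removes rows that have an NA in 
a column the model actually uses.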

Best Regards,

Charles Beaudette
Candidate for the master's degree in biology
University of Sherbrooke


