[R] Tables package - remove NAs and NaN

Tue Apr 23 14:13:10 CEST 2013

On 13-04-23 6:31 AM, Duncan Murdoch wrote:
> On 13-04-22 10:40 PM, David Winsemius wrote:
>>
>> On Apr 22, 2013, at 5:49 PM, Santosh wrote:
>>
>>> Dear Rxperts,
>>> q <- data.frame(p=rep(c("A","B"),each=10,len=30),
>>> a=rep(c(1,2,3),each=10),id=seq(30),
>>> b=round(runif(30,10,20)),
>>> c=round(runif(30,40,70)))
>>> The operation below...
>>> tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*
>>> (mean+sd),data=q)
>>> yields some rows of NAs and NaN as shown below
>>>
>>>               b               c
>>> p a   N  mean  sd    mean  sd
>>> A 1   10 16.30 2.497 52.30  9.358
>>>     2    0   NaN    NA   NaN     NA
>>>     3   10 15.60 2.716 60.30  8.001
>>> B 1    0   NaN    NA   NaN     NA
>>>     2   10 15.40 2.366 57.70 10.414
>>>     3    0   NaN    NA   NaN     NA
>>>     All 30 15.77 2.473 56.77  9.601
>>>
>>> How do I remove the rows having N=0 ?
>>> I would like the resulting table to look like this:
>>>               b               c
>>> p a   N  mean  sd    mean  sd
>>> A 1   10 16.30 2.497 52.30  9.358
>>>       3   10 15.60 2.716 60.30  8.001
>>> B  2   10 15.40 2.366 57.70 10.414
>>>     All 30 15.77 2.473 56.77  9.601
>>
>> Here's a bit of a hack:
>>
>> tabular( (`p a`=interaction(p,a, drop=TRUE, sep=" ")) ~ (N = 1) + (b + c)*
>>       (mean+sd),data=q)
>>
>>           b           c
>>    p a N  mean sd     mean sd
>>    A 1 10 12.8 0.7888 52.1 8.020
>>    B 2 10 16.3 3.0569 54.9 8.711
>>    A 3 10 14.6 3.7771 56.5 6.980
>>
>> I have been rather hoping that Duncan Murdoch would have noticed the earlier thread, but maybe he can comment on whether there is a more direct route.
>>
>
> This isn't something that the package is designed to handle:  if you say
> p*a, it wants all combinations of p and a.
>
> If I wanted a table like that, I'd use a different hack.  One
> possibility is to create that interaction column, but display it as just
> the initial letter, labelled p, and then add another column to contain
> the a values as data.  It would be tricky to get the formatting right.
>
> Another possibility is to generate the whole table with the N=0 rows,
> and then post-process it to remove those rows, and adjust the row labels
> appropriately.  This approach probably gives the nicer result, but the
> post-processing is quite messy:  you need to delete some rows from the
> table, from its rowLabels attribute, and from the justification
> attributes of both the table and its rowLabels.  (I should add a [
> method to the package to hide this messiness.)

I've now added that "[" method, in version 0.7.54 on R-Forge.  To leave
out the rows with N=0, you can select the subset of the table where N
(the first column) is non-zero:

tab <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) +
               (b + c)*(mean+sd), data=q)

tab[ tab[,1] > 0, ]

and it produces this:

          b           c
  p a   N  mean  sd    mean sd
  A 1   10 16.20 3.458 56.3 10.155
    3   10 13.60 2.119 58.1  8.075
  B 2   10 14.40 2.547 51.2  9.438
    All 30 14.73 2.888 55.2  9.419
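
For anyone without the R-Forge build, here is a rough base-R sketch of
what that logical row index is doing.  It does not use the tables
package at all: the set.seed() call and the names counts and kept are
assumptions added for illustration, and it only counts observations
rather than reproducing the formatted table.

```r
# Same layout as the example data; the seed is an assumption, added only
# so that the sketch is reproducible.
set.seed(1)
q <- data.frame(p = rep(c("A", "B"), each = 10, length.out = 30),
                a = rep(c(1, 2, 3), each = 10), id = seq(30),
                b = round(runif(30, 10, 20)),
                c = round(runif(30, 40, 70)))

# Crossing p with a gives all 6 combinations, observed or not.
counts <- as.data.frame(table(p = q$p, a = q$a), responseName = "N")

# Summaries of an empty cell are what show up as NaN and NA.
mean(numeric(0))   # NaN
sd(numeric(0))     # NA

# A logical index on N keeps only the combinations that actually occur.
kept <- counts[counts$N > 0, ]
kept
```

Only the A-1, B-2 and A-3 combinations survive the filter, which is the
same set of rows that remains (plus "All") in the table above.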

Indexing of tables isn't as general as indexing of matrices, but most of
the simple forms should work.  I haven't tested it yet, but I expect the
subsetted table will also come out fine in LaTeX or HTML output (HTML
output is also new, and not on CRAN yet).
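
As a rough sketch of what the "simple forms" might look like, the
following indexes a tabular object by row numbers and by column numbers.
It assumes a version of the tables package that includes the "[" method
(0.7.54 or later).  The seed and the names first_two and n_and_b are
assumptions for illustration, and the requireNamespace() guard is only
there so the code does nothing if the package is not installed.

```r
if (requireNamespace("tables", quietly = TRUE)) {
  library(tables)

  # Seed is an assumption, added so the sketch is reproducible.
  set.seed(1)
  q <- data.frame(p = rep(c("A", "B"), each = 10, length.out = 30),
                  a = rep(c(1, 2, 3), each = 10), id = seq(30),
                  b = round(runif(30, 10, 20)),
                  c = round(runif(30, 40, 70)))

  tab <- tabular(((p = factor(p)) * (a = factor(a)) + 1) ~
                   (N = 1) + (b + c) * (mean + sd), data = q)

  first_two <- tab[1:2, ]   # numeric row index: first two rows
  n_and_b   <- tab[, 1:3]   # numeric column index: N, mean and sd of b

  print(dim(tab))
  print(dim(first_two))
  print(dim(n_and_b))
}
```

Both results should still be tabular objects, so they print with their
row and column labels like the full table.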

Duncan Murdoch