[R] R dplyr solution vs. Base R solution for the select column total

Muhuri, Pradip (SAMHSA/CBHSQ) Pradip.Muhuri at samhsa.hhs.gov
Mon Dec 1 03:29:54 CET 2014


Hi Duncan,

Thank you for sending your solution.  Below is another way.  

Pradip

> test <- data.frame(first=c(1,2),  second=c(3,4)) 
> total <- c("", sum(test$second))
> rbind(test, Total=total)
      first second
1         1      3
2         2      4
Total            7

> rbind(test, c("Total", colSums(test[,2, drop=FALSE])))
  first second
1     1      3
2     2      4
3 Total      7
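
One caveat with both calls above: the total row contains a character string, so rbind() converts every column to character and the result is only good for printing.  A minimal sketch that checks this on the same toy data frame (the name with_total is only for illustration):

```r
test <- data.frame(first = c(1, 2), second = c(3, 4))

# Row-bind a total row whose first cell is an empty string
with_total <- rbind(test, Total = c("", sum(test$second)))

# Inspect the column classes of the combined result
sapply(with_total, class)

# The original data frame is untouched, so keep computing on that one
sum(test$second)
```

So it is probably safer to keep test as it is for any further computation and build the version with the total row only at the moment of display.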

Pradip K. Muhuri, PhD
SAMHSA/CBHSQ
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260


-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com] 
Sent: Sunday, November 30, 2014 9:16 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ); 'Boris Steipe'
Cc: r-help at r-project.org
Subject: Re: [R] R dplyr solution vs. Base R solution for the select column total

On 30/11/2014, 8:45 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:
> Hi Boris,
> 
> Sorry for not being explicit when replying to your first email.   I wanted to say it does not work when row-binding.  I want the following output.  Thanks,  Pradip
> 
> 
> 1            1      3
> 2            2      4
> Total              7

You are mixing up the computation of results with the presentation of them.  That's the spreadsheet way of thinking, and it's okay for simple things like this, but gets really bogged down when the computations get hard.

In R you can do it, and it's not too hard:

test <- data.frame(first=c(1,2), second=c(3,4))
total <- c("", sum(test$second))
rbind(test, Total=total)

but this isn't really a sensible thing to do:  you can't work with that final result at all.  It makes more sense to leave the data in its original form, then think about how you want to present it, and write a function that displays the result with nice formatting, etc.  That probably won't happen in the R console; you should be using Sweave or knitr or some other package for presentation of the results.
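
A minimal base R sketch of such a display function follows.  The name print_with_total and its arguments are hypothetical, not part of any package, and it assumes the selected columns are numeric.  The data frame itself is never modified; only a character copy gets the extra row.

```r
# Hypothetical helper: print a data frame with a total row for selected
# columns (by default only the last one), leaving the data untouched
print_with_total <- function(df, cols = names(df)[ncol(df)], label = "Total") {
  # Character matrix copy of the data, used only for display
  shown <- do.call(cbind, lapply(df, as.character))
  # Blank total row, filled in only for the requested columns
  total <- setNames(rep("", ncol(df)), names(df))
  total[cols] <- as.character(colSums(df[, cols, drop = FALSE]))
  shown <- rbind(shown, total)
  rownames(shown) <- c(rownames(df), label)
  print(shown, quote = FALSE, right = TRUE)
  invisible(df)  # hand back the original, still-numeric data
}

test <- data.frame(first = c(1, 2), second = c(3, 4))
print_with_total(test)  # total for the last column only
```

Because the function returns its input invisibly, test stays numeric and can still be summed, grouped or modelled after it has been printed.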

Duncan Murdoch


> 
> ################### Below is the console ##########
>> test <- data.frame(first=c(1,2), second=c(3,4))
>> test
>   first second
> 1     1      3
> 2     2      4
>>
>> sum(test$second)
> [1] 7
>>
>> rbind(test, sum(test$second))
>   first second
> 1     1      3
> 2     2      4
> 3     7      7
> 
> Pradip K. Muhuri, PhD
> SAMHSA/CBHSQ
> 1 Choke Cherry Road, Room 2-1071
> Rockville, MD 20857
> Tel: 240-276-1070
> Fax: 240-276-1260
> 
> -----Original Message-----
> From: Boris Steipe [mailto:boris.steipe at utoronto.ca]
> Sent: Sunday, November 30, 2014 5:51 PM
> To: Muhuri, Pradip (SAMHSA/CBHSQ)
> Cc: r-help at r-project.org
> Subject: Re: [R] R dplyr solution vs. Base R solution for the select column total
> 
> No it doesn't ...
> consider:
> 
> test <- data.frame(first=c(1,2), second=c(3,4))
> test
>   first second
> 1     1      3
> 2     2      4
> 
> sum(test$second)
> [1] 7
> 
> 
> 
> 
> On Nov 30, 2014, at 3:48 PM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
> 
>> Hi Boris,
>>
>> That gives me the total for each of the 6 columns of the data frame. I want the column sum just for the last column.
>>
>> Thanks,
>>
>> Pradip Muhuri
>>
>>
>>
>> -----Original Message-----
>> From: Boris Steipe [mailto:boris.steipe at utoronto.ca]
>> Sent: Sunday, November 30, 2014 12:50 PM
>> To: Muhuri, Pradip (SAMHSA/CBHSQ)
>> Cc: r-help at r-project.org
>> Subject: Re: [R] R dplyr solution vs. Base R solution for the select column total
>>
>> try:
>>
>> sum(test$count)
>>
>>
>> B.
>>
>>
>> On Nov 30, 2014, at 12:01 PM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
>>
>>> Hello,
>>>
>>> I am looking for a dplyr or base R solution for the column total - JUST FOR THE LAST COLUMN in the example below. The following code works, giving me the total for each column - This is not exactly what I want.
>>> rbind(test, colSums(test))
>>>
>>> I only want the total for the very last column.  I am struggling with this part of the code:
>>> rbind(test, c("Total", colSums(test, ...)))
>>>
>>> I have searched for a solution on Stack Overflow.  I found some mutate() code for the cumsum, but no luck for the select column total.  Is there a dplyr solution for the select column total?
>>>
>>> Any hints will be appreciated.
>>>
>>> Thanks,
>>>
>>> Pradip Muhuri
>>>
>>>
>>> ####### The following is from the console - the R script with reproducible example is also appended.
>>>
>>>
>>>    mrjflag cocflag inhflag halflag oidflag count
>>> 1        0       0       0       0       0   256
>>> 2        0       0       0       1       1   256
>>> 3        0       0       1       0       1   256
>>> 4        0       0       1       1       1   256
>>> 5        0       1       0       0       1   256
>>> 6        0       1       0       1       1   256
>>> 7        0       1       1       0       1   256
>>> 8        0       1       1       1       1   256
>>> 9        1       0       0       0       1   256
>>> 10       1       0       0       1       1   256
>>> 11       1       0       1       0       1   256
>>> 12       1       0       1       1       1   256
>>> 13       1       1       0       0       1   256
>>> 14       1       1       0       1       1   256
>>> 15       1       1       1       0       1   256
>>> 16       1       1       1       1       1   256
>>> 17       8       8       8       8      15  4096
>>>
>>>
>>>
>>> #######################  below is the reproducible example ########################
>>> library(dplyr)
>>> # generate data
>>> dlist <- rep( list( 0:1 ), 4 )
>>> data <- do.call(expand.grid, dlist)
>>> names(data) <- c('mrjflag', 'cocflag', 'inhflag', 'halflag')
>>> data$id <- 1:nrow(data)
>>>
>>>
>>> # mutate a column and then summarize
>>> test <- data %>%
>>>      mutate(oidflag= ifelse(mrjflag==1 | cocflag==1 | inhflag==1 | halflag==1, 1, 0)) %>%
>>>      group_by(mrjflag,cocflag, inhflag, halflag, oidflag) %>%
>>>      summarise(count=n()) %>%
>>>      arrange(mrjflag,cocflag, inhflag, halflag, oidflag)
>>>
>>>
>>> # This works, giving me the total for each column - this is not exactly what I want.
>>> rbind(test, colSums(test))
>>>
>>> # I only want the total for the very last column
>>> rbind(test, c("Total", colSums(test, ...)))
>>>
>>> Pradip K. Muhuri, PhD
>>> SAMHSA/CBHSQ
>>> 1 Choke Cherry Road, Room 2-1071
>>> Rockville, MD 20857
>>> Tel: 240-276-1070
>>> Fax: 240-276-1260
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
> 


