[R] R dplyr solution vs. Base R solution for the select column total
Muhuri, Pradip (SAMHSA/CBHSQ)
Pradip.Muhuri at samhsa.hhs.gov
Mon Dec 1 04:08:03 CET 2014
Hi Boris,
Excellent point. Yes, I want to convert it to the numeric type. Your code has worked well on the real data set. The issue is resolved.
Thanks so much for your help!
Pradip
-----Original Message-----
From: Boris Steipe [mailto:boris.steipe at utoronto.ca]
Sent: Sunday, November 30, 2014 9:42 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: r-help at r-project.org
Subject: Re: [R] R dplyr solution vs. Base R solution for the slect column total
What do you think should be in the empty cells? Zero? NA? Empty strings? There can't just be nothing...
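To see why a filler is needed, here is a minimal sketch (the name with_scalar is only for illustration and is not from the thread). A data frame is rectangular, so rbind() recycles a single value across every column rather than leaving a cell empty:

```r
test <- data.frame(first = c(1, 2), second = c(3, 4))

# rbind() with a length-one value recycles it to fill the whole new row
with_scalar <- rbind(test, sum(test$second))

with_scalar[3, "first"]   # 7, not an empty cell
with_scalar[3, "second"]  # 7
```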
Here's an example with empty strings "" as the filler element - but do consider carefully what Duncan wrote.
test <- data.frame(first=c(1,2), second=c(3,4))
typeof(test[1,1]) # double
# rbind() a vector that repeats the "empty" element one-less-than-ncol() times,
# and has the column sum as its last element.
test <- rbind(test, c(rep("", ncol(test)-1), sum(test$second)))
test
  first second
1     1      3
2     2      4
3            7
# but...!
typeof(test[1,1]) # character!
typeof(test[2,2]) # also character!
By adding characters to your columns, you cast all of your data into character type!
If you want to *do* anything with the number, you'll need to cast it back to numeric.
Or use 0 or NA as the filler element.
test <- rbind(test, c(rep(NA, ncol(test)-1), sum(test$second)))
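A small sketch of both options, starting from a fresh copy of the data frame (with_na, with_empty and back_to_num are illustrative names, not from the thread). It suggests that an NA filler keeps the columns numeric, and shows as.numeric() as one way to cast a character column back:

```r
test <- data.frame(first = c(1, 2), second = c(3, 4))

# NA filler: c(NA, 7) is a numeric vector, so the columns stay numeric
with_na <- rbind(test, c(rep(NA, ncol(test) - 1), sum(test$second)))
typeof(with_na[1, 1])        # "double"
sum(with_na$second[1:2])     # arithmetic still works: 7

# "" filler: the columns become character and must be cast back before use
with_empty <- rbind(test, c(rep("", ncol(test) - 1), sum(test$second)))
back_to_num <- as.numeric(with_empty$second)
back_to_num                  # 3 4 7
```

Note that with the NA filler the first cell of the total row is NA, so functions such as sum() need na.rm = TRUE when applied to that column.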
But anyway ... as others have said, you may want to reconsider the logic of your approach.
B.
On Nov 30, 2014, at 8:45 PM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
> Hi Boris,
>
> Sorry for not being explicit when replying to your first email. I wanted to say it does not work when row-binding. I want the following output. Thanks, Pradip
>
>
> 1         1      3
> 2         2      4
> Total            7
>
> ################### Below is the console ##########
>> test <- data.frame(first=c(1,2), second=c(3,4))
>> test
>   first second
> 1     1      3
> 2     2      4
>>
>> sum(test$second)
> [1] 7
>>
>> rbind(test, sum(test$second))
>   first second
> 1     1      3
> 2     2      4
> 3     7      7
>
> Pradip K. Muhuri, PhD
> SAMHSA/CBHSQ
> 1 Choke Cherry Road, Room 2-1071
> Rockville, MD 20857
> Tel: 240-276-1070
> Fax: 240-276-1260
>
> -----Original Message-----
> From: Boris Steipe [mailto:boris.steipe at utoronto.ca]
> Sent: Sunday, November 30, 2014 5:51 PM
> To: Muhuri, Pradip (SAMHSA/CBHSQ)
> Cc: r-help at r-project.org
> Subject: Re: [R] R dplyr solution vs. Base R solution for the slect
> column total
>
> No it doesn't ...
> consider:
>
> test <- data.frame(first=c(1,2), second=c(3,4))
> test
>   first second
> 1     1      3
> 2     2      4
>
> sum(test$second)
> [1] 7
>
>
>
>
> On Nov 30, 2014, at 3:48 PM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
>
>> Hi Boris,
>>
>> That gives me the total for each of the 6 columns of the data frame. I want the column sum just for the last column.
>>
>> Thanks,
>>
>> Pradip Muhuri
>>
>>
>>
>> -----Original Message-----
>> From: Boris Steipe [mailto:boris.steipe at utoronto.ca]
>> Sent: Sunday, November 30, 2014 12:50 PM
>> To: Muhuri, Pradip (SAMHSA/CBHSQ)
>> Cc: r-help at r-project.org
>> Subject: Re: [R] R dplyr solution vs. Base R solution for the slect
>> column total
>>
>> try:
>>
>> sum(test$count)
>>
>>
>> B.
>>
>>
>> On Nov 30, 2014, at 12:01 PM, Muhuri, Pradip (SAMHSA/CBHSQ) <Pradip.Muhuri at samhsa.hhs.gov> wrote:
>>
>>> Hello,
>>>
>>> I am looking for a dplyr or base R solution for the column total - JUST FOR THE LAST COLUMN in the example below. The following code works, giving me the total for each column - This is not exactly what I want.
>>> rbind(test, colSums(test))
>>>
>>> I only want the total for the very last column. I am struggling with this part of the code:
>>> rbind(test, c("Total", colSums(test, ...)))
>>> I have searched for a solution on Stack Overflow. I found some mutate() code for the cumsum, but had no luck with the select column total. Is there a dplyr solution for the select column total?
>>>
>>> Any hints will be appreciated.
>>>
>>> Thanks,
>>>
>>> Pradip Muhuri
>>>
>>>
>>> ####### The following is from the console - the R script with reproducible example is also appended.
>>>
>>>
>>>    mrjflag cocflag inhflag halflag oidflag count
>>> 1        0       0       0       0       0   256
>>> 2        0       0       0       1       1   256
>>> 3        0       0       1       0       1   256
>>> 4        0       0       1       1       1   256
>>> 5        0       1       0       0       1   256
>>> 6        0       1       0       1       1   256
>>> 7        0       1       1       0       1   256
>>> 8        0       1       1       1       1   256
>>> 9        1       0       0       0       1   256
>>> 10       1       0       0       1       1   256
>>> 11       1       0       1       0       1   256
>>> 12       1       0       1       1       1   256
>>> 13       1       1       0       0       1   256
>>> 14       1       1       0       1       1   256
>>> 15       1       1       1       0       1   256
>>> 16       1       1       1       1       1   256
>>> 17       8       8       8       8      15  4096
>>>
>>>
>>>
>>> ####################### below is the reproducible example ########################
>>> library(dplyr)
>>> # generate data
>>> dlist <- rep( list( 0:1 ), 4 )
>>> data <- do.call(expand.grid, dlist)
>>> # name the four flag columns before adding the id column
>>> names(data) <- c('mrjflag', 'cocflag', 'inhflag', 'halflag')
>>> data$id <- 1:nrow(data)
>>>
>>>
>>> # mutate a column and then summarize
>>> test <- data %>%
>>> mutate(oidflag= ifelse(mrjflag==1 | cocflag==1 | inhflag==1 | halflag==1, 1, 0)) %>%
>>> group_by(mrjflag,cocflag, inhflag, halflag, oidflag) %>%
>>> summarise(count=n()) %>%
>>> arrange(mrjflag,cocflag, inhflag, halflag, oidflag)
>>>
>>>
>>> # This works, giving me the total for each column - This is not exactly what I want.
>>> rbind(test, colSums(test))
>>>
>>> # I only want the total for the very last column
>>> rbind(test, c("Total", colSums(test, ...)))
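For the select-column total asked about above, one possible dplyr sketch (not taken from the thread) is to summarise only the count column and append the result with bind_rows(), which fills the remaining columns with NA. The name with_total is an assumption for illustration, the id column is left out for brevity, and the grouping is dropped with ungroup() first so that summarise() returns a single row:

```r
library(dplyr)

# rebuild a small version of the example: all 0/1 combinations of four flags
dlist <- rep(list(0:1), 4)
data <- do.call(expand.grid, dlist)
names(data) <- c("mrjflag", "cocflag", "inhflag", "halflag")

test <- data %>%
  mutate(oidflag = ifelse(mrjflag == 1 | cocflag == 1 | inhflag == 1 | halflag == 1, 1, 0)) %>%
  group_by(mrjflag, cocflag, inhflag, halflag, oidflag) %>%
  summarise(count = n()) %>%
  ungroup()

# one-row summary holding only the total of the last column;
# bind_rows() pads the columns missing from that row with NA
with_total <- bind_rows(test, summarise(test, count = sum(count)))

tail(with_total, 1)
```

In this toy data there are 16 groups of one row each, so the appended total is 16. The flag columns of the total row are NA rather than a "Total" label, which keeps every column numeric.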
>>>
>>> Pradip K. Muhuri, PhD
>>> SAMHSA/CBHSQ
>>> 1 Choke Cherry Road, Room 2-1071
>>> Rockville, MD 20857
>>> Tel: 240-276-1070
>>> Fax: 240-276-1260