[R] Creating a simple function

Jeff Newmiller
Sat Sep 21 15:05:12 CEST 2019


Your use of subset instead of select does not help, but a corrected example does indeed confirm your point.

library(dplyr)

str(data.frame(a=c(1,1,2,2), b=1:4) %>% select(b,a))
## 'data.frame':	4 obs. of 2 variables: 
## $ b: int 1 2 3 4
## $ a: num 1 1 2 2
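
For comparison, here is a minimal sketch of the same call on a tibble input. The object name df is made up for illustration, and this is only one plausible way a tibble could end up being passed to ctab(); we have not seen Zachary's actual data.

```r
library(dplyr)

# Illustrative data only, not the original poster's data.
df <- data.frame(a = c(1, 1, 2, 2), b = 1:4)

# select() keeps the class of its input: a plain data.frame stays a data.frame ...
class(df %>% select(b, a))

# ... and a tibble stays a tibble (as_tibble() is re-exported by dplyr).
class(as_tibble(df) %>% select(b, a))
```

So if the data was read in with a function that produces a tibble in the first place, select() would pass a tibble along to ctab().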

However, the `[` issue is still worth addressing. If that does not fix the problem, then a dput(head(troublesomedata)) from Zachary will be needed to figure out what is actually going on.
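
To illustrate the `[` point, here is a minimal sketch, assuming the tibble package is installed. The names df and tb are made up for illustration and are not from the original data.

```r
library(tibble)

# Illustrative data only.
df <- data.frame(a = c(1, 1, 2, 2), b = 1:4)
tb <- as_tibble(df)

# Single-column `[` with a column index: the data.frame drops to a vector,
# while the tibble stays a one-column tibble.
is.data.frame(df[, 1])
is.data.frame(tb[, 1])

# `[[` extracts the column vector in both cases.
identical(df[[1]], tb[[1]])
```

This is why data[[1]] is the safer choice inside a function that may receive either kind of object.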

On September 21, 2019 5:22:07 AM PDT, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>On 21/09/2019 7:38 a.m., Jeff Newmiller wrote:
>> The dplyr::select function returns a special variety of data.frame
>> called a tibble.
>
>I don't think that's always true.  The docs say it returns "An object
>of the same class as .data.", and that's what I'm seeing:
>
> > str(data.frame(a=c(1,1,2,2), b=1:4) %>% subset(a == 1))
>'data.frame':	2 obs. of  2 variables:
>  $ a: num  1 1
>  $ b: int  1 2
>
>But I believe there are other dplyr functions that take dataframes as 
>input and return tibbles, I just don't know which ones.
>
>Duncan Murdoch
>
>> The tibble has certain features designed to make it behave consistently
>> when indexing is used. Specifically, the `[` operator always returns a
>> tibble regardless of how many columns are indicated by the column index.
>> This is unlike the conventional data frame which returns a vector when
>> exactly one column is indicated by the column index, or a data.frame if
>> more than one is indicated.
>> 
>> A syntax that consistently yields a column vector with both tibbles
>> and data.frames is
>> 
>> dta[[ 1 ]]
>> 
>> so
>> 
>> ctab <- function(data) {
>>     CrossTable(data[[1]], data[[2]], prop.chisq = FALSE, prop.c = FALSE,
>>                prop.t = FALSE, format = "SPSS")
>> }
>> 
>> should work.
>> 
>> On September 20, 2019 10:59:46 AM PDT, Duncan Murdoch
>> <murdoch.duncan using gmail.com> wrote:
>>> On 20/09/2019 11:30 a.m., Zachary Lim wrote:
>>>> Hi,
>>>>
>>>> I'm trying to create a simple function that takes a dataframe as its
>>>> only argument. I've been using gmodels::CrossTable, but it requires a
>>>> lot of arguments, e.g.:
>>>>
>>>> #this runs fine
>>>> CrossTable(data$col1, data$col2, prop.chisq = FALSE, prop.c = FALSE,
>>>>            prop.t = FALSE, format = "SPSS")
>>>>
>>>> Moreover, I wanted to make it compatible with piping, so I decided to
>>>> create the following function:
>>>>
>>>> ctab <- function(data) {
>>>>     CrossTable(data[,1], data[,2], prop.chisq = FALSE, prop.c = FALSE,
>>>>                prop.t = FALSE, format = "SPSS")
>>>> }
>>>>
>>>> When I try to use this function, however, I get the following error:
>>>>
>>>> #this results in 'Error: Must use a vector in `[`, not an object of
>>>> class matrix.'
>>>> data %>% select(col1, col2) %>% ctab()
>>>>
>>>> I tried searching online but couldn't find much about that error
>>>> (except for in specific and unrelated cases). Moreover, when I created
>>>> a very simple dataset, it turns out there's no problem:
>>>>
>>>> #this runs fine
>>>> data.frame(C1 = c('x','y','x','y'), C2 = c('a','a','b','b')) %>%
>>>>     ctab()
>>>>
>>>>
>>>> Is this a problem with my function or the data? If it's the data, why
>>>> does directly calling CrossTable work?
>>>
>>> Presumably  data %>% select(col1, col2)  isn't giving you a dataframe.
>>> However, you haven't given us a reproducible example, so I can't tell
>>> you what it's doing.  But that's where you should look.
>>>
>>> Duncan Murdoch
>>>
>> 
