[R] Odd behavior of a function within apply

Erin Hodgess erinm.hodgess using gmail.com
Tue Aug 9 19:04:11 CEST 2022


Avi, that’s great!

Thanks

On Tue, Aug 9, 2022 at 12:56 PM <avi.e.gross using gmail.com> wrote:

> Yes, David, the function as described handles only type integer or type
> character. If the type is double or anything else, it may well fail because
> y is never initialized.
>
> The goal seems to be to count how many "missing" values are found as in NA
> if a numeric type or an empty string if character.
>
> But you can have some form of NA in all kinds of object types including
> character as in this construct:
>
> > x <- c("a", NA, "", "b", "NA)")
> > x
> [1] "a"   NA    ""    "b"   "NA)"
>
> The above has two useless elements if both NA and "" are considered
> empty (the string "NA)" is ordinary text). So logically the condition could
> be to count NA and, IF it is of type character, also count "".
>
> So rather than play games testing not just is.integer, is.double (or just
> is.numeric) as well as is.logical and is.raw, all of the above can be tested
> with is.na() first to add up how many NA values they contain. If the vector
> is then of type character, you can add any blank strings.
>
> So the algorithm would initialize y to sum(is.na(vec)) and then if the
> vec is character, add the sum of how many empty strings.
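
The algorithm described above could be sketched roughly as follows. This is only an illustration: the name count_missing is hypothetical and does not appear anywhere in the thread, and the choice to skip NA entries when comparing against "" is an assumption made so that NA values are not counted twice.

```r
# Hypothetical helper; the name count_missing is not from the thread.
count_missing <- function(vec) {
  # is.na() is defined for every atomic type, so y is always initialized.
  y <- sum(is.na(vec))
  # For character vectors, also count empty strings. na.rm = TRUE drops the
  # NA comparisons, since those entries were already counted above.
  if (is.character(vec)) {
    y <- y + sum(vec == "", na.rm = TRUE)
  }
  y
}

x <- c("a", NA, "", "b", "NA)")
count_missing(x)        # 2: one NA plus one empty string
count_missing(1:5 / 3)  # 0: a double vector no longer raises an error
```

Because y is set before any type check, the "object 'y' not found" error cannot occur with this version, whatever the column type.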
>
> Alternately, the function should decide what it wants to do if any
> other type is encountered. You can internally convert many things to
> integer or character and then operate on them. Or you can return a zero or
> raise an alarm when given something else.
>
> In this case, simply setting y to zero before using it would make it
> defined and avoid the error, although it would report nothing found for a
> double or logical vector even if it did contain NA.
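
The "raise an alarm" option might look something like the sketch below. It keeps the two branches of the posted count1a unchanged and adds an explicit error for every other type. The name count1a_strict and the wording of the error message are hypothetical, chosen only for illustration.

```r
# Hypothetical variant; count1a_strict is an illustrative name.
count1a_strict <- function(x) {
  if (is.integer(x)) {
    sum(is.na(x))      # same rule as the posted count1a for integers
  } else if (is.character(x)) {
    sum(x == "")       # same rule as the posted count1a for characters
  } else {
    # Fail loudly with the offending type instead of "object 'y' not found".
    stop("count1a_strict: unsupported type '", typeof(x), "'")
  }
}

# Capture the message rather than halting the session.
msg <- tryCatch(count1a_strict(1:5 / 3),
                error = function(e) conditionMessage(e))
msg  # "count1a_strict: unsupported type 'double'"
```

Compared with silently returning zero, an explicit stop() makes it obvious which column types the function was never meant to handle.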
>
>
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of David Carlson
> via R-help
> Sent: Tuesday, August 9, 2022 11:33 AM
> To: Erin Hodgess <erinm.hodgess using gmail.com>
> Cc: r-help using r-project.org
> Subject: Re: [R] Odd behavior of a function within apply
>
> Could you have columns that are not character or integer so that y is
> never defined in the function?
>
> count1a(1:5/3)
> Error in count1a(1:5/3) : object 'y' not found
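
For anyone who wants to reproduce this, a small sketch follows. It uses count1a as posted earlier in the thread, assumes no object named y exists in the workspace, and wraps the call in tryCatch (an addition for illustration) so the error message can be inspected without stopping the session.

```r
# count1a exactly as posted in the original question.
count1a <- function(x) {
  if (typeof(x) == "integer")
    y <- sum(is.na(x))
  if (typeof(x) == "character")
    y <- sum(x == "")
  return(y)
}

v <- 1:5 / 3
typeof(v)  # "double", so neither if() branch assigns y

# Capture the error message instead of halting.
msg <- tryCatch(count1a(v), error = function(e) conditionMessage(e))
msg
```

Any column stored as double (or logical, factor, and so on) would trigger the same failure inside lapply or sapply.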
>
> David Carlson
>
>
> On Mon, Aug 8, 2022 at 1:35 PM Erin Hodgess <erinm.hodgess using gmail.com>
> wrote:
>
> > OK.  I'm back again.
> >
> > So my test1.df is 236x390
> >
> > If I put in the following:
> >  lapply(test1.df,count1a)
> > Error in FUN(X[[i]], ...) : object 'y' not found
> > > lapply(test1.df,count1a)
> > Error in FUN(X[[i]], ...) : object 'y' not found
> > > sapply(test1.df,count1a)
> > Error in FUN(X[[i]], ...) : object 'y' not found
> > >
> > What am I doing wrong, please?
> > Thanks,
> > Erin
> >
> >
> > Erin Hodgess, PhD
> > mailto: erinm.hodgess using gmail.com
> >
> >
> > On Mon, Aug 8, 2022 at 1:41 PM Erin Hodgess <erinm.hodgess using gmail.com> wrote:
> >
> > > Awesome, thanks so much!!
> > >
> > > Erin Hodgess, PhD
> > > mailto: erinm.hodgess using gmail.com
> > >
> > >
> > > On Mon, Aug 8, 2022 at 1:38 PM John Fox <jfox using mcmaster.ca> wrote:
> > >
> > >> Dear Erin,
> > >>
> > > >> The problem is that the data frame gets coerced to a character
> > > >> matrix, and the only column with "" entries is the 9th (the second
> > > >> one you supplied):
> > >>
> > >> as.matrix(test1.df)
> > >>     X1_1_HZP1 X1_1_HBM1_mon X1_1_HBM1_yr
> > >> 1  "48160"   "December"    "2014"
> > >> 2  "48198"   "June"        "2018"
> > >> 3  "80027"   "August"      "2016"
> > >> 4  "48161"   ""            NA
> > >> 5  NA        ""            NA
> > >> 6  "48911"   "August"      "1985"
> > >> 7  NA        "April"       "2019"
> > >> 8  "48197"   "February"    "1993"
> > >> 9  "48021"   ""            NA
> > >> 10 "11355"   "December"    "1990"
> > >>
> > >> (Here, test1.df only contains the three columns you provided.)
> > >>
> > >> A solution is to use sapply:
> > >>
> > >>  > sapply(test1.df, count1a)
> > >>      X1_1_HZP1 X1_1_HBM1_mon  X1_1_HBM1_yr
> > >>              2             3             3
> > >>
> > >>
> > >> I hope this helps,
> > >>   John
> > >>
> > >>
> > >> On 2022-08-08 1:22 p.m., Erin Hodgess wrote:
> > >> > Hello!
> > >> >
> > >> > I have the following data.frame
> > >> >   dput(test1.df[1:10,8:10])
> > >> > structure(list(X1_1_HZP1 = c(48160L, 48198L, 80027L, 48161L, NA,
> > >> > 48911L, NA, 48197L, 48021L, 11355L), X1_1_HBM1_mon =
> > >> > c("December", "June", "August", "", "", "August", "April",
> > >> > "February", "", "December"), X1_1_HBM1_yr = c(2014L, 2018L,
> > >> > 2016L, NA, NA, 1985L, 2019L, 1993L, NA, 1990L)), row.names =
> > >> > c(NA, 10L), class = "data.frame")
> > >> >
> > >> > And the following function:
> > >> >> dput(count1a)
> > >> > function (x)
> > >> > {
> > >> >      if (typeof(x) == "integer")
> > >> >          y <- sum(is.na(x))
> > >> >      if (typeof(x) == "character")
> > >> >          y <- sum(x == "")
> > >> >      return(y)
> > >> > }
> > >> > When I use the apply function with count1a, I get the following:
> > >> >   apply(test1.df[1:10,8:10],2,count1a)
> > >> >      X1_1_HZP1 X1_1_HBM1_mon  X1_1_HBM1_yr
> > >> >             NA             3            NA
> > >> > However, when I do use columns 8 and 10, I get the correct response:
> > >> >   apply(test1.df[1:10,c(8,10)],2,count1a)
> > >> >     X1_1_HZP1 X1_1_HBM1_yr
> > >> >             2            3
> > >> >>
> > > >> > I am really baffled.  If I use count1a on a single column, it works
> > > >> > fine.
> > >> >
> > >> > Any suggestions much appreciated.
> > >> > Thanks,
> > >> > Sincerely,
> > >> > Erin
> > >> >
> > >> >
> > >> > Erin Hodgess, PhD
> > >> > mailto: erinm.hodgess using gmail.com
> > >> >
> > >> >       [[alternative HTML version deleted]]
> > >> >
> > > >> > ______________________________________________
> > > >> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > >> > https://stat.ethz.ch/mailman/listinfo/r-help
> > > >> > PLEASE do read the posting guide
> > > >> > http://www.R-project.org/posting-guide.html
> > > >> > and provide commented, minimal, self-contained, reproducible code.
> > >> --
> > >> John Fox, Professor Emeritus
> > >> McMaster University
> > >> Hamilton, Ontario, Canada
> > > >> web: https://socialsciences.mcmaster.ca/jfox/
> > >>
> > >>
-- 
Erin Hodgess, PhD
mailto: erinm.hodgess using gmail.com
