[R] [EXTERNAL] RE: I need to create new variables based on two numeric variables and one dichotomize conditional category variables.

avi.e.gross using gmail.com
Sat Nov 4 06:08:03 CET 2023


To be fair, Jorgen, I think R's lazy evaluation means the arguments are in
some cases NOT evaluated until needed. If every row is male, say, the
"female" expression is never computed at all. With mixed data, though, both
choices are computed once for the whole vector, so my suggestion does save a
multiplication per row, along with some typing, a little clarity, and a
small amount of memory and parsing.
 
But ifelse() is currently implemented somewhat too complexly for my taste.
Just type "ifelse" at the prompt and you will see many lines of code that
handle various scenarios.
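
A small sketch of what ifelse() actually evaluates, for anyone who wants to check. The counter and the helper names yes_branch() and no_branch() are made up for this illustration, as are the data values:

```r
# Count how often each branch expression is evaluated.
# (calls, yes_branch and no_branch are hypothetical helpers.)
calls <- c(yes = 0, no = 0)
yes_branch <- function(x) { calls["yes"] <<- calls["yes"] + 1; x - 65 }
no_branch  <- function(x) { calls["no"]  <<- calls["no"]  + 1; x - 58 }

G  <- c("male", "female", "male")
WC <- c(100, 90, 80)

# Mixed genders: each branch is computed once, over the whole vector.
mixed <- ifelse(G == "male", yes_branch(WC), no_branch(WC))
calls_mixed <- calls

# All male: the test never selects `no`, so lazy evaluation skips it.
calls[] <- 0
all_male <- ifelse(rep("male", 3) == "male", yes_branch(WC), no_branch(WC))
calls_all_male <- calls
```

Here calls_mixed records one evaluation of each branch, while calls_all_male records one for "yes" and none for "no".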
 
If you KNOW you have a certain situation such as a data.frame with multiple
rows and are sure a simpler solution works, there may well be faster ways to
do this. Obviously you could write a function that is called once per row
and returns the answer, or a vectorized version that returns a vector whose
entries are each 65 or 58. Or you could add a few lines of code creating
such a vector, perhaps as a temporary new column, along these lines:
 
logic_male <- df$G == "male"

age <- numeric(nrow(df))   # create the vector before filling it in
age[logic_male] <- 65
age[!logic_male] <- 58
 
Then use age directly in the formula, as in TG * (WC - age), since it holds
exactly what the ifelse() would have returned. You can then delete "age",
whether it is stand-alone or a column.
 
What is more efficient depends on your data.
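
As a rough end-to-end sketch of the indexing idea, assuming a data frame df with columns G, WC and TG as in the thread (the values are invented for illustration), with "age" kept as a temporary column:

```r
# Hypothetical data; column names follow the thread.
df <- data.frame(
  G  = c("male", "female", "female", "male"),
  WC = c(100, 90, 85, 70),
  TG = c(2, 1.5, 1, 3)
)

logic_male <- df$G == "male"

# Temporary column holding the gender-specific offset.
df$age <- 58                  # start with the female value everywhere
df$age[logic_male] <- 65      # overwrite the male rows

df$LAP <- df$TG * (df$WC - df$age)
df$age <- NULL                # drop the helper column once LAP exists
```

Note that this treats every row that is not "male" as female, which is only safe if G really has just those two values.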
 
Do note though that an advantage of using ifelse() is when you have nested
conditions which cannot trivially be written out along the lines above, but
I find that sometimes such nested expressions may be easier to read using
other techniques such as the dplyr function case_when().
Here is an example of code where some entries are NA or not categorized:
 
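library(dplyr)   # case_when() lives in dplyr; library(tidyverse) also works
WC <- 100
TG <- 2
Gender <- c("male", "female", "no comment", NA, "female")

# Conditions are tried in order; the first match wins.
# The .default argument needs dplyr 1.1.0 or later.
result <- TG * (WC -
                  case_when(
                    is.na(Gender) ~ NA,
                    Gender == "male" ~ 65,
                    Gender == "female" ~ 58,
                    .default = NA
                  ))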
 
The above example gives:
 
> result
[1] 70 84 NA NA 84
 
 
If you later want to add categories such as "transgender" with a value of 61
or have other numbers for groups like "Hispanic male", you can amend the
instructions as long as you put your conditions in an order so that they are
tried until one of them matches, or it takes the default. Yes, in a sense
the above is doable using a deeply nested ifelse(), but I find case_when()
easier to read, write and evaluate. It may or may not be more efficient;
some of dplyr is compiled code.
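
For comparison, here is a sketch of the same Gender example as a nested ifelse() in base R only. The names offset and result_nested are made up for this illustration:

```r
WC <- 100
TG <- 2
Gender <- c("male", "female", "no comment", NA, "female")

# An NA in the test gives NA in the answer, so the missing entry needs no
# special case; "no comment" falls through to the final NA.
offset <- ifelse(Gender == "male", 65,
          ifelse(Gender == "female", 58, NA))

result_nested <- TG * (WC - offset)
result_nested
# [1] 70 84 NA NA 84
```

Each additional category adds another level of nesting, which is where the flat list of conditions starts to read better.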
 
Please note some here prefer discussions about base-R functionality and some
have qualms about the tidyverse for various reasons. I don't, and I find much
of its functionality easier to use.
 
 
 
From: Jorgen Harmse <JHarmse using roku.com> 
Sent: Friday, November 3, 2023 6:27 PM
To: avi.e.gross using gmail.com; r-help using r-project.org; mkzaman.m using gmail.com
Subject: Re: [EXTERNAL] RE: [R] I need to create new variables based on two
numeric variables and one dichotomize conditional category variables.
 
Yes, that will halve the number of multiplications.
 
If you're looking for such optimisations then you can also consider
ifelse(G=='male', 65L, 58L). That will definitely use less time & memory if
WC is integer, but the trade-offs are more complicated if WC is floating
point.
 
Regards,
Jorgen Harmse.
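
A quick way to check the integer point, assuming WC is stored as an integer vector (the values and the names off_int and off_dbl are invented for illustration):

```r
G  <- c("male", "female", "male")
WC <- c(100L, 90L, 80L)        # integer waist circumference (hypothetical)

off_int <- ifelse(G == "male", 65L, 58L)
off_dbl <- ifelse(G == "male", 65,  58)

typeof(WC - off_int)   # "integer": the subtraction stays in integer arithmetic
typeof(WC - off_dbl)   # "double": WC is coerced before subtracting

# Integer vectors take 4 bytes per element, doubles take 8.
n <- 1e5
int_is_smaller <- object.size(integer(n)) < object.size(numeric(n))
```

If TG is floating point, the final multiplication converts the result to double anyway, which is part of why the trade-off is less clear in that case.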


 
From: avi.e.gross using gmail.com
Date: Friday, November 3, 2023 at 16:12
To: Jorgen Harmse <JHarmse using roku.com>, r-help using r-project.org,
mkzaman.m using gmail.com
Subject: [EXTERNAL] RE: [R] I need to create new variables based on two
numeric variables and one dichotomize conditional category variables.
Just a minor point in the suggested solution:

df$LAP <- with(df, ifelse(G=='male', (WC-65)*TG, (WC-58)*TG))

since WC and TG are not conditional, would this be a slight improvement?

df$LAP <- with(df, TG*(WC - ifelse(G=='male', 65, 58)))
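
A minimal check that the two forms agree, using an invented three-row data frame with the thread's column names (lap_two_products and lap_factored are made-up names):

```r
# Hypothetical data; column names follow the thread.
df <- data.frame(
  G  = c("male", "female", "male"),
  WC = c(100, 90, 80),
  TG = c(2, 1.5, 1)
)

# Original form: two full products, then pick one per row.
lap_two_products <- with(df, ifelse(G == "male", (WC - 65) * TG, (WC - 58) * TG))

# Factored form: pick the offset per row, then one subtraction and one product.
lap_factored <- with(df, TG * (WC - ifelse(G == "male", 65, 58)))

same <- isTRUE(all.equal(lap_two_products, lap_factored))
```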



-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Jorgen Harmse via
R-help
Sent: Friday, November 3, 2023 11:56 AM
To: r-help using r-project.org; mkzaman.m using gmail.com
Subject: Re: [R] I need to create new variables based on two numeric
variables and one dichotomize conditional category variables.

df$LAP <- with(df, ifelse(G=='male', (WC-65)*TG, (WC-58)*TG))

That will do both calculations and merge the two vectors appropriately. It
will use extra memory, but it should be much faster than a 'for' loop.

Regards,
Jorgen Harmse.

------------------------------

Message: 8
Date: Fri, 3 Nov 2023 11:10:49 +1030
From: "Md. Kamruzzaman" <mkzaman.m using gmail.com>
To: r-help using r-project.org
Subject: [R] I need to create new variables based on two numeric
        variables and one dichotomize conditional category variables.
Message-ID:
        <CAGbxoeGjsxZKQ6qijEMq-X-5doqnQQS1jjPDDrGT6hH5xWqOKQ using mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hello Everyone,
I have three variables: waist circumference (WC), serum triglyceride (TG)
level and gender. Waist circumference and serum triglyceride are numeric and
gender (male and female) is categorical. From these three variables, I want
to calculate the "Lipid Accumulation Product (LAP) Index". The equation to
calculate LAP is different for males and females. I am giving both equations
below.

LAP for male = (WC-65)*TG
LAP for female = (WC-58)*TG

My question is: how can I calculate the LAP and create a single new column?

Your cooperation will be highly appreciated.

Thanks in advance.

With Regards

*--------------------------------*

*Md Kamruzzaman*

*PhD **Research Fellow (**Medicine**)*
Discipline of Medicine and Centre of Research Excellence in Translating
Nutritional Science to Good Health
Adelaide Medical School | Faculty of Health and Medical Sciences
The University of Adelaide
Adelaide SA 5005

        [[alternative HTML version deleted]]




