[R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Fri Jan 28 19:30:39 CET 2022


Javed,
Your explanation allows many other ways to look at the problem. 

Some of them skip steps and get to the point faster. Of course, I do not know what exactly you mean by the "fairness object" other than guessing it does an evaluation of what you supply and lets you know if it is fair.

For something categorical like gender it used to be easy to use the table() function to show how many of each category you have. Of course, it now seems that old assumptions about two genders are being replaced by additional choices so it may literally be nonbinary.

Your code looked for 'T14' which gives no clue about purpose. Here is an example where I coded the words "male" and "female" in a small sample for illustration. You can leave the data as is and have it automatically count or take percentages and then extract whatever you want and use it to make decisions.

The darn HTML stripper this list uses makes showing code hard, so I have to intersperse it with extra spacing.

Here is some data:

   gender <- c("male", "female", "female", "male", "female", "female", "female")

I made it lopsided and you can see the counts easily enough with:

   tab.cnt <- table(gender)


The output is:
> tab.cnt
gender
female   male 
     5      2 

You can of course get percentages using the table object:

   tab.prcnt <- prop.table(tab.cnt)

The output is:

> tab.prcnt
gender
   female      male 
0.7142857 0.2857143 

You can, of course, multiply the above by a hundred and use round() to trim it to fewer digits. More usefully, you can extract individual numbers and use them in a comparison. Suppose you decide that more than 60% females is too many:

if (tab.prcnt[["female"]] > 0.6)  print("too many women")
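
To put the pieces above together, here is a minimal sketch that goes from the raw vector to the decision in one run. The names tab.pct, female_share, threshold and too_many are my own for illustration, and the 0.6 cut-off is just the example value used above, not a recommendation.

```r
gender <- c("male", "female", "female", "male", "female", "female", "female")

tab.cnt   <- table(gender)          # counts per category
tab.prcnt <- prop.table(tab.cnt)    # proportions that sum to 1

# Percentages rounded to one decimal place, for display only
tab.pct <- round(100 * tab.prcnt, 1)

# Pull out a single plain number by name before comparing
female_share <- tab.prcnt[["female"]]

threshold <- 0.6                    # illustrative cut-off, same value as above
too_many  <- female_share > threshold

if (too_many) print("too many women")
```

Extracting with [[ ]] gives if() one unnamed number to compare, rather than a table object.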

Your criteria may of course be more complicated, but my point is that there are built-in methods you can use as you get to know not only the language but also the techniques that work well with it. Your approach of converting your data from one representation to a numeric one may suit your needs. Realistically, many would simply use another built-in feature called factors. Converting my data to a factor does this:

> fact <- factor(gender)
> fact
[1] male   female female male   female female female
Levels: female male
> as.numeric(fact)
[1] 2 1 1 2 1 1 1

The default is to use integers starting with 1, assigned to the levels in sorted order (so female is 1 and male is 2 here), but you can change that in many ways, or in the above, simply subtract 1 to get what you want. To get the proportion of men in the above, you can do something like this:

> mean(as.numeric(fact) - 1)
[1] 0.2857143
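
One of the "many ways" to control the coding is to name the levels yourself instead of relying on the sorted default. The sketch below is only an illustration under that assumption; the names is_male, prop_male and prop_male2 are made up for the example.

```r
gender <- c("male", "female", "female", "male", "female", "female", "female")

# Naming the levels explicitly fixes the coding: female -> 1, male -> 2
fact <- factor(gender, levels = c("female", "male"))

# Subtract 1 to get a 0/1 indicator where 1 means "male"
is_male <- as.integer(fact) - 1L

prop_male  <- mean(is_male)            # proportion of men via the factor codes
prop_male2 <- mean(gender == "male")   # same number straight from the strings
```

The last line shows that for a simple proportion you may not need the factor at all, since the mean of a logical vector is the fraction of TRUE values.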

You may get lots of advice on many methods and ways to do things but pick what fits your situation and sometimes you can try to change the situation. For some purposes, categorical data needs to be transformed for proper use in something like machine learning algorithms but sometimes it can be left alone as shown above and the statistics can be worked with. 
From: javed khan <javedbtk111 using gmail.com>
To: Avi Gross <avigross using verizon.net>
Cc: r-help using r-project.org <r-help using r-project.org>
Sent: Fri, Jan 28, 2022 8:34 am
Subject: Re: Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed

Avi Gross, thanks for your reply. 
I have no interest in using zero and one in my code; TRUE and FALSE would also be fine, because I do not need to do any arithmetic with them. 
I just want to pass a protected variable and one of its (privileged) values to the fairness object to see whether the model has any bias towards the unprivileged values of the protected variable. 
You can consider my protected variable to be Sex, with the values male and female. I want the fairness object to check for bias towards the female group, which could be considered the unprivileged group. 


Thanks

On Thursday, January 27, 2022, Avi Gross via R-help <r-help using r-project.org> wrote:


Javed,
You may misunderstand something here.
Forget ifelse(), which does all kinds of things (you can see its source by typing "ifelse" and pressing ENTER).
Your initial goal should be kept in mind. You want to create a data structure, in this case a vector, that is the same length as another vector called test$operator in which you mark whether the corresponding element was exactly "T13" or not.
There is nothing fundamentally wrong with your approach albeit it is overkill in this case. As has been pointed out, SKIPPING ifelse() entirely, you can get a vector of Logicals (TRUE or FALSE) by a simple command like this:
    result <- test$operator == 'T13'
For many purposes, that is all you need. TRUE and FALSE are also sometimes mapped into 1 and 0 for various purposes, so you can convert them into integers or general numerics if that is needed. Consider the following code that checks the integers from 1 to 7 to see if they are even (as in divisible by 2):

> result <- 1:7 %% 2 == 0
> result
[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> as.integer(result)
[1] 0 1 0 1 0 1 0
> as.numeric(result)
[1] 0 1 0 1 0 1 0
> result <- as.integer(1:7 %% 2 == 0)
> result
[1] 0 1 0 1 0 1 0
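
Applied to your case, the same idea might look like the sketch below. I have not seen your data, so the data frame here is a made-up stand-in with a column called operator, and I have assumed it may contain a missing value to show what happens then. The names is_t13, prot and prot_no_na are only for illustration.

```r
# Hypothetical stand-in for the poster's data; the real 'test' is not shown
test <- data.frame(operator = c("T13", "T07", NA, "T13", "T21"),
                   stringsAsFactors = FALSE)

is_t13 <- test$operator == "T13"   # logical vector, NA where operator is NA
prot   <- as.integer(is_t13)       # 1/0 coding, with the NA carried through

# %in% never returns NA, so a missing operator counts as "not T13"
prot_no_na <- as.integer(test$operator %in% "T13")
```

Whether a missing operator should stay NA or be treated as "not T13" depends on your data, so that choice is yours to make.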

If for some reason the choice of 1 and 0 is the opposite of what you need, you can invert them several ways with the simplest being:

    as.integer(1:7 %% 2 != 0)
or
    as.integer(!(1:7 %% 2 == 0))

The first reverses the comparison itself and the second applies ! to the original result, flipping every FALSE and TRUE to the other.
Why are we talking about this? For many more interesting cases, ifelse() is great as you can replace one or both of the choices with anything. A very common case is replacing one choice with itself and changing the other, or nesting the comparisons in a sort of simulated tree as in 
    ifelse(some_condition,
           ifelse(second_condition, result1, result2),
           ifelse(third_condition, result3, result4))
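
To make the nested pattern concrete, here is a small runnable sketch. The vector x and the labels are invented for the example and have nothing to do with your data.

```r
x <- c(-3, 0, 4, 7)

# Outer test splits positive from non-positive; each branch is split again
label <- ifelse(x > 0,
                ifelse(x %% 2 == 0, "positive even", "positive odd"),
                ifelse(x == 0, "zero", "negative"))

print(label)
```

Each ifelse() call works on the whole vector at once, so the result has one label per element of x.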

But you seem to want the simplest return of two values that also happen to be the underlying equivalent of TRUE and FALSE in many languages. In R, much as in Python, a numeric zero (or the logical value FALSE) is treated as FALSE in a condition, and any other number like 1 or 666 is treated as TRUE, as shown below:

> if (TRUE) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (1) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (666) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (FALSE) print("TRUE") else print("FALSE")
[1] "FALSE"
> if (0) print("TRUE") else print("FALSE")
[1] "FALSE"

This is why you are being told that for many purposes, the Boolean vector may work fine. But if you really want or need zero and one, that is a trivial transformation as shown. Feel free to use ifelse() and then figure out what went wrong with your code, but also to try the simpler version and see if the problem goes away.
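
As for the error in the subject line, that is what R reports when the condition handed to if() is NA rather than TRUE or FALSE. I cannot see where fraction is computed in your case, so the sketch below simply assumes it ended up missing. The variable names and the replacement strings are made up for illustration.

```r
fraction <- NA_real_   # assumption: some upstream step produced a missing value

# tryCatch() captures the error text instead of stopping the script
msg <- tryCatch(
  {
    if (fraction <= 1) "ok" else "too big"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Guarded version: isTRUE() turns NA into FALSE, so if() always gets TRUE or FALSE
guarded <- if (isTRUE(fraction <= 1)) "ok" else "not ok or missing"
print(guarded)
```

If the if() sits inside someone else's function, the guard is not yours to add, and the practical step is to check your inputs for NA values before passing them in.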
Avi
-----Original Message-----
From: javed khan <javedbtk111 using gmail.com>
To: Bert Gunter <bgunter.4567 using gmail.com>
Cc: R-help <r-help using r-project.org>
Sent: Thu, Jan 27, 2022 1:15 pm
Subject: Re: [R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed

Thank you Bert Gunter

Do you mean I should do something like this:

prot <- as.numeric(ifelse(test$operator == 'T13', 1, 0))




