[R] return value of {....}

@vi@e@gross m@iii@g oii gm@ii@com
Mon Jan 16 04:53:17 CET 2023


Again, John, we are comparing different designs in languages that are often
decades old and partially retrofitted selectively over the years.

Is it poor form to use global variables? Many think so. Discussions have
been had on how to use variables hidden in various ways that are not global,
such as within a package.

But note R still has superassignment operators like <<- and its partner
->> that can even create a global variable that did not exist before the
function began and that persists for the rest of the session. They search
the enclosing environments for an existing binding and fall back to the
global environment only when none is found. This is perhaps a special case
of the assign() function, which can do the same for any designated
environment.
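
For what it is worth, here is a minimal sketch of both mechanisms. The names
demo_global_flag, set_flag and store are made up for illustration, and the
sketch assumes nothing on your search path already defines demo_global_flag.

```r
# Hypothetical names throughout; start clean so the example is repeatable.
if (exists("demo_global_flag", envir = globalenv(), inherits = FALSE)) {
  rm("demo_global_flag", envir = globalenv())
}

set_flag <- function() {
  # No local or enclosing 'demo_global_flag' exists, so <<- creates the
  # binding in the global environment, where it outlives this call.
  demo_global_flag <<- "created inside set_flag()"
  invisible(NULL)
}

set_flag()
print(exists("demo_global_flag", envir = globalenv(), inherits = FALSE))  # TRUE

# assign() is the general form: you name the target environment explicitly.
store <- new.env()
assign("demo_global_flag", "kept in store", envir = store)
print(get("demo_global_flag", envir = store))        # "kept in store"
print(get("demo_global_flag", envir = globalenv()))  # "created inside set_flag()"
```

The two bindings share a name but live in different environments, so neither
disturbs the other.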

Although it may sometimes be better form to avoid things like this, it can
also be worse form when you want to do something decentralized with no
control over passing additional arguments to all kinds of functions.  

Some languages try to finesse their way past this by creating concepts like
closures that hold values and can even change the values without them being
globally visible. Some may use singleton objects, or variables that belong
to a class rather than to any one instance (which again amounts to a
singleton.)

So is what R allows really a bad thing, especially if rarely used?

All I know is that MANY languages use scoping this way: a function declares
a variable, creates one or more inner functions, and returns them to the
caller, who can later use those functions to access the variable and even
to communicate among the several functions that were created in that
incubator. Nifty stuff but arguably not always as easy to comprehend!
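
A small sketch of that incubator idea in R itself may help. The names
make_account, deposit and current are invented for this example and are not
from any package.

```r
make_account <- function(balance = 0) {
  # 'balance' lives in the environment created by this call. Both inner
  # functions share it, and nothing outside can reach it by name.
  deposit <- function(amount) {
    balance <<- balance + amount   # updates the enclosing variable, not a global
    invisible(balance)
  }
  current <- function() balance
  list(deposit = deposit, current = current)
}

acct <- make_account(100)
acct$deposit(50)
print(acct$current())    # 150

# A second call gets its own, independent 'balance'.
other <- make_account()
other$deposit(5)
print(other$current())   # 5
print(acct$current())    # still 150
```

Here <<- finds balance in the function's own enclosing environment, so no
global variable is created or touched.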

This forum is not intended for BASHING any language, certainly not R. There
are many other languages to choose from and every one of them will have some
things others consider flaws. How many opted out of say a ++ operator as in
y = x++ for fairly good reasons and later added something like the Walrus
operator so you can now write y = (x := x + 1) as a way to do the same thing
and other things besides?

But to address your point about a variable outside a function, as defined in
a set of environments to search that includes a global context, I want to
note that it is just a set of maskings. Your variable "x" can appear in
EVERY environment above you, and you can get in real trouble if the order in
which the environments are put in place changes in some way. The arguably
safer way to get a specific value of x is not to ask for it directly but to
use get("x", envir=...) and specify the environment, which ideally is in
existence at that time. The inherits argument to get() controls whether
enclosing environments are also searched, and the related get0() lets you
supply a default through ifnotfound.
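
As a rough sketch of asking for a value by environment rather than by bare
name (settings, lookup and the variable y are hypothetical names used only
here):

```r
# An environment we control, holding its own 'x'.
settings <- new.env()
assign("x", "from settings", envir = settings)

lookup <- function() {
  x <- "local to lookup()"
  list(
    plain    = x,                               # nearest binding masks the rest
    explicit = get("x", envir = settings),      # exactly the one we asked for
    has_y    = exists("y", envir = settings, inherits = FALSE),
    fallback = get0("y", envir = settings, inherits = FALSE,
                    ifnotfound = "no y here")   # default instead of an error
  )
}

res <- lookup()
str(res)
```

With inherits = FALSE only the named environment is consulted, so a y defined
somewhere else cannot sneak in.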

Is it then poor technique to re-use the same name "x" in the same code for
some independent use? Probably, albeit if the new value plays a similar
role, just in another stretch of code, maybe not. I would comment it
carefully, and spell that out.

S first came out in the mid to late 70's, before I even decided to become a
Computer Scientist, and evolved multiple times. I first noticed it at
Bell Labs in the 80's. To a certain extent, R started as heavily influenced
by S and many programs could run on both. It too has changed over about
three decades. What kind of perfection can anyone expect compared to more
recent languages carefully produced with little concern about backward
compatibility?

And it remains very useful, not necessarily because of the original language
or even the evolving core, but because of changes that maintained
compatibility as well as so many packages that met needs. Making changes,
even if "improvements", is likely to break all kinds of code unless it is
something like the new native pipe operator "|>", which uses syntax that
nobody could have used before.
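
For anyone who has not tried it, a tiny sketch of the native pipe, assuming
R 4.1.0 or later (the variable names are just for illustration):

```r
# x |> f() is parsed as f(x), so these two lines compute the same thing.
piped  <- c(1, 4, 9) |> sqrt() |> sum()
nested <- sum(sqrt(c(1, 4, 9)))

print(piped)                      # 6
print(identical(piped, nested))   # TRUE
```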

The answer to too many questions about R remains BECAUSE that is how it was
done and, whether you like it or not, it may not change much any time soon.
That is why so many people like packages such as those in the tidyverse,
because they manage to make some changes, for better and often for verse.



-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Sorkin, John
Sent: Sunday, January 15, 2023 8:08 PM
To: Richard O'Keefe <raoknz using gmail.com>; Valentin Petzel <valentin using petzel.at>
Cc: R help Mailing list <r-help using r-project.org>
Subject: Re: [R] return value of {....}

Richard,
I sent my prior email too quickly:

A slight addition to your code shows an important aspect of R, local vs.
global variables:

x <- 137
f <- function () {
       a <- x
       x <- 42
       b <- x
       list(a=a, b=b)
   }
 f()
print(x)

When run the program produces the following:

> x <- 137
> f <- function () {
+        a <- x
+        x <- 42
+        b <- x
+        list(a=a, b=b)
+    }
>  f()
$a
[1] 137

$b
[1] 42

> print(x)
[1] 137

The first x, a <- x, invokes an x variable that is GLOBAL. It is known both
inside and outside the function.
The second x, x <- 42, defines an x that is LOCAL to the function; it is not
known to the program that called the function. The LOCAL value of x is used
in the expression  b <- x. As can be seen by the print(x) statement, the
LOCAL value of x is NOT known by the program that calls the function. The
class of a variable, its scoping (i.e. local vs. global), can be a source of
subtle programming errors. A general recommendation is to AVOID use of a
global variable in a function, i.e. don't use a variable in a function that is
not passed as a parameter to the function (as was done in the function above
in the statement a <- x). If you need to use a variable in a function that is
known by the program that calls the function, pass the variable as an
argument to the function, e.g.

Use this code:

# Set values needed by function
y <- 2
b <- 30

myfunction <- function(a,b){
cat("a=",a,"b=",b,"\n")
  y <- a
  y2 <- y+b
  cat("y=",y,"y2=",y2,"\n")
}
# Call the function and pass all needed values to the function
myfunction(y,b)
 
Don't use the following code that depends on a global value that is known to
the function, but not passed as a parameter to the function:

y <- 2
myNGfunction <- function(a){
  cat("a=",a,"b=",b,"\n")
  y <- a
  y2 <- y+b
  cat("y=",y,"y2=",y2,"\n")
}
# b is a global variable and will be known to the function,
# but should be passed as a parameter as in the example above.
b <- 100
myNGfunction(y)

John

________________________________________
From: R-help <r-help-bounces using r-project.org> on behalf of Sorkin, John
<jsorkin using som.umaryland.edu>
Sent: Sunday, January 15, 2023 7:40 PM
To: Richard O'Keefe; Valentin Petzel
Cc: R help Mailing list
Subject: Re: [R] return value of {....}

Richard,
A slight addition to your code shows an important aspect of R, local vs.
global variables:

x <- 137
f <- function () {
       a <- x
       x <- 42
       b <- x
       list(a=a, b=b)
   }
 f()
print(x)

________________________________________
From: R-help <r-help-bounces using r-project.org> on behalf of Richard O'Keefe
<raoknz using gmail.com>
Sent: Sunday, January 15, 2023 6:39 PM
To: Valentin Petzel
Cc: R help Mailing list
Subject: Re: [R] return value of {....}

I wonder if the real confusion is not R's scope rules?
(begin ...) is not Lisp, it's Scheme (a major Lisp dialect), and in Scheme,
(begin (define x ...) (define y ...) ...) declares variables x and y that
are local to the (begin ...) form, just like Algol 68.  That's weirdness 1.
Javascript had a similar weirdness, which the ECMAscript process eventually
addressed.  But the real weirdness in R is not just that the existence of
variables is indifferent to the presence of curly braces, it's that it's
*dynamic*.  In
f <- function (...) {
   ... use x ...
   x <- ...
   ... use x ...
}
the two occurrences of "use x" refer to DIFFERENT variables.
The first occurrence refers to the x that exists outside the function.  It
has to: the local variable does not exist yet.
The assignment *creates* the variable, so the second occurrence of "use x"
refers to the inner variable.
Here's an actual example.
> x <- 137
> f <- function () {
+     a <- x
+     x <- 42
+     b <- x
+     list(a=a, b=b)
+ }
> f()
$a
[1] 137
$b
[1] 42

Many years ago I set out to write a compiler for R, and this was the issue
that finally sank my attempt.  It's not whether the occurrence of "use x" is
*lexically* before the creation of x.
It's when the assignment is *executed* that makes the difference.
Different paths of execution through a function may result in it arriving at
its return point with different sets of local variables.
R is the only language I routinely use that does this.

So rule 1: whether an identifier in an R function refers to an outer
variable or a local variable depends on whether an assignment creating that
local variable has been executed yet.
And rule 2: the scope of a local variable is the whole function.

If the following transcript not only makes sense to you, but is exactly what
you expect, congratulations, you understand local variables in R.

> x <- 0
> g <- function () {
+     n <- 10
+     r <- numeric(n)
+     for (i in 1:n) {
+         if (i == 6) x <- 100
+         r[i] <- x + i
+     }
+     r
+ }
> g()
 [1]   1   2   3   4   5 106 107 108 109 110


On Fri, 13 Jan 2023 at 23:28, Valentin Petzel <valentin using petzel.at> wrote:

> Hello Akshay,
>
> R is quite inspired by LISP, where this is a common thing. It is not
> in fact that {...} returns something; rather, any expression
> evaluates to some value, and for a compound statement that is the
> last evaluated expression.
>
> {...} might be seen as similar to LISP's (begin ...).
>
> Now this is a very different thing compared to {...} in something like
> C, even if it looks or behaves similarly. But in R {...} is in fact an
> expression and thus has to evaluate to some value. This also comes with
> some nice benefits.
>
> You do not need to use {...} for anything that is a single statement. 
> But you can in each possible place use {...} to turn multiple 
> statements into one.
>
> Now think about a statement like this
>
> f <- function(n) {
> x <- runif(n)
> x**2
> }
>
> Then we can do
>
> y <- f(10)
>
> Now, your suggested way would look like this:
>
> f <- function(n) {
> x <- runif(n)
> y <- x**2
> }
>
> And we'd need to do something like:
>
> f(10)
> y <- somehow_get_last_env_of_f$y
>
> So having a compound statement evaluate to a value clearly has a benefit.
>
> Best Regards,
> Valentin
>
> 09.01.2023 18:05:58 akshay kulkarni <akshay_e4 using hotmail.com>:
>
> > Dear Valentin,
> >                           But why should {....} "return" a value? It
> could just as well evaluate all the expressions and store the 
> resulting objects in whatever environment the interpreter chooses, and 
> then it would be left to the user to manipulate any object he chooses. 
> Don't you think returning the last, or any value, is redundant? We are 
> living in the 21st century world, and the R-core team might,I suppose, 
> have a definite reason for"returning" the last value. Any comments?
> >
> > Thanking you,
> > Yours sincerely,
> > AKSHAY M KULKARNI
> >
> > ----------------------------------------
> > *From:* Valentin Petzel <valentin using petzel.at>
> > *Sent:* Monday, January 9, 2023 9:18 PM
> > *To:* akshay kulkarni <akshay_e4 using hotmail.com>
> > *Cc:* R help Mailing list <r-help using r-project.org>
> > *Subject:* Re: [R] return value of {....}
> >
> > Hello Akshay,
> >
> > I think you are confusing {...} with local({...}). This one will
> evaluate the expression in a separate environment, returning the last 
> expression.
> >
> > {...} simply evaluates multiple expressions as one and returns the
> result of the last line, but it still evaluates each expression.
> >
> > Assignment returns the assigned value, so we can chain assignments 
> > like
> this
> >
> > a <- 1 + (b <- 2)
> >
> > conveniently.
> >
> > So when is {...} useful? Well, anyplace where you want to execute
> complex stuff in a function argument. E.g. you might do:
> >
> > data %>% group_by(x) %>% summarise(y = {if(x[1] > 10) sum(y) else
> mean(y)})
> >
> > Regards,
> > Valentin Petzel
> >
> > 09.01.2023 15:47:53 akshay kulkarni <akshay_e4 using hotmail.com>:
> >
> >> Dear members,
> >>                              I have the following code:
> >>
> >>> TB <- {x <- 3;y <- 5}
> >>> TB
> >> [1] 5
> >>
> >> It is consistent with the documentation: For {, the result of the 
> >> last
> expression evaluated. This has the visibility of the last evaluation.
> >>
> >> But both x AND y are created, but the "return value" is y. How can 
> >> this
> be advantageous for solving practical problems? Specifically, consider 
> the following code:
> >>
> >> F <- function(X) {  expr; expr2; { expr5; expr7}; expr8;expr10}
> >>
> >> Both expr5 and expr7 are created, and are accessible by the code
> >> outside of the nested braces, right? But the "return value" of the
> >> nested braces is expr7. So doesn't this mean that only expr7 should be
> >> accessible? Please help me disentangle this (of course the return value
> >> of F is expr10, and all the other objects created by the preceding
> >> expressions are deleted. But expr5 is not, after the control passes
> >> outside of the nested braces!)
> >>
> >> Thanking you,
> >> Yours sincerely,
> >> AKSHAY M KULKARNI

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


