[R] print and lapply....

Rui Barradas ruipbarradas at sapo.pt
Mon Nov 7 21:50:03 CET 2022


At 19:22 on 07/11/2022, akshay kulkarni wrote:
> Dear Rui,
> Thanks for your reply. The point is that the loop is scraping code, and in your examples you have assumed that the body acts on i, the loop variable. Can you adapt your code to JUST PRINT the loop variable i?
> 
> By the way, I think I have stumbled upon the answer: lapply() caches the results and prints the output of the function in question immediately after printing the final i. The i's get printed serially, as the function progresses.
> 
>> lapply(1:4,function(x){print(x);Sys.sleep(x^2);x^2})
> [1] 1
> [1] 2
> [1] 3
> [1] 4
> [[1]]
> [1] 1
> 
> [[2]]
> [1] 4
> 
> [[3]]
> [1] 9
> 
> [[4]]
> [1] 16
> 
> Here the x^2 values print only after 4 is printed on the console.
> 
> Thanks anyway for your reply.
> 
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI
> ________________________________
> From: Rui Barradas <ruipbarradas using sapo.pt>
> Sent: Tuesday, November 8, 2022 12:24 AM
> To: akshay kulkarni <akshay_e4 using hotmail.com>; R help Mailing list <r-help using r-project.org>
> Subject: Re: [R] print and lapply....
> 
> At 18:33 on 07/11/2022, akshay kulkarni wrote:
>> Dear Rui,
>> Actually, I am replacing a big for loop with the lapply() function, and I want to report the progress:
>>
>> lapply(TP, function(i) { BODY; print(i)})
>>
>> Can you please adjust your solution in this light?
>>
>> Thanking you,
>> Yours sincerely,
>> AKSHAY M KULKARNI
>> ________________________________
>> From: Rui Barradas <ruipbarradas using sapo.pt>
>> Sent: Monday, November 7, 2022 11:59 PM
>> To: akshay kulkarni <akshay_e4 using hotmail.com>; R help Mailing list <r-help using r-project.org>
>> Subject: Re: [R] print and lapply....
>>
>> At 17:17 on 07/11/2022, akshay kulkarni wrote:
>>> Dear members,
>>> I have the following code and output:
>>>
>>>> TP <- 1:4
>>>> lapply(TP,function(x){print(x);x^2})
>>> [1] 1
>>> [1] 2
>>> [1] 3
>>> [1] 4
>>> [[1]]
>>> [1] 1
>>>
>>> [[2]]
>>> [1] 4
>>>
>>> [[3]]
>>> [1] 9
>>>
>>> [[4]]
>>> [1] 16
>>>
>>> How do I make the print function output x along with x^2, i.e. not at the beginning but before each x^2?
>>>
>>> Many thanks in advance.
>>>
>>> Thanking you,
>>> Yours sincerely
>>> AKSHAY M KULKARNI
>>>
>> Hello,
>>
>> Here are two options, with ?cat and with ?message.
>>
>>
>> TP <- 1:4
>> lapply(TP, function(x){
>>      cat("x =", x, "x^2 =", x^2, "\n")
>> })
>>
>> lapply(TP, function(x){
>>      msg <- paste("x =", x, "x^2 =", x^2)
>>      message(msg)
>> })
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>>
> Hello,
> 
> 
> What do you want the lapply loop to return? If you have a BODY doing
> computations, do you want the lapply to return those values and report
> the progress?
> 
> I have chosen cat or message over print because of what each returns:
> 
>    - cat returns NULL invisibly,
>    - message also returns NULL invisibly (its last call is invisible()),
>    - print returns a value, the object it prints.
> 
> Can you adapt the code below to your use case?
> 
> 
> 
> TP <- 1:4
> lapply(TP, function(x, verbose = TRUE){
>     # BODY
>     y <- rnorm(100, mean = x)
> 
>     # show progress
>     if(verbose)
>       cat("x =", x, "x^2 =", x^2, "\n")
> 
>     #return value
>     c(x = x, mean = mean(y))
> })
> 
> lapply(TP, function(x, verbose = TRUE){
>     # BODY
>     y <- rnorm(100, mean = x)
> 
>     # show progress
>     if(verbose) {
>       msg <- paste("x =", x, "x^2 =", x^2)
>       message(msg)
>     }
> 
>     #return value
>     c(x = x, mean = mean(y))
> })
> 
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
Hello,

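No, the x^2 are not printed by the function after the i's. The x^2 are 
the function's return values: the function prints each i, then returns 
x^2. lapply collects those return values in a list, and R auto-prints 
that list at the console once lapply has finished, because the result 
is not assigned to anything.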
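
A minimal sketch of this point, assuming the same TP as in your first 
message. The names res and the use of invisible() are only for 
illustration:

```r
TP <- 1:4

# Assigning the result: the four print() calls still show up while the
# loop runs, but the list of return values is not auto-printed.
res <- lapply(TP, function(x) { print(x); x^2 })

# Typing the name of the object prints the list of x^2 values.
res

# At top level, invisible() also suppresses the auto-printing.
invisible(lapply(TP, function(x) { print(x); x^2 }))
```

So the list at the end of your output is the console printing the value 
of the whole lapply call, not something the function itself prints.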

As for your problem, it is now clearer.
I would write a function that accepts a URL and takes care of the 
scraping, and call it in the lapply loop. The progress report can go in 
the loop, like below.

This is a complete working example, scraping the Wikipedia list of 
countries by GDP. The URLs are in a list (the same URL twice, to keep 
things simple). In a real scraping function I would wrap the scraping 
code in tryCatch, just in case.
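
A rough sketch of what such a tryCatch wrapper could look like. The 
names safe_scrape and fake_scraper and the example.org URLs are 
hypothetical, made up for illustration; fake_scraper stands in for a 
real scraping function so that the code runs without network access:

```r
# Hypothetical helper: wraps any scraping function so that one failing
# URL does not abort the whole lapply loop.
safe_scrape <- function(url, scraper) {
  tryCatch(
    scraper(url),
    error = function(e) {
      # report the failure and carry on
      message("failed: ", url, " (", conditionMessage(e), ")")
      NULL  # placeholder result for the failed URL
    }
  )
}

# Stand-in scraper for illustration only; it fails on one input.
fake_scraper <- function(url) {
  if (grepl("bad", url)) stop("could not read page")
  data.frame(url = url, ok = TRUE)
}

urls <- list("https://example.org/good", "https://example.org/bad")
results <- lapply(urls, safe_scrape, scraper = fake_scraper)

# Drop the failed entries before further processing.
results <- Filter(Negate(is.null), results)
```

Returning NULL on error is only one possible choice; returning the 
condition object itself would keep the error message for later 
inspection.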

The example has three parts: first the function, then the list of URLs, 
then the lapply loop.



library(rvest)

scrape <- function(url) {
   page <- read_html(url)
   gdp <- page |>
     html_element(".wikitable") |>
     html_table() |>
     as.data.frame()
   names(gdp) <- unlist(gdp[1,, drop = TRUE])
   gdp <- gdp[-1,]
   row.names(gdp) <- NULL

   #return value
   gdp
}

wiki_gdp_url <- 
"https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)"
urls_list <- list(wiki_gdp_url, wiki_gdp_url)
TP <- seq_along(urls_list)

TP
# [1] 1 2

df_list <- lapply(TP, \(i) {
   URL <- urls_list[[i]]
   data <- scrape(URL)
   # show progress
   message("iteration: ", i)
   #return value
   data
})

str(df_list)
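
One detail about the order of the statements in the loop, shown as a 
small sketch with made-up computations in place of the scraping. Since 
message returns NULL invisibly, the value to keep must come after the 
progress report, as the last expression:

```r
TP <- 1:4

# message() as the last expression: every element of the result is NULL.
lost <- lapply(TP, function(i) {
  y <- i^2
  message("iteration: ", i)
})

# Return value after the progress report: the results are kept.
kept <- lapply(TP, function(i) {
  y <- i^2
  message("iteration: ", i)
  y
})

# message() writes to stderr, so the progress report can be silenced
# without touching the function body.
quiet <- suppressMessages(lapply(TP, function(i) {
  message("iteration: ", i)
  i^2
}))
```

With print(i) as the last expression, as in lapply(TP, function(i) { 
BODY; print(i)}), the loop would return the i's instead of the results 
of BODY.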


Hope this helps,

Rui Barradas


