[R] [External] Re: read.csv fails in R console in Ubuntu terminal but works in RStudio after R 3.6.3 upgrade to R 4.0.2?

luke-tierney using uiowa.edu
Fri Jul 17 03:59:26 CEST 2020


On my Ubuntu system the download with read.csv succeeds in an R
console if I set the HTTPUserAgent and download.file.method options to
match the ones used by RStudio.
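
A minimal sketch of that approach follows, under stated assumptions. The
helper name read_csv_with_options and the placeholder agent string are
hypothetical; the real values are whatever getOption("HTTPUserAgent") and
getOption("download.file.method") report inside an RStudio session, and
"libcurl" as the default method is only a guess.

```r
# Hypothetical helper: temporarily apply RStudio-like download options,
# read the CSV, then restore whatever the session had before.
read_csv_with_options <- function(file, user_agent, method = "libcurl", ...) {
  # options() returns the previous values of the options it changes
  old <- options(HTTPUserAgent = user_agent, download.file.method = method)
  on.exit(options(old), add = TRUE)  # restore on normal exit or on error
  read.csv(file, ...)
}

# Placeholder (assumption): replace with the string that an RStudio session
# prints for getOption("HTTPUserAgent").
rstudio_agent <- "<HTTPUserAgent value copied from RStudio>"

# Usage against the URL from this thread (needs network access):
# x <- "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
# companies <- read_csv_with_options(x, rstudio_agent,
#                                    as.is = TRUE, na.strings = "n/a")
```

Restoring the options with on.exit() keeps the change local to the one
call, so the rest of the session keeps R's own defaults.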

Given how picky the server is being I would worry about whether this
use is in line with the site's terms of service.

Best,

luke

On Thu, 16 Jul 2020, Ista Zahn wrote:

> On Thu, Jul 16, 2020 at 5:15 PM Ista Zahn <istazahn using gmail.com> wrote:
>>
>> On Thu, Jul 16, 2020 at 8:18 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>
>>> Hello,
>>>
>>> Thanks, but no, download.file still gives 403 Forbidden with both method
>>> = "libcurl" and method = "wget".
>>
>> I think that makes it "not an R question". Ask on
>> https://unix.stackexchange.com/ maybe?
>
> Oh, sorry I misread your message. Nevertheless:
>
> $ curl "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
> <HTML><HEAD>
> <TITLE>Access Denied</TITLE>
> </HEAD><BODY>
> <H1>Access Denied</H1>
>
> You don't have permission to access
> "http://old.nasdaq.com/screening/companies-by-name.aspx?"
> on this server.<P>
> Reference #18.5506d217.1594934303.938edcb
> </BODY>
> </HTML>
>
> $ wget "https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
> --2020-07-16 17:19:12--
> https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download
> Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
> Resolving old.nasdaq.com (old.nasdaq.com)... 2600:1400:9000:28f::1b46,
> 2600:1400:9000:29b::1b46, 23.78.161.120
> Connecting to old.nasdaq.com
> (old.nasdaq.com)|2600:1400:9000:28f::1b46|:443... connected.
> HTTP request sent, awaiting response... 403 Forbidden
> 2020-07-16 17:19:12 ERROR 403: Forbidden.
>
> I don't think this is an R problem.
>
> Best,
> Ista
>
>>
>> Best,
>> Ista
>>
>>>
>>> Rui Barradas
>>>
>>> At 05:31 on 16/07/20, Jeff Newmiller wrote:
>>>> Perhaps read FAQ 7.43? [1]
>>>>
>>>> [1] https://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-enable-secure-https-downloads-in-R_003f
>>>>
>>>> On July 15, 2020 4:02:27 PM PDT, Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>>> Hello,
>>>>>
>>>>> R 4.0.2 on Ubuntu 20.04 LTS, sessionInfo below.
>>>>>
>>>>> I'm also unable to read the file with Rscript from the Ubuntu terminal
>>>>> but the error is not the same as the OP's.
>>>>>
>>>>>
>>>>> The first try was a file test1.R with the following commands.
>>>>>
>>>>> x<-"https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
>>>>> read.csv(x, as.is=TRUE, na="n/a")
>>>>>
>>>>>
>>>>> And run with Rscript
>>>>>
>>>>> rui using rui:~$ Rscript --vanilla test1.R
>>>>> Error in file(file, "rt") :
>>>>>    cannot open the connection to
>>>>> 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
>>>>> Calls: read.csv -> read.table -> file
>>>>> In addition: Warning message:
>>>>> In file(file, "rt") :
>>>>>    cannot open URL
>>>>> 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download':
>>>>>
>>>>> HTTP status was '403 Forbidden'
>>>>> Execution halted
>>>>>
>>>>>
>>>>>
>>>>> The second try was download.file() and then read it.
>>>>> File test2.R is:
>>>>>
>>>>> x<-"https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download"
>>>>> download.file(x, "companylist.csv")
>>>>> read.csv("companylist.csv", as.is=TRUE, na="n/a")
>>>>>
>>>>>
>>>>> But this too failed with error 403 Forbidden.
>>>>>
>>>>> rui using rui:~$ Rscript --vanilla test2.R
>>>>> trying URL
>>>>> 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
>>>>> Error in download.file(x, "companylist.csv") :
>>>>>    cannot open URL
>>>>> 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
>>>>> In addition: Warning message:
>>>>> In download.file(x, "companylist.csv") :
>>>>>    cannot open URL
>>>>> 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download':
>>>>>
>>>>> HTTP status was '403 Forbidden'
>>>>> Execution halted
>>>>>
>>>>>
>>>>> This is my session info.
>>>>>
>>>>> rui using rui:~$ Rscript --vanilla -e 'sessionInfo()'
>>>>> R version 4.0.2 (2020-06-22)
>>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>>> Running under: Ubuntu 20.04 LTS
>>>>>
>>>>> Matrix products: default
>>>>> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
>>>>> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
>>>>>
>>>>> locale:
>>>>>   [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
>>>>>   [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
>>>>>   [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
>>>>>   [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
>>>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] compiler_4.0.2
>>>>>
>>>>>
>>>>>
>>>>> At 08:45 on 15/07/20, Sam H wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to download some data using read.csv and it works perfectly
>>>>>> in RStudio and fails in the R console in the terminal in Ubuntu 18.04
>>>>>> after upgrading from R 3.6.3 to 4.0.2. Before upgrading this worked in
>>>>>> the R console in the terminal also without any issues.
>>>>>>
>>>>>> Why would that be? How to fix this?
>>>>>>
>>>>>> Below please find R code output and sessionInfo().
>>>>>>
>>>>>> *Works in RStudio*
>>>>>>
>>>>>> > read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download",
>>>>>>     header=TRUE, as.is=TRUE, na="n/a")
>>>>>>   Symbol                                   Name LastSale MarketCap IPOyear
>>>>>> 1    TXG                     10x Genomics, Inc.  87.4400     $8.6B    2019
>>>>>> 2     YI                              111, Inc.   6.4800  $533.69M    2018
>>>>>> 3    PIH 1347 Property Insurance Holdings, Inc.   4.5350   $27.52M    2014
>>>>>> > sessionInfo()
>>>>>> R version 4.0.2 (2020-06-22)
>>>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>> Running under: Ubuntu 18.04.4 LTS
>>>>>>
>>>>>> Matrix products: default
>>>>>> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
>>>>>> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
>>>>>>
>>>>>> locale:
>>>>>>    [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>>>    [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>>>    [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>>>    [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>>>    [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>>   [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>> [1] compiler_4.0.2 tools_4.0.2
>>>>>>
>>>>>> *Fails in R console in terminal*
>>>>>>
>>>>>> > read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download",
>>>>>>     header=TRUE, as.is=TRUE, na="n/a")
>>>>>> Error in file(file, "rt") :
>>>>>>   cannot open the connection to
>>>>>>   'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download'
>>>>>> In addition: Warning message:
>>>>>> In file(file, "rt") :
>>>>>>   URL
>>>>>>   'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download':
>>>>>>   status was 'Failure when receiving data from the peer'
>>>>>> > traceback()
>>>>>> 3: file(file, "rt")
>>>>>> 2: read.table(file = file, header = header, sep = sep, quote = quote,
>>>>>>        dec = dec, fill = fill, comment.char = comment.char, ...)
>>>>>> 1: read.csv("https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=1&render=download",
>>>>>>        header = TRUE, as.is = TRUE, na = "n/a")
>>>>>> > sessionInfo()
>>>>>> R version 4.0.2 (2020-06-22)
>>>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>> Running under: Ubuntu 18.04.4 LTS
>>>>>>
>>>>>> Matrix products: default
>>>>>> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
>>>>>> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
>>>>>>
>>>>>> locale:
>>>>>>    [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>>>    [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>>>    [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>>>    [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>>>    [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>>   [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>> [1] compiler_4.0.2
>>>>>>
>>>>>> I also asked this question here:
>>>>>> https://stackoverflow.com/questions/62898008/why-read-csv-fails-in-r-console-in-ubuntu-terminal-but-works-in-rstudio-after-r
>>>>>> Since there was no answer on Stack Overflow I sent this question also to
>>>>>> this list.
>>>>>>
>>>>>> Best regards,
>>>>>> Sam
>>>>>>
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>
>>>>
>>>
>
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

