[R] read.*: How to read from a URL?
murdoch at stats.uwo.ca
Thu Dec 11 00:31:11 CET 2008
On 10/12/2008 5:41 PM, Prof Brian Ripley wrote:
> On Wed, 10 Dec 2008, hadley wickham wrote:
>> Hi Michael,
>> In general, I think you should be able to do:
>> gimage <- read.jpeg(url(gimageloc))
> Note that would not be really correct: it would need to be
> gimage <- read.jpeg(con <- url(gimageloc))
> close(con)
> since it otherwise leaks a connection (which would eventually be closed).
I think one of the reasons connections are underused is because of
uncertainty about things like this. The docs are unclear about when
automatic cleanup occurs. Does "eventually" mean at the next garbage
collection time? If our 128 entry connection table is full, and someone
tries to open a new one, will there be a forced gc, to free up any
unused ones?
If the answer to both questions is "yes", then the explicit close is
more efficient than relying on the automatic one, but not strictly
necessary, and it would certainly be more convenient than having to
remember when the close() is needed.
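One way to sidestep the uncertainty is to make the close explicit but
automatic. The following is only a minimal sketch: with_closed() is a
hypothetical helper name, not part of base R, and the temporary file
stands in for a real connection such as url(gimageloc).

```r
# Hypothetical helper (not in base R): apply a reader to a connection
# and close the connection when done, even if the reader fails.
with_closed <- function(con, reader, ...) {
  on.exit(close(con), add = TRUE)
  reader(con, ...)
}

# Demo on a local temporary file so the sketch runs offline.
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), tmp)

before <- nrow(showConnections())
txt <- with_closed(file(tmp), readLines)
after <- nrow(showConnections())

print(txt)
print(after == before)  # TRUE when no connection was left behind
unlink(tmp)
```

Because the close happens in on.exit(), the caller does not have to
remember it, and the result does not depend on when garbage collection
runs.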
> However, from ?read.jpeg
> filename: filename of JPEG image
> so it does not accept a connection (and the source code will confirm that).
> In fact virtually all functions that accept a 'file name or connection'
> will work with URLs, as file() accepts URLs as well as file names (see
> The issue is that writers of third-party readers should be encouraged to
> support connections (which have been around for ca 7 years in R).
> It is amazing how people take such innovations for granted.
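For writers of such readers, supporting connections can be quite
cheap. Below is a rough sketch of the "file name or connection"
idiom, in the style used by read.table(). The function name
read_first_bytes() and its n argument are made up for illustration;
a real image reader would go on to decode the data.

```r
# Hypothetical reader: return the first n bytes of a file name, URL,
# or connection.
read_first_bytes <- function(file, n = 2L) {
  if (is.character(file)) {
    # file() accepts URLs as well as file names
    file <- file(file, open = "rb")
    on.exit(close(file), add = TRUE)
  } else if (!inherits(file, "connection")) {
    stop("'file' must be a file name, a URL, or a connection")
  } else if (!isOpen(file)) {
    # We opened it, so we close it again on exit
    open(file, open = "rb")
    on.exit(close(file), add = TRUE)
  }
  readBin(file, what = "raw", n = n)
}

# Demo on a local temporary file so the sketch runs offline.
tmp <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0xff, 0xd8, 0xff, 0xe0)), tmp)  # JPEG files start with ff d8

by_name <- read_first_bytes(tmp)

con <- file(tmp, open = "rb")
by_con <- read_first_bytes(con)
close(con)  # the caller opened it, so the caller closes it
unlink(tmp)
```

The design choice here is that the function only closes connections
it opened itself; a connection that arrives already open is left
open for the caller.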
>> or alternatively use the EBImage from bioconductor which will read
>> from a url automatically (it also opens a much wider range of file
>> img <- readImage(gimageloc, TrueColor)
>> On Wed, Dec 10, 2008 at 1:17 PM, Michael Friendly <friendly at yorku.ca> wrote:
>>> The question is how to use a URL in place of a file= argument for
>>> read.* functions that do not support this internally.
>>> e.g., utils::read.table() and her family all support a file= argument that
>>> can take a URL equally well as a local file. So, if I have a file on the
>>> web, I can equally well do
>>>> langren <- read.csv("langrens.csv", header=TRUE)
>>>> langren <-
>>> where the latter is more convenient for posts to this list or distributed
>>> rimage::read.jpeg() doesn't support URLs, and the only way I've found is to
>>> download the image file from a URL to a temp file, in several steps.
>>> This is probably a more general problem than just read.jpeg, so maybe
>>> there is a general idiom for this case, or better, other read.* functions
>>> could be encouraged to support URLs.
>>>> # local file: OK
>>>> gimage <-
>>>> gimageloc <-
>>>> dest <- paste(tempfile(),'.jpg', sep='')
>>>> download.file(gimageloc, dest, mode="wb")
>>> trying URL
>>> Content type 'image/jpeg' length 35349 bytes (34 Kb)
>>> opened URL
>>> downloaded 34 Kb
>>>  "C:\\DOCUME~1\\default\\LOCALS~1\\Temp\\Rtmp9nNTdV\\file5f906952.jpg"
>>>> # Is there something simpler??
>>>> gimage <- read.jpeg(dest)
>>>> # I thought file() might work, but evidently not.
>>>> gimage <- read.jpeg(file(gimageloc))
>>> Error in read.jpeg(file(gimageloc)) : Can't open file.
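For readers that only take a file name, the download steps above can
be folded into one call. This is a sketch under stated assumptions,
not an established idiom: read_via_tempfile() is a hypothetical name,
it relies on tempfile(fileext = ) from current versions of R, and it
assumes the reader finishes reading before it returns, since the
temporary file is deleted on exit.

```r
# Hypothetical wrapper: download 'location' to a temporary file, pass
# the temporary path to 'reader', and delete the file afterwards.
read_via_tempfile <- function(location, reader, ..., fileext = "") {
  dest <- tempfile(fileext = fileext)
  on.exit(unlink(dest), add = TRUE)
  status <- download.file(location, dest, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed for ", location)
  reader(dest, ...)
}

# Usage with the names from this thread (needs the network and rimage):
# gimage <- read_via_tempfile(gimageloc, read.jpeg, fileext = ".jpg")
```

mode = "wb" matters on Windows for binary files such as JPEGs, as in
the original download.file() call.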
>>> Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology
>>> York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
>>> 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
>>> Toronto, ONT M3J 1P3 CANADA
>>> R-help at r-project.org mailing list
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.