[R] read.*: How to read from a URL?

Duncan Murdoch murdoch at stats.uwo.ca
Thu Dec 11 00:31:11 CET 2008


On 10/12/2008 5:41 PM, Prof Brian Ripley wrote:
> On Wed, 10 Dec 2008, hadley wickham wrote:
> 
>> Hi Michael,
>>
>> In general, I think you should be able to do:
>>
>> gimage <- read.jpeg(url(gimageloc))
> 
> Note that would not be really correct: it would need to be
> 
> gimage <- read.jpeg(con <- url(gimageloc))
> close(con)
> 
> since it otherwise leaks a connection (which would eventually be closed).

I think one of the reasons connections are underused is uncertainty 
about things like this.  The docs are unclear about when automatic 
cleanup occurs.  Does "eventually" mean at the next garbage collection?  
If our 128-entry connection table is full and someone tries to open a 
new connection, will there be a forced gc to free up any unreferenced 
ones?

If the answer to both questions is "yes", then the explicit close is 
more efficient than relying on the automatic one, but not strictly 
necessary, and relying on automatic cleanup would certainly be more 
convenient than having to remember when the close() is needed.
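
For anyone who would rather not depend on the answer, here is a minimal 
sketch of the explicit-close pattern wrapped in a function, so the caller 
does not have to remember the close().  The helper name read_lines_closed 
is hypothetical, and a local temporary file stands in for a URL so the 
example runs offline; file() is assumed to be given a path it can open in 
read mode.

```r
# Hypothetical helper: read all lines from a file name (or URL) and
# always close the connection, even if readLines() signals an error.
read_lines_closed <- function(path) {
  con <- file(path, open = "r")     # open explicitly so we own the connection
  on.exit(close(con), add = TRUE)   # runs when the function exits, error or not
  readLines(con)
}

# Offline stand-in for a remote resource.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a", "b"), tmp)

before <- nrow(showConnections())   # user connections open before the call
x      <- read_lines_closed(tmp)
after  <- nrow(showConnections())   # should match 'before': nothing leaked
```

Comparing the row counts from showConnections() before and after the call 
is one way to check that no connection was left open.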

Duncan Murdoch

> 
> However, from ?read.jpeg
> 
> Arguments:
> 
> filename: filename of JPEG image
> 
> so it does not accept a connection (and the source code will confirm that). 
> In fact virtually all functions that accept a 'file name or connection' 
> will work with URLs, as file() accepts URLs as well as file names (see 
> ?file).
> 
> The issue is that writers of third-party readers should be encouraged to 
> support connections (which have been around for ca 7 years in R).
> It is amazing how people take such innovations for granted.
> 
>> or alternatively use the EBImage from bioconductor which will read
>> from a url automatically (it also opens a much wider range of file
>> types)
>>
>> library(EBImage)
>> img <- readImage(gimageloc, TrueColor)
>>
>> Hadley
>>
>>
>> On Wed, Dec 10, 2008 at 1:17 PM, Michael Friendly <friendly at yorku.ca> wrote:
>>> The question is how to use a URL in place of a file= argument for
>>> read.* functions that do
>>> not support this internally.
>>>
>>> e.g., utils::read.table() and her family all support a file= argument that
>>> can take a URL
>>> equally well as a local file.  So, if I have a file on the web, I can
>>> equally well do
>>>
>>>> langren <- read.csv("langrens.csv", header=TRUE)
>>>> langren <-
>>>> read.csv("http://euclid.psych.yorku.ca/SCS/Gallery/Private/langrens.csv",
>>>> header=TRUE)
>>> where the latter is more convenient for posts to this list or distributed
>>> examples.
>>> rimage::read.jpeg() doesn't support URLs, and the only way I've found is to
>>> download the
>>> image file from a URL to a temp file, in several steps.
>>> This is probably a more general problem than just read.jpeg,
>>> so maybe there is a general idiom for this case, or better-- other read.*
>>> functions could
>>> be encouraged to support URLs.
>>>
>>>> library(rimage)
>>>> # local file: OK
>>>> gimage <-
>>>> read.jpeg("C:/Documents/milestone/images/vanLangren/google-toledo-rome3.jpg")
>>>>
>>>> gimageloc <-
>>>> "http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg"
>>>> dest <- paste(tempfile(),'.jpg', sep='')
>>>> download.file(gimageloc, dest, mode="wb")
>>> trying URL
>>> 'http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg'
>>> Content type 'image/jpeg' length 35349 bytes (34 Kb)
>>> opened URL
>>> downloaded 34 Kb
>>>
>>>> dest
>>> [1] "C:\\DOCUME~1\\default\\LOCALS~1\\Temp\\Rtmp9nNTdV\\file5f906952.jpg"
>>>> #  Is there something simpler??
>>>> gimage <- read.jpeg(dest)
>>>> #  I thought file() might work, but evidently not.
>>>> gimage <- read.jpeg(file(gimageloc))
>>> Error in read.jpeg(file(gimageloc)) : Can't open file.
>>> --
>>> Michael Friendly     Email: friendly AT yorku DOT ca Professor, Psychology
>>> Dept.
>>> York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
>>> 4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
>>> Toronto, ONT  M3J 1P3 CANADA
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>> -- 
>> http://had.co.nz/
>>
>>
>
