[R] read.*: How to read from a URL?

Martin Morgan mtmorgan at fhcrc.org
Thu Dec 11 00:08:25 CET 2008


Prof Brian Ripley wrote:
> On Wed, 10 Dec 2008, hadley wickham wrote:
> 
>> Hi Michael,
>>
>> In general, I think you should be able to do:
>>
>> gimage <- read.jpeg(url(gimageloc))
> 
> Note that would not be really correct: it would need to be
> 
> gimage <- read.jpeg(con <- url(gimageloc))
> close(con)
> 
> since it otherwise leaks a connection (which would eventually be closed).
> 
> However, from ?read.jpeg
> 
> Arguments:
> 
> filename: filename of JPEG image
> 
> so it does not accept a connection (and the source code will confirm 
> that). In fact virtually all functions that accept a 'file name or 
> connection' will work with URLs, as file() accepts URLs as well as file 
> names (see ?file).
> 
> The issue is that writers of third-party readers should be encouraged to 
> support connections (which have been around for ca 7 years in R).
> It is amazing how people take such innovations for granted.

Perhaps the discussion belongs on R-devel, but is there an example of a 
user-contributed package that uses R's connections, either for parsing a 
URL or, for instance, a compressed file?
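
To make the question concrete, here is a minimal sketch of what "supporting
connections" could look like in a reader. It is not taken from any package:
read_magic() is a hypothetical name, and the function only returns the first
few bytes of its input rather than decoding an image. It follows the "file
name or connection" convention described above, using only base R.

```r
# Hypothetical reader (not from rimage or any other package): returns the
# first 'n' bytes of 'file', which may be a file name / URL or a connection.
read_magic <- function(file, n = 4L) {
    if (is.character(file)) {
        # file() accepts URLs as well as file names (see ?file)
        con <- file(file, open = "rb")
        on.exit(close(con))          # we opened it, so we close it
    } else if (inherits(file, "connection")) {
        con <- file
        if (!isOpen(con)) {
            open(con, "rb")
            on.exit(close(con))      # only close what we opened ourselves
        }
    } else {
        stop("'file' must be a character string or a connection")
    }
    readBin(con, what = "raw", n = n)
}

# Usage with a local file, an unopened connection, and a gzip-compressed file
tmp <- tempfile()
writeBin(as.raw(c(1, 2, 3, 4, 5)), tmp)
plain_bytes <- read_magic(tmp)
con_bytes   <- read_magic(file(tmp))

gz <- tempfile(fileext = ".gz")
gzcon_out <- gzfile(gz, "wb")
writeBin(as.raw(c(9, 8, 7, 6)), gzcon_out)
close(gzcon_out)
gz_bytes <- read_magic(gzfile(gz))   # bytes are decompressed by gzfile()
```

Since the character case goes through file(), passing a URL string should in
principle work too, and a caller who passes an already open connection
remains responsible for closing it.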

Martin

> 
>> or alternatively use the EBImage from bioconductor which will read
>> from a url automatically (it also opens a much wider range of file
>> types)
>>
>> library(EBImage)
>> img <- readImage(gimageloc, TrueColor)
>>
>> Hadley
>>
>>
>> On Wed, Dec 10, 2008 at 1:17 PM, Michael Friendly <friendly at yorku.ca> 
>> wrote:
>>> The question is how to use a URL in place of a file= argument for
>>> read.* functions that do not support this internally.
>>>
>>> e.g., utils::read.table() and her family all support a file= argument
>>> that can take a URL equally well as a local file.  So, if I have a file
>>> on the web, I can equally well do
>>>
>>>> langren <- read.csv("langrens.csv", header=TRUE)
>>>> langren <- read.csv("http://euclid.psych.yorku.ca/SCS/Gallery/Private/langrens.csv",
>>>>                     header=TRUE)
>>>
>>> where the latter is more convenient for posts to this list or
>>> distributed examples.
>>> rimage::read.jpeg() doesn't support URLs, and the only way I've found
>>> is to download the image file from a URL to a temp file, in several steps.
>>> This is probably a more general problem than just read.jpeg,
>>> so maybe there is a general idiom for this case, or better, other
>>> read.* functions could be encouraged to support URLs.
>>>
>>>> library(rimage)
>>>> # local file: OK
>>>> gimage <- read.jpeg("C:/Documents/milestone/images/vanLangren/google-toledo-rome3.jpg")
>>>>
>>>> gimageloc <- "http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg"
>>>>
>>>> dest <- paste(tempfile(),'.jpg', sep='')
>>>> download.file(gimageloc, dest, mode="wb")
>>> trying URL 'http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg'
>>> Content type 'image/jpeg' length 35349 bytes (34 Kb)
>>> opened URL
>>> downloaded 34 Kb
>>>
>>>> dest
>>> [1] "C:\\DOCUME~1\\default\\LOCALS~1\\Temp\\Rtmp9nNTdV\\file5f906952.jpg"
>>>> #  Is there something simpler??
>>>> gimage <- read.jpeg(dest)
>>>
>>>> #  I thought file() might work, but evidently not.
>>>> gimage <- read.jpeg(file(gimageloc))
>>> Error in read.jpeg(file(gimageloc)) : Can't open file.
>>>>
>>>
>>> -- 
>>> Michael Friendly     Email: friendly AT yorku DOT ca Professor, 
>>> Psychology
>>> Dept.
>>> York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
>>> 4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
>>> Toronto, ONT  M3J 1P3 CANADA
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> -- 
>> http://had.co.nz/
>>
>>
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


