[R] read.*: How to read from a URL?

Martin Morgan mtmorgan at fhcrc.org
Thu Dec 11 00:45:11 CET 2008


Martin Morgan wrote:
> Prof Brian Ripley wrote:
>> On Wed, 10 Dec 2008, hadley wickham wrote:
>>
>>> Hi Michael,
>>>
>>> In general, I think you should be able to do:
>>>
>>> gimage <- read.jpeg(url(gimageloc))
>>
>> Note that would not be really correct: it would need to be
>>
>> gimage <- read.jpeg(con <- url(gimageloc))
>> close(con)
>>
>> since it otherwise leaks a connection (which would eventually be closed).
>>
>> However, from ?read.jpeg
>>
>> Arguments:
>>
>> filename: filename of JPEG image
>>
>> so it does not accept a connection (and the source code will confirm 
>> that). In fact virtually all functions that accept a 'file name or 
>> connection' will work with URLs, as file() accepts URLs as well as 
>> file names (see ?file).
>>
>> The issue is that writers of third-party readers should be encouraged 
>> to support connections (which have been around for ca 7 years in R).
>> It is amazing how people take such innovations for granted.
> 
> Perhaps the discussion belongs on R-devel, but is there an example of a 
> user-contributed package that uses R's connections, either for parsing a 
> URL or, for instance, a compressed file?

To clarify, I meant using a connection from the C level.
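
At the R level the pattern is short. A minimal sketch, assuming a
hypothetical reader read_magic() that is not part of any package: it
accepts either a file name (or URL) or a connection, and closes only
what it opened itself, in the style of the base readers.

```r
# Hypothetical reader (illustration only): return the first `n` bytes
# of a file, URL, or connection as a raw vector.
read_magic <- function(file, n = 2L) {
  if (is.character(file)) {
    # file() accepts URLs as well as file names (see ?file)
    file <- file(file, "rb")
    on.exit(close(file), add = TRUE)
  } else if (!inherits(file, "connection")) {
    stop("'file' must be a character string or a connection")
  } else if (!isOpen(file)) {
    # An unopened connection is opened here, so it is closed here too
    open(file, "rb")
    on.exit(close(file), add = TRUE)
  }
  readBin(file, what = "raw", n = n)
}
```

A connection that the caller has already opened is left open, so the
caller stays responsible for closing it.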

> Martin
> 
>>
>>> or alternatively use EBImage from Bioconductor, which will read
>>> from a URL automatically (it also opens a much wider range of file
>>> types)
>>>
>>> library(EBImage)
>>> img <- readImage(gimageloc, TrueColor)
>>>
>>> Hadley
>>>
>>>
>>> On Wed, Dec 10, 2008 at 1:17 PM, Michael Friendly <friendly at yorku.ca> wrote:
>>>> The question is how to use a URL in place of a file= argument for
>>>> read.* functions that do not support this internally.
>>>>
>>>> e.g., utils::read.table() and its family all support a file= argument
>>>> that can take a URL as well as a local file.  So, if I have a file on
>>>> the web, I can equally well do
>>>>
>>>>> langren <- read.csv("langrens.csv", header=TRUE)
>>>>> langren <-
>>>>> read.csv("http://euclid.psych.yorku.ca/SCS/Gallery/Private/langrens.csv",
>>>>>          header=TRUE)
>>>>
>>>> where the latter is more convenient for posts to this list or
>>>> distributed examples.
>>>> rimage::read.jpeg() doesn't support URLs, and the only way I've found
>>>> is to download the image file from a URL to a temp file, in several
>>>> steps. This is probably a more general problem than just read.jpeg,
>>>> so maybe there is a general idiom for this case, or better, other
>>>> read.* functions could be encouraged to support URLs.
>>>>
>>>>> library(rimage)
>>>>> # local file: OK
>>>>> gimage <-
>>>>> read.jpeg("C:/Documents/milestone/images/vanLangren/google-toledo-rome3.jpg")
>>>>>
>>>>> gimageloc <-
>>>>> "http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg"
>>>>>
>>>>> dest <- paste(tempfile(),'.jpg', sep='')
>>>>> download.file(gimageloc, dest, mode="wb")
>>>> trying URL
>>>> 'http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg' 
>>>>
>>>> Content type 'image/jpeg' length 35349 bytes (34 Kb)
>>>> opened URL
>>>> downloaded 34 Kb
>>>>
>>>>> dest
>>>> [1] "C:\\DOCUME~1\\default\\LOCALS~1\\Temp\\Rtmp9nNTdV\\file5f906952.jpg"
>>>>> #  Is there something simpler??
>>>>> gimage <- read.jpeg(dest)
>>>>
>>>>> #  I thought file() might work, but evidently not.
>>>>> gimage <- read.jpeg(file(gimageloc))
>>>> Error in read.jpeg(file(gimageloc)) : Can't open file.
>>>>>
>>>>
>>>> -- 
>>>> Michael Friendly     Email: friendly AT yorku DOT ca
>>>> Professor, Psychology Dept.
>>>> York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
>>>> 4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
>>>> Toronto, ONT  M3J 1P3 CANADA
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide 
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> -- 
>>> http://had.co.nz/
>>>
>>>
>>
> 
> 
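
For readers that only take a file name, the download-to-tempfile steps
from the original post can be wrapped once. A minimal sketch, assuming
a hypothetical helper read_via_tempfile() (not part of any package) and
a reader that finishes with the file before it returns:

```r
# Hypothetical helper (illustration only): fetch `src` to a temporary
# file, pass the local path to `reader`, and remove the file afterwards.
read_via_tempfile <- function(src, reader, ..., fileext = "") {
  dest <- tempfile(fileext = fileext)
  on.exit(unlink(dest), add = TRUE)
  # mode = "wb" keeps binary files such as JPEGs intact on Windows
  download.file(src, dest, mode = "wb", quiet = TRUE)
  reader(dest, ...)
}
```

With rimage loaded, the several steps in the quoted post would reduce to
read_via_tempfile(gimageloc, read.jpeg, fileext = ".jpg"); that call is
untested against the URL in the thread. The temporary file is removed on
exit, so a reader that returns a lazy handle to the file would not work
with this helper.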


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793


