[Rd] [R] HTTP User-Agent header

Robert Gentleman rgentlem at fhcrc.org
Fri Jul 28 21:52:27 CEST 2006


OK, that suggests that setting it at the options level would solve both 
of your problems, and that seems like the best approach. I don't really 
want to pass this around as a parameter through the maze of functions 
that might actually download something if we don't have to.
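
As a rough sketch of what an options-level setting could look like (not
what the eventual patch will necessarily do): the option name
"example.http.agent" and both helper functions below are hypothetical,
made up purely for illustration. The idea is that download code looks
the value up in one place, and a user can override it or switch it off
without any extra function arguments.

```r
# Hypothetical sketch: the option name and helper names are assumptions,
# not part of R or of Seth's patch.

# Build a default agent string from the running R version and platform.
default_agent <- function() {
  sprintf("R (%s.%s %s)",
          R.version$major, R.version$minor, R.version$platform)
}

# Return the header line to send, or character(0) to send no header.
# An unset option falls back to the default; an empty string turns the
# header off.
user_agent_header <- function() {
  agent <- getOption("example.http.agent", default_agent())
  if (length(agent) != 1L || !is.character(agent) || !nzchar(agent)) {
    return(character(0))
  }
  paste0("User-Agent: ", agent)
}
```

With this, options(example.http.agent = "lynx") would change the
identification for every download in the session, and
options(example.http.agent = "") would suppress the header for sites
that refuse it.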

I think we can provide something early next week on R-devel for folks to 
test. But I suspect, as Henrik does, that the set of sites that will 
refuse us with a User-Agent header is much larger than the set James 
has found that refuse us without one.

best wishes
   Robert


Henrik Bengtsson wrote:
> On 7/28/06, Robert Gentleman <rgentlem at fhcrc.org> wrote:
>> I wonder if it would not be better to make the user agent string
>> something that is configurable (at the time R is built) rather than at
>> run time. This would make Seth's patch about 1% as long. Or this could
>> be handled as an option. The patches are pretty extensive and allow for
>> setting the agent header by setting parameters in function calls (e.g.
>> download.file). I am not sure there is a good use case for that level
>> of flexibility and the additional code is substantial.
>>
>>
>> The issue that I think arises is that there are potentially other
>> systems that will be unhappy with R's identification of itself and so
>> some users may also need to turn it off.
>>
>> Any strong opinions?
> 
> Actually two:
> 
> 1) If you wish to pull down (read: extract from HTML or similar) live
> data from the web, you might want to be able to "imitate" a certain
> browser.  For instance, if you tell some webserver you're a simple
> "mobile phone" or "lynx", you might be able to get back very clean data.
> Some servers might also block unknown web browsers.
> 
> 2) If the webserver of a package repository decided to make use of
> the user-agent string to decide what version of the repository it
> should deliver, I would like to be able to trick the server.  Why?
> Many times I have found myself working on a system where I do not have
> the rights to update to the latest or the development version of R.
> However, even without the very latest version of R you can still do
> work.  For instance, in Bioconductor, biocLite() & co. give you
> either the stable or the development version of Bioconductor depending
> on your R version, but looking into the biocLite() code and beyond, you
> find that you actually can install a Bioconductor v1.9 package in R
> v2.3.1.  It can be risky business, but if you know what you're doing,
> it can save your day (or week).
> 
> Cheers
> 
> Henrik
> 
>>
>> James P. Howard, II wrote:
>>> On 7/28/06, Seth Falcon <sfalcon at fhcrc.org> wrote:
>>>
>>>> I have a rough draft patch, see below, that adds a User-Agent header
>>>> to HTTP requests made in R via download.file.  If there is interest, I
>>>> will polish it.
>>> It looks right, but I am running under Windows without a compiler.
>>>
>> --
>> Robert Gentleman, PhD
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M2-B876
>> PO Box 19024
>> Seattle, Washington 98109-1024
>> 206-667-7700
>> rgentlem at fhcrc.org
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>

