[R] Strange behaviour to download zip file using R

David Winsemius dwinsemius at comcast.net
Thu Aug 3 20:59:37 CEST 2017


> On Aug 3, 2017, at 11:38 AM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
> 
> Hi again,
> 
> I was trying to download stock market data from the link below:
> 
> https://www.nseindia.com/products/content/equities/equities/archieve_eq.htm
> 
> Input choice :
> 
> Select Report: Bhavcopy
> Date(DD-MM-YYYY): 03-03-2010
> 
> If you enter the inputs above manually, you get the option to
> download the file:
> 
> cm03MAR2010bhav.csv.zip
> 
> However, I then tried to use R to automate the download:
> 
>> download.file('https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip', 'aa.zip')
> 
> trying URL 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'
> Error in download.file("https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip",  :
>   cannot open URL 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'
> In addition: Warning message:
> In download.file("https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip",  :
>   cannot open URL 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip': HTTP status was '403 Forbidden'
> 
> Of course, if I place the direct link below
> 'https://www.nseindia.com/content/historical/EQUITIES/2010/MAR/cm03MAR2010bhav.csv.zip'
> in the address bar of Chrome, I am denied permission.
> 
> Do you have any idea what is going on here?

Yes. They are trying to prevent you from accessing their files in a way that violates their terms of service. Item #12 in their TOS:

	• You may not conduct any systematic or automated data collection activities (including scraping, data mining, data extraction and data harvesting) on or in relation to our website without our express written consent.

> Do I need to change some setting?

No, you just need to obey the law.
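
If the aim is only for a script to notice this kind of refusal and stop, not to work around it, a minimal sketch might look like the one below. The names safe_download and is_forbidden are hypothetical helpers, not part of base R, and the stub downloader is an assumption that stands in for a server answering 403, so nothing here contacts the site.

```r
# Hypothetical helper (not part of base R): run a download and report a
# refusal as a value instead of letting the error abort the script.
safe_download <- function(url, destfile, downloader = utils::download.file) {
  tryCatch({
    # download.file() returns 0 (invisibly) on success
    status <- downloader(url, destfile, mode = "wb", quiet = TRUE)
    list(ok = identical(as.integer(status), 0L), message = "")
  },
  # the HTTP status arrives as a warning, signalled before the error
  warning = function(w) list(ok = FALSE, message = conditionMessage(w)),
  error   = function(e) list(ok = FALSE, message = conditionMessage(e)))
}

# TRUE when the recorded message mentions an HTTP 403 status
is_forbidden <- function(result) {
  !result$ok && grepl("403", result$message, fixed = TRUE)
}

# Stub downloader (an assumption for illustration): mimics the warning and
# error shown in the original post without touching the network.
fake_forbidden <- function(url, destfile, ...) {
  warning("cannot open URL '", url, "': HTTP status was '403 Forbidden'")
  stop("cannot open URL '", url, "'")
}

res <- safe_download("https://example.invalid/file.zip", tempfile(),
                     downloader = fake_forbidden)
if (is_forbidden(res)) {
  message("Server refused the request; check the site's terms before retrying.")
}
```

With the default downloader argument the same call would go through download.file itself; the point of the wrapper is to stop cleanly on a 403, since the refusal is the site's answer and not a setting to be changed.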

> 
> Any pointer will be highly appreciated.
> 
> 

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law


