[R] Popularity of R, SAS, SPSS, Stata...

Keo Ormsby keo.ormsby2 at gmail.com
Tue Jun 22 22:26:43 CEST 2010


>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Ivan Calandra
>> Sent: Sunday, June 20, 2010 3:47 PM
>> To: r-help at r-project.org
>> Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...
>>
>> Bob,
>>
>> I have no idea whether it is realistic, but if you look for the papers
>> that used R or SAS (or anything), you might get better results by
>> searching for the way R and SAS are cited.
>>     
>
> Hi Ivan, that was what I tried when more generic keywords failed. However, almost no one seems to use that citation. For example, in 2009, only 28 papers contain "R Foundation" and 61 contain Bioconductor, which uses R. One single paper contains both. I appreciate the idea though!
>
> Thanks,
> Bob
>
>   
Hi Bob,
Great work. I just did a quick-and-dirty search on Google Scholar,
trying to be less stringent about what counts as a "hit". The search
terms were: for R, "R statistical" OR "R package"; for SPSS, just
"SPSS"; for S-Plus, likewise just the name; for SAS, "SAS statistical"
OR "SAS institute"; for STATA, "STATA" AND "statistical". To estimate a
"background" of scholarly statistical publications, I searched for the
word "statistical" on its own. I repeated each search for every year.
Of course this "methodology" is far from acceptable: it includes many
works that only mentioned the software without necessarily using it.
Still, it does throw up some interesting things. I am using *cited* in
the sense of *mentioned in the article so as to be indexed by Google*,
not as an academic citation.
It appears that SPSS has been declining since it held the lion's share
in the mid-2000s, but remains the most cited. SAS had a big dip from
being the most cited, but is having a comeback. STATA and R are both
rising steadily, with R showing a more than 16-fold increase since 2000,
though it is still a small fraction of total citations. There is a dip
in 2010 for STATA, R, and S-Plus, and a big peak for SAS, but I am not
sure whether these are artifacts of Google's algorithm.

If anyone wants to give it a look, here's the data:
Search term,2010,2009,2008,2007,2006,2005,2004,2003,2002,2001,2000,1999,1998,1997,1996,1995
SPSS,28900,46700,64600,84600,97600,104000,105000,91800,67700,53600,40600,22600,15300,10100,7530,6330
SAS,15300,15000,14700,15200,14900,15900,16800,16800,16200,15500,14600,13100,11900,10700,9270,8950
STATA,9800,19300,16900,14600,12800,10700,8930,6780,4900,4050,3280,2120,1560,959,642,501
S-Plus,1900,4820,4900,5050,4950,4780,5100,4110,3610,2970,2540,2150,1780,1500,1200,990
R,3470,6140,4660,3510,2450,1700,1090,642,416,294,247,188,203,178,133,158
statistical,181000,347000,709000,885000,1010000,1030000,1060000,1110000,1090000,1020000,968000,887000,786000,718000,670000,580000
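
For anyone who wants to look at these counts relative to the
"statistical" background, here is a minimal sketch in Python. It was
not part of the original search; the function names (load_counts,
share_of_background, fold_change) are made up for illustration, and the
numbers are simply copied from the table above.

```python
import csv
import io

# Hit counts copied verbatim from the table above (Google Scholar, per year).
RAW = """\
Search term,2010,2009,2008,2007,2006,2005,2004,2003,2002,2001,2000,1999,1998,1997,1996,1995
SPSS,28900,46700,64600,84600,97600,104000,105000,91800,67700,53600,40600,22600,15300,10100,7530,6330
SAS,15300,15000,14700,15200,14900,15900,16800,16800,16200,15500,14600,13100,11900,10700,9270,8950
STATA,9800,19300,16900,14600,12800,10700,8930,6780,4900,4050,3280,2120,1560,959,642,501
S-Plus,1900,4820,4900,5050,4950,4780,5100,4110,3610,2970,2540,2150,1780,1500,1200,990
R,3470,6140,4660,3510,2450,1700,1090,642,416,294,247,188,203,178,133,158
statistical,181000,347000,709000,885000,1010000,1030000,1060000,1110000,1090000,1020000,968000,887000,786000,718000,670000,580000
"""


def load_counts(text):
    """Parse the table into (years, {term: [count per year]})."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    years = [int(y) for y in rows[0][1:]]
    counts = {row[0]: [int(v) for v in row[1:]] for row in rows[1:]}
    return years, counts


def share_of_background(counts, background="statistical"):
    """Express each term's hits as a percentage of the background hits."""
    base = counts[background]
    return {
        term: [100.0 * v / b for v, b in zip(values, base)]
        for term, values in counts.items()
        if term != background
    }


def fold_change(years, values, start, end):
    """Ratio of the count in year `end` to the count in year `start`."""
    return values[years.index(end)] / values[years.index(start)]


years, counts = load_counts(RAW)
shares = share_of_background(counts)

# Print each package's share of the "statistical" background, year by year.
for term, values in shares.items():
    line = "  ".join(f"{y}:{v:.2f}%" for y, v in zip(years, values))
    print(f"{term:7s} {line}")

# Growth in raw R hits between 2000 and 2009 (the last full year).
print("R fold change 2000 -> 2009:", round(fold_change(years, counts["R"], 2000, 2009), 1))
```

Dividing by the background only corrects for the overall number of
indexed documents per year; it does nothing about the false hits
mentioned above.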

Best wishes,
Keo.


