[R] Gender balance in R

Scott Kostyshak skostysh at princeton.edu
Wed Nov 26 07:19:42 CET 2014


On Tue, Nov 25, 2014 at 1:15 PM, Martin Morgan <mtmorgan at fredhutch.org> wrote:
> On 11/25/2014 04:11 AM, Scott Kostyshak wrote:
>>
>> On Mon, Nov 24, 2014 at 12:34 PM, Sarah Goslee <sarah.goslee at gmail.com>
>> wrote:
>>>
>>> I took a look at apparent gender among list participants a few years ago:
>>> https://stat.ethz.ch/pipermail/r-help/2011-June/280272.html
>>>
>>> Same general thing: very few regular participants on the list were
>>> women. I don't see any sign that that has changed in the last three
>>> years. The bar to participation in the R-help list is much, much lower
>>> than that to become a developer.
>>
>>
>> I plotted the gender of posters on r-help over time. The plot is here:
>> https://twitter.com/scottkosty/status/449933971644633088
>>
>> The code to reproduce that plot is here:
>> https://github.com/scottkosty/genderAnalysis
>> The R file there will call devtools::install_github to install a
>> package from GitHub that guesses gender based on the first name
>> (https://github.com/scottkosty/gender).
>
>
> It would be great to include in your package the script that scraped author
> names from R-help archives (I guess that's what you did?). Presumably it
> easily applies to other mailing lists hosted at the same location (R-devel,
> further along the ladder from user to developer, and Bioconductor /
> Bioc-devel, in a different domain and perhaps confounded with a different
> 'feel' to the list). Also the R community is definitely international, so
> finding more versatile gender-assignment approaches seems important.

I just put the script up on https://github.com/scottkosty/genderAnalysis
I don't have much time at the moment to generalize it, but a pull
request is always welcome. Alternatively, anyone is welcome (at least
as far as I'm concerned) to take the script and modify it for any
purpose.
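
For anyone who would rather start from a blank page, here is a minimal
sketch of the author-extraction step only. It is not the script in the
repository above, and it is in Python rather than R. It assumes the
archive text carries header lines of the form
"From: user at host (Full Name)"; that shape, the function name
first_names, and the sample text are all assumptions made for
illustration.

```python
import re
from collections import Counter

# Assumed header shape (not taken from the repository's script):
#   From: user at host (Full Name)
FROM_RE = re.compile(
    r"^From: \S+ at \S+ \((?P<name>[^)]*)\)[ \t]*$",
    re.MULTILINE,
)


def first_names(archive_text):
    """Count posts per lowercased first name found in 'From:' headers."""
    counts = Counter()
    for match in FROM_RE.finditer(archive_text):
        parts = match.group("name").split()
        if parts:  # skip headers with an empty name field
            counts[parts[0].lower()] += 1
    return counts


# Hypothetical sample in the assumed format.
sample = """From: sarah.goslee at gmail.com (Sarah Goslee)
Subject: [R] Gender balance in R

From: skostysh at princeton.edu (Scott Kostyshak)
Subject: [R] Gender balance in R

From: skostysh at princeton.edu (Scott Kostyshak)
Subject: [R] Gender balance in R
"""

if __name__ == "__main__":
    for name, n in first_names(sample).most_common():
        print(name, n)
```

Mapping first names to a likely gender still needs a name table of some
kind, which this sketch does not attempt.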

> It might be interesting to ask about participation in mailing list forums
> versus other venues, and in particular the recent Bioconductor transition
> from mailing list to 'StackOverflow' style support forum
> (https://support.bioconductor.org) -- on the one hand the 'gamification'
> elements might seem to only entrench male participation, while on the other
> we have already seen increased (quantifiable) and broader (subjective)
> participation from the Bioconductor community. I'd be happy to make support
> site usage data available, and am interested in collaborating in an
> academically well-founded analysis of this data; any interested parties
> please feel free to contact me off-list.

I would be interested in collaborating on such a project in the future also.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

>
> Martin Morgan
> Bioconductor
>
>
>>
>> Note also on that tweet that Gabriela de Queiroz, who is the founder
>> of R-ladies, posted it, and that David Smith showed interest in
>> discussing the topic. So there is definitely demand for some data
>> analysis and discussion on the topic.
>>
>>> It would be interesting to look at the stats for CRAN packages as well.
>>>
>>> The very low percentage of regular female participants is one of the
>>> things that keeps me active on this list: to demonstrate that it's not
>>> only men who use R and participate in the community.
>>
>>
>> Thank you for that!
>>
>> Scott
>>
>>
>> --
>> Scott Kostyshak
>> Economics PhD Candidate
>> Princeton University
>>
>>> (If you decide to do the stats for 2014, be aware that I've been out
>>> on medical leave for the past two months, so the numbers are even
>>> lower than usual.)
>>>
>>> Sarah
>>>
>>> On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw
>>> <maarten.blaauw at qub.ac.uk> wrote:
>>>>
>>>> Hi there,
>>>>
>>>> I can't help but notice that the gender balance among R developers and
>>>> ordinary members is extremely skewed (as it is with open source
>>>> software in general).
>>>>
>>>> Have a look at http://www.r-project.org/foundation/memberlist.html - at
>>>> most a handful of women are listed among the 'supporting members', and
>>>> none at all among the 29 'ordinary members'.
>>>>
>>>> On the other hand I personally know many happy R users of both genders.
>>>>
>>>> My questions are thus: Should R developers (and users) be worried that
>>>> the 'other half' is excluded? If so, how could female R users/developers
>>>> be persuaded to become more visible (e.g. added as supporting or
>>>> ordinary members)?
>>>>
>>>> Thanks,
>>>>
>>>> Maarten
>>>>
>>> --
>>> Sarah Goslee
>>> http://www.functionaldiversity.org
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793


