[R] Using R and the Tidyverse for an economic model

Mon Mar 26 12:40:40 CEST 2018

I've been translating an economic model from Python into R, and I thought
members of the list would like to see a presentation I've written about it.
I've blogged this at
http://www.j-paine.org/blog/2018/03/r-taxben-a-microsimulation-economic-model-in-r.html
, and the presentation itself is a slideshow at
http://www.j-paine.org/rtaxben/R/reveal/rtaxben.html . The slideshow is
written as one side of a conversation which reveals R and the Tidyverse a
feature at a time to a colleague not familiar with R. Those who _are_
familiar with R might prefer the version at
http://www.j-paine.org/rtaxben/R/reveal/rtaxben_anim.html . It has exactly the
same material but, as explained in my introduction, is quicker to read. Read
the blog post first.

Our model, R-Taxben, is a microeconomic model, which simulates at the level
of individual people rather than bulk variables such as unemployment and
inflation. It works, roughly speaking, by reading survey data about actual
households, then applying taxes and benefits to calculate net income and
expenditure from gross. It has four main parts: (1) read and process
parameters which describe the taxes and benefits; (2) read the household
data from CSV files and transform into data frames usable by the model; (3)
apply the taxes and benefits, calculating such things as council tax, VAT,
child benefit, and pensions; (4) display the results.
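
To give a flavour of steps (2) and (3), here is a minimal sketch. It is not
code from R-Taxben itself: the column names, the allowance, the tax rate and
the benefit amount are all invented for illustration, and the household data
is built inline rather than read from CSV files.

```r
library(dplyr)

# Hypothetical household data. In a real run this would come from survey
# CSV files, for example via readr::read_csv().
households <- tibble(
  household_id = c(1, 2, 3),
  gross_income = c(12000, 30000, 55000),
  num_children = c(2, 0, 1)
)

# Hypothetical parameters (step 1): a flat tax above an allowance and a
# fixed benefit per child. These are NOT real UK rates.
params <- list(
  allowance         = 10000,
  tax_rate          = 0.2,
  benefit_per_child = 1000
)

# Step 3: apply taxes and benefits to derive net income from gross.
apply_tax_benefit <- function(data, p) {
  data %>%
    mutate(
      tax           = pmax(gross_income - p$allowance, 0) * p$tax_rate,
      child_benefit = num_children * p$benefit_per_child,
      net_income    = gross_income - tax + child_benefit
    )
}

results <- apply_tax_benefit(households, params)
print(results)
```

Keeping the parameters in one list, separate from the function that applies
them, means a policy change can be simulated by passing in a different list.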

My slides are mainly about (2) and (4), but do touch on the others. I
suggest, for example, that legible R code for (3) could be used as a
"reference standard" against which to describe the notoriously complex UK
benefits system. Organisations such as the Child Poverty Action Group have
written handbooks for benefits advisers which try to specify the system
precisely. We'd like to use R for an electronic version of these.

I've said quite a bit about using R for probing and plotting data, not only
for economists but also for students learning about economics, fiscal policy,
and statistics. After a brief intro to base R, I've concentrated on the
Tidyverse, because of what I see as its advantages. There are lots of small
demos of the Tidyverse scattered around the web, but fewer of big projects
which use lots of different features from it. So my examples here might be
useful.

Reliability and accuracy are vital, which is why I have more slides about
testing than about anything else, with examples of "testthat".
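
As a small example of the style, here is a sketch of "testthat" tests for a
made-up benefit rule. The function child_benefit() and its amounts are
assumptions for this example only, not part of R-Taxben.

```r
library(testthat)

# Hypothetical benefit rule, invented for this example: a fixed amount per
# child. Negative or non-numeric child counts are rejected.
child_benefit <- function(num_children, amount_per_child = 1000) {
  stopifnot(is.numeric(num_children), all(num_children >= 0))
  num_children * amount_per_child
}

test_that("child benefit scales with number of children", {
  expect_equal(child_benefit(0), 0)
  expect_equal(child_benefit(3), 3000)
  expect_equal(child_benefit(c(1, 2)), c(1000, 2000))
})

test_that("invalid input is rejected", {
  expect_error(child_benefit(-1))
  expect_error(child_benefit("two"))
})
```

Each test_that() block names the behaviour it checks, so a failure points
straight at the rule that broke.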

Near the end, I show a web interface, built using Vis.js, which displays
dataflow in the model. The aim is to make it completely scrutable, so that
none of its economic assumptions are a mystery.

We're looking for funding to go beyond this prototype. There are places
where we'll probably need help with such things as efficiency (see the
section on representation-independent selectors), efficiency again
(multiple JOINs), and the best way to overcome lack of static typing. It
would be great to have R experts, even R implementors, who were willing to
advise on this, and even to collaborate on our grant applications.
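
On the static typing point, one possible stopgap is to check the shape of a
data frame at run time, before the model uses it. The sketch below is only
that, a sketch in base R: check_columns() and household_spec are hypothetical
names, and this is not how R-Taxben currently does it.

```r
# Hypothetical checker: verify that a data frame has the expected columns
# and that each column has the expected class, failing early otherwise.
check_columns <- function(data, spec) {
  stopifnot(is.data.frame(data))
  missing_cols <- setdiff(names(spec), names(data))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  for (col in names(spec)) {
    if (!(spec[[col]] %in% class(data[[col]]))) {
      stop("Column '", col, "' should be ", spec[[col]],
           " but is ", class(data[[col]])[1])
    }
  }
  invisible(data)
}

# Hypothetical specification: column name -> expected class.
household_spec <- c(household_id = "integer", gross_income = "numeric")

# A well-formed data frame passes silently.
good <- data.frame(household_id = 1:2, gross_income = c(12000, 30000))
check_columns(good, household_spec)

# A badly typed column is caught, and the error message names it.
bad <- data.frame(household_id = 1:2, gross_income = c("12k", "30k"))
result <- tryCatch(check_columns(bad, household_spec),
                   error = function(e) conditionMessage(e))
print(result)
```

This catches mistakes only when the code runs, not before, so it complements
tests rather than replacing a type checker.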
