[R] Making objects global in a package

R. Mark Sharp rm@h@rp @end|ng |rom me@com
Sat Jul 14 03:13:23 CEST 2018

I would usually use a function for this. It may not be more R-like, but it is more readable to me. If you want to keep the column types in a file, you could have the function initialize itself on the first call.
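
Something along these lines, as a rough sketch only. The name `get_col_types` and the closure-based cache are assumptions for illustration, and the "alpha" entry lists just the two columns shown in the quoted message; the remaining columns would need to be added.

```r
# Sketch only: `get_col_types` is a hypothetical helper, not part of readr.
get_col_types <- local({
  specs <- NULL  # cache, populated on the first call

  function(source) {
    if (is.null(specs)) {
      # One-time initialization. This step could instead read the
      # definitions from a file kept with the package.
      specs <<- list(
        alpha = readr::cols(
          Date = readr::col_date(),
          var1 = readr::col_double()
        )
      )
    }
    if (!source %in% names(specs)) {
      stop("No column types defined for source: ", source)
    }
    specs[[source]]
  }
})

# Intended use in any of the package's files:
# foo_alpha <- readr::read_csv("alpha.csv", col_types = get_col_types("alpha"))
```

Because the definitions live inside a function rather than a top-level object, the file that holds them no longer has to be collated first, and a misspelled source name fails with an explicit error instead of silently returning NULL.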

> On Jul 13, 2018, at 7:51 PM, Michael Hannon <jmhannon.ucdavis using gmail.com> wrote:
> Greetings.  I'm putting together a small package in which I use
> `readr::read_csv()` to read CSV files from several different sources.  I do
> this in several different files, but with various kinds of subsequent
> processing, depending on the file.
> I find it useful to specify column types, as the apparent data type of a given
> column sometimes changes unexpectedly deep into the file.  I.e., a field that
> consistently looks like an integer, suddenly becomes a fraction:
>    1, 1, ..., 1, 1/2, 1, ...
> Hence, the column type has to be treated as a character, rather than as an
> integer (with the possibility of later conversion to double, if necessary).
> (This is just an example.)
> Therefore I use the `col_types` argument in all of the calls to `read_csv()`.
> These calls are spread over several files, but I want to keep all of the
> column types in a single place, yet have them available in each of the several
> files.  This is just for the sake of maintainability.
> At the moment I do this by putting the column-type definitions into a single
> file:
>    000_define_data_attributes.R
> that:
>    (1) is named so that it's parsed first by `devtools::build()`
>    (2) sets up an environment and stuffs the column types into it:
>            data_env <- new.env(parent=emptyenv())
>            data_env$col_types_alpha <- list(
>                Date = col_date(),
>                var1 = col_double(),
>                ...
>            )
> There are a few other things that go into the file as well.
> Then I pick off the appropriate stuff from the environment in the other files:
>    foo_alpha <- read_csv("alpha.csv", col_types = data_env$col_types_alpha)
> This seems to work, but it doesn't "feel" right to me.  (If this were Python,
> people would accuse me of being "non-pythonic").
> Hence, I'm seeking suggestions for the best practice for this kind of thing.
> BTW, I note that both the sources of data ("alpha", etc.) and the column types
> are more or less guaranteed to be static for the foreseeable future.  Hence,
> there really isn't much danger in just replicating the column-type definitions
> in each of the various files, which would obviate the need for the "000..."
> file.  In other words, this is mostly a style thing.
> Thanks for any advice you can provide.
> -- Mike
