[Rd] Multiple versions of data in a package

Chris Wallace chris.wallace at cimr.cam.ac.uk
Mon Jul 21 15:58:11 CEST 2014


Dear R-devel,

I am writing for help on how I should include parallel sets of data in 
my package.

Brief summary: I am new to using data within packages.  I want a user to 
be able to specify one of two alternative versions of within-package 
datasets to use, and I want to load just that one.  I have a solution 
that works, but it doesn't seem as simple as it should be from a user's 
point of view, nor does it seem robust to errors.  What should I do better?

More details:

My package is a relatively simple set of plotting functions that plot 
user-supplied data across a number of SNPs from their experiment and 
annotate the plot using some external data.  For example, the user supplies

SNP value
A 2
B 1.2
C 7.8

etc

The external datasets will allow me to look up the location of SNPs A, 
B, C, ... on the human genome so the user's data can be plotted in 
relation to that map, to annotate the position of local genes etc.  My 
problem is that there is no single map of the genome, so I need to prepare 
external data with the positions of SNPs and genes in both build 36 and 
build 37.  To allow this, my data/ directory contains

snps_37.RData, snps_36.RData, genes_37.RData, genes_36.RData

the objects in these files are called, respectively,

snps, snps, genes, genes
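
To make the layout concrete, here is a minimal sketch of how two such 
files might be written.  It uses made-up toy positions and a temporary 
directory in place of data/; the column names and values are illustrative 
assumptions, not my real data.

```r
# Toy stand-ins for the real data; columns and values are hypothetical.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

for (build in c("36", "37")) {
  # The object is always called `snps`, whichever build it holds.
  snps <- data.frame(snp = c("A", "B", "C"),
                     position = c(100, 200, 300) + as.numeric(build))
  save(snps, file = file.path(data_dir, paste0("snps_", build, ".RData")))
}

# Loading a file into a fresh environment shows the object name it carries.
e <- new.env()
loaded <- load(file.path(data_dir, "snps_37.RData"), envir = e)
loaded
```

The file name identifies the build, but the object inside does not, which 
is what lets my functions refer simply to snps.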

Therefore, a user types

 > data(snps_37)
 > data(genes_37)

or

 > data(snps_36)
 > data(genes_36)

to set up build 36 or build 37 and then my functions need only use the 
object names snps or genes, and all is fine.  But, this doesn't seem 
like a good solution.  What if a user has snps_36 and genes_37 loaded?  
What if they already have their own object called snps in their working 
environment?  Alternatively, I could load all datasets and they could 
pass an argument "build" to my functions, but these are large datasets, 
and I don't want to use time and memory loading both versions when I 
expect any individual user to pick a single version and stick with it.
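
For concreteness, here is a rough sketch of a variant of that second 
option in which a "build" argument triggers loading of only the requested 
version, on first use, into a private environment.  The function name 
get_annotation(), the cache environment and the toy files are all 
assumptions made up for illustration; inside a real package the load() 
calls would presumably become something like 
data(list = ..., package = ..., envir = ...).

```r
# Toy files standing in for data/ (hypothetical values).
data_dir <- file.path(tempdir(), "annot_demo")
dir.create(data_dir, showWarnings = FALSE)
for (build in c("36", "37")) {
  snps  <- data.frame(snp = c("A", "B", "C"),
                      position = c(100, 200, 300) + as.numeric(build))
  genes <- data.frame(gene = "G1", start = 50 + as.numeric(build))
  save(snps,  file = file.path(data_dir, paste0("snps_",  build, ".RData")))
  save(genes, file = file.path(data_dir, paste0("genes_", build, ".RData")))
}
rm(snps, genes)

# Private cache: one sub-environment per build, filled on first use only.
.cache <- new.env(parent = emptyenv())

get_annotation <- function(build = "37", dir = data_dir) {
  build <- match.arg(as.character(build), c("36", "37"))
  if (!exists(build, envir = .cache, inherits = FALSE)) {
    e <- new.env(parent = emptyenv())
    # snps and genes always come from the same build, so they cannot be mixed.
    load(file.path(dir, paste0("snps_",  build, ".RData")), envir = e)
    load(file.path(dir, paste0("genes_", build, ".RData")), envir = e)
    assign(build, e, envir = .cache)
  }
  get(build, envir = .cache, inherits = FALSE)
}

# A user's own `snps` object is never touched or consulted.
snps <- "my own object"
annot <- get_annotation(36)
annot$snps$position
ls(.cache)
```

This would avoid both the mixed-build and the name-clash problems, at the 
cost of every plotting function needing to accept and pass on "build".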

Can anyone suggest how else I might proceed?

Thank you,

Chris




-- 
JDRF/WT Diabetes & Inflammation Laboratory (DIL),
NIHR Cambridge Biomedical Research Centre,
Cambridge Institute for Medical Research,
University of Cambridge

Website:http://www-gene.cimr.cam.ac.uk/staff/wallace
DIL Website:http://www-gene.cimr.cam.ac.uk


