[R] How to import sensitive data when multiple users collaborate on R-script?

John McKown john.archie.mckown at gmail.com
Tue May 31 15:19:31 CEST 2016


On Tue, May 31, 2016 at 5:44 AM, Nikolai Stenfors <
nikolai.stenfors at gapps.umu.se> wrote:

> We conduct medical research and our datafiles therefore contain sensitive
> data, not to be shared in the cloud (Dropbox, Box, Drive, Bitbucket,
> GitHub).
> When we collaborate on an R analysis script, we stumble upon the following
> annoyance. Researcher 1 has a line in the script importing the sensitive
> data from his/her personal computer. Researcher 2 has to put an additional
> line importing the data from his/her personal computer. Thus, we have lines
> in the script that are unnecessary for one or the other researcher. How can
> we avoid this? Is there another way of conducting the collaboration? Another
> workflow?
>
> I'm perhaps looking for something like:
> "If the script is run on researcher 1's computer, load the file from this
> directory. If the script is run on researcher 2's computer, load the data
> from that directory".
>
> Example:
> ## Import data-------------------------------------
> # Researcher 1 imports data from laptop1 (unnecessary line for Researcher 2)
> data <- read.table("/path/to_researcher1_computer/sensitive_data.csv")
>
> # Researcher 2 imports data from laptop2 (unnecessary line for Researcher 1)
> data <- read.table("/path/to_researcher2_computer/sensitive_data.csv")
>
> ## Clean data
> data$var1 <- NULL
>
> ## Analyze data
> boxplot(data$var2)
>
>
Can you have the researchers input the name of the data file to be
analyzed? I use code similar to:

arguments <- commandArgs(trailingOnly=TRUE);
#
# I put in the next command due to my own ignorance
# If you invoke an R script file using just R, you
# need to say something like:
# R CMD BATCH "--args ... other arguments ..." script.R
#
# but if you use Rscript, you invoke it like:
# Rscript script.R ... other arguments ...
#
# Well, I got confused and did:
# Rscript script.R --args ... other arguments ...
#
# The next line adjusts for my own idiocy.
if (length(arguments) > 0 && arguments[1] == "--args") arguments <- arguments[-1];
#
for (file in arguments) {
...
}

Please ignore the line about my own idiocy :-}
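
For what it's worth, the argument handling above could be wrapped up roughly as follows. This is only a sketch: the helper name data_files_from_args is made up for illustration, and it assumes every argument is the path to a data file on the machine where the script runs.

```r
# Hypothetical helper (not part of any package): turn the command-line
# arguments into a vector of data file paths, stopping early if no file
# was given or if a named file does not exist.
data_files_from_args <- function(arguments = commandArgs(trailingOnly = TRUE)) {
  # Drop a stray leading "--args", if present.
  if (length(arguments) > 0 && arguments[1] == "--args") {
    arguments <- arguments[-1]
  }
  if (length(arguments) == 0) {
    stop("usage: Rscript script.R <data file> [<data file> ...]")
  }
  missing_files <- arguments[!file.exists(arguments)]
  if (length(missing_files) > 0) {
    stop("file(s) not found: ", paste(missing_files, collapse = ", "))
  }
  arguments
}

# Possible use inside the shared script:
# for (file in data_files_from_args()) {
#   ...
# }
```

Each researcher then passes the location of the data on his or her own machine when running the script, so the shared script contains no machine-specific path.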

Another thought is to use an environment variable which is set in the
user's logon profile (or the Windows registry, forgive my ignorance of
Windows). I think this would be something like:

filename <- Sys.getenv("FILENAME")
if (filename == "") {
    # ... no file name in environment, what to do?
}
}

You could have someone set this up for the user, if they are not familiar
with the process.
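
A slightly fuller sketch of the environment variable idea is below. The variable name SENSITIVE_DATA_FILE and the helper name sensitive_data_path are my own inventions for illustration; use whatever names suit you. One way to set the variable per user, I believe, is a line in that user's ~/.Renviron file, which R reads at startup.

```r
# Hypothetical helper: look up the data file location from an environment
# variable and fail with a readable message if it is unset or wrong.
# The default variable name is an assumption, not an R convention.
sensitive_data_path <- function(var = "SENSITIVE_DATA_FILE") {
  filename <- Sys.getenv(var, unset = "")
  if (filename == "") {
    stop("environment variable ", var, " is not set; ",
         "set it to the location of the data file on this machine")
  }
  if (!file.exists(filename)) {
    stop("file named in ", var, " does not exist: ", filename)
  }
  filename
}

# Possible use inside the shared script:
# data <- read.table(sensitive_data_path())
```

The script stays identical for everyone; only the value of the variable differs from one computer to the next.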


-- 
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown
