[R] Programcode and data in the same textfile

Hotz, T. th50 at leicester.ac.uk
Thu Jun 12 15:18:09 CEST 2003


Following up on Torsten's solution, this one doesn't need a tempfile:

my.data <- read.table(textConnection(c(

      ### here comes my data

      "Sex    Respons",
      "Male   1",
      "Male   2",
      "Female 3",
      "Female 4"

      ### end of data input

)), header = TRUE)

print(my.data)
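
One caveat with the call above is that the text connection is never closed, so R may later warn about an unused connection. A minimal sketch of two variants follows. The first keeps the connection in a variable (the name `con` is just an illustration) and closes it by hand. The second assumes a more recent R version in which read.table() accepts a `text=` argument, so the data can be written as one multi-line string in exactly the layout of a separate data file. Neither is meant as the definitive way to do it.

```r
# Variant 1: keep the connection in a variable so it can be closed explicitly.
con <- textConnection(c(
  "Sex    Respons",
  "Male   1",
  "Male   2",
  "Female 3",
  "Female 4"
))
my.data <- read.table(con, header = TRUE)
close(con)

# Variant 2 (assumes an R version whose read.table() has a text= argument):
# the data are written as a single multi-line string. Blank lines and
# leading whitespace are skipped with the default settings.
my.data2 <- read.table(header = TRUE, text = "
  Sex    Respons
  Male   1
  Male   2
  Female 3
  Female 4
")

print(my.data)
print(my.data2)
```

The second form needs no quotes or commas around the individual rows, which is closest to the "same format as a separate textfile" requirement in the original question.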

HTH

Thomas


> -----Original Message-----
> From: Torsten Hothorn [mailto:Torsten.Hothorn at rzmail.uni-erlangen.de]
> Sent: 12 June 2003 14:00
> To: Ernst Hansen
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Programcode and data in the same textfile
> 
> 
> > I have the following problem.  It is not of earthshaking importance,
> > but still I have spent a considerable amount of time thinking about
> > it.
> >
> > PROBLEM: Is there any way I can have a single textfile that contains
> > both
> >
> >   a) data
> >
> >   b) programcode
> >
> > The program should act on the data, if the textfile is source()'ed
> > into R.
> >
> >
> > BOUNDARY CONDITION: I want the data written in the textfile in exactly
> > the same format as I would use, if I had data in a separate textfile,
> > to be read by read.table().  That is, with 'horizontal inhomogeneity'
> > and 'vertical homogeneity' in the type of entries.  I want to write
> > something like
> >
> >       Sex    Respons
> >       Male   1
> >       Male   2
> >       Female 3
> >       Female 4
> >
> 
> 
> something like
> 
>   tmpfilename <- tempfile()
>   tmpfile <- file(tmpfilename, "w")
>   cat(
> 
>       ### here comes my data
> 
>       "Sex    Respons",
>       "Male   1",
>       "Male   2",
>       "Female 3",
>       "Female 4",
> 
>       ### end of data input
> 
>       file = tmpfile, sep="\n")
>   close(tmpfile)
>   read.table(tmpfilename, header = TRUE)
> 
> 
> best,
> 
> Torsten
> 
> > In effect, I am asking if there is some way I can convince
> > read.table(), that the data is contained in the following n lines of
> > text.
> >
> >
> > ILLEGAL SOLUTIONS:
> > I know I can simulate the behaviour by reading the columns of the
> > dataframe one by one, and using data.frame() to glue them together.
> > Like in
> >
> >     data.frame(Sex = c('Male', 'Male', 'Female', 'Female'),
> >                Respons = c(1, 2, 3, 4))
> >
> > I do not like this solution, because it represents the data in a
> > "transposed" way in the textfile, and this transposition makes the
> > structure of the dataframe less transparent - at least to me. It
> > becomes even less comprehensible if the Sex-factor above is written
> > with the help of rep() or gl() or the like.
> >
> > I know I can make read.table() read from stdin, so I could type the
> > dataframe at the prompt.  That is against the spirit of the problem,
> > as I describe below.
> >
> >
> > I know I can make read.table() do the job, if I split the data and the
> > programcode into different files.  But as the purpose of the exercise
> > is to distribute the data and the code to other people, splitting
> > into several files is a complication.
> >
> >
> > MOTIVATION: I frequently find myself distributing small chunks of code
> > to my students, along with data on which the code can work.
> >
> > As an example, I might want to demonstrate how model.matrix() treats
> > interactions, in a certain setting.  For that I need a dataframe that
> > is complex enough to exhibit the behaviour I want, but still so small
> > that the model.matrix is easily understood.  So I make such a
> > dataframe.
> >
> > I am trying to distribute this dataframe along with my code, in a way
> > that is as simple as possible to USE for the students (hence the
> > one-file boundary condition) and to READ (hence the non-transposition
> > boundary condition).
> >
> >
> >
> > Does anybody have any ideas?
> >
> >
> > Ernst Hansen
> > Department of Statistics
> > University of Copenhagen
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> >
> >
> 

---

Thomas Hotz
Research Associate in Medical Statistics
University of Leicester
United Kingdom

Department of Epidemiology and Public Health
22-28 Princess Road West
Leicester
LE1 6TP
Tel +44 116 252-5410
Fax +44 116 252-5423

Division of Medicine for the Elderly
Department of Medicine
The Glenfield Hospital
Leicester
LE3 9QP
Tel +44 116 256-3643
Fax +44 116 232-2976



