[R] plotting single variables common to multiple data frames

Mathew Brown mathew.brown at forst.uni-goettingen.de
Wed May 25 09:48:17 CEST 2011



Hi John,
First off, thanks again for your help with this. Much appreciated.

I've attached a file of the original data (yes, as you can see there are
header names). These hour long files are zipped together on a computer
(which is actually an analyzer) and sent each morning to a server. I
then run the function below which extracts the data files and binds them
into daily files. I then save these files in the .RData format (the
'stuff' file I sent you). I agree that the way I am saving these files
must be causing the problem. I'm very new to R and this was the only way
I found to save the files, that 'seemed' to work. Please tell me if you
know a better way (I'm sure you do)!
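
One commonly suggested alternative to save()/load() is saveRDS()/readRDS(), which stores a single object without its name, so the reader picks the name when reading it back. The following is only a minimal sketch: the small data frame, its column values and the temporary file path are hypothetical stand-ins, not the real analyzer output.

```r
# Hypothetical stand-in for one day's data (the real files have more columns).
isot <- data.frame(DATE = "2011-05-20",
                   TIME = c("00:00:01", "00:00:02"),
                   delta_D_H = c(-75.2, -74.9))

# Hypothetical output path; tempdir() keeps the sketch self-contained.
fname <- file.path(tempdir(), "20110520.rds")

saveRDS(isot, file = fname)   # stores the object itself, not its name
day <- readRDS(fname)         # the caller chooses the name on reading

str(day)
```

Because readRDS() returns the object directly, there is no need to look it up by name in an environment afterwards.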

###########Function used to extract and save######################
# this function extracts 1 hour iso data from zip files and creates
# daily files

#
path = "Y:\\Data\\"
pathout = "Y:\\Daily\\"
time=Sys.time()
tt<- as.numeric(format.Date(time, "%Y%m%d"))
#tt<-"20110523" #used to manually enter in date
end = ".zip"
ind = paste(path,tt,end,sep="")
xx=unzip(ind)

#merge all files to one and split by day
iso<- c();d<- c()
for (x in xx) {
u<-read.table(x, header=TRUE, sep="", dec=".")
    u$dataset = x
    iso = rbind(iso,u)
    udate<- unique(iso$DATE)
    d=split(iso,iso$DATE)
}

#create directories and load if files exist. Then merge data from same
#day and save
fname<- c()
finame<- c()
old<- c()
udate=gsub("[^0-9]","",udate)
for (i in 1:length(udate)){
#fname[i]<-  paste(pathout,udate[i],sep="")
finame[i]=paste(pathout,udate[i],".RData", sep="")
deskdir<- dir.create(pathout,showWarnings=FALSE)
    if (file.exists(finame[i])){
    old=load(finame[i])
    e3<- new.env()
    old<- get('isot', e3)
    isot = merge(old,d[i],all = TRUE)

    save(isot, file=finame[i])
    }else {
     isot=d[i]
     save(isot, file=finame[i])
    }

}
rm(list = ls(all = TRUE))
#############End####################
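
A possible source of the odd structure, offered as a guess rather than a confirmed diagnosis: split() returns a list, and d[i] is a one-element list, whereas d[[i]] is the data frame itself. The sketch below uses a small made-up data frame (the column values are hypothetical) to show the difference and how coercing the one-element list prefixes the column names.

```r
# Made-up example data; only the column names echo the real files.
iso <- data.frame(DATE = c("2011-05-20", "2011-05-20", "2011-05-21"),
                  TIME = c("23:59:58", "23:59:59", "00:00:00"),
                  delta_D_H = c(-75.2, -74.9, -75.0))
d <- split(iso, iso$DATE)

class(d[1])     # a list of length one that holds the data frame
class(d[[1]])   # the data frame itself

# Coercing the one-element list to a data frame combines the list name
# with each column name, giving names of the form X2011.05.20.TIME.
bad_names  <- names(as.data.frame(d[1]))
good_names <- names(d[[1]])
print(bad_names)
print(good_names)

# Loading into an explicit environment makes the later lookup reliable.
isot <- d[[1]]
fname <- file.path(tempdir(), "20110520.RData")   # hypothetical path
save(isot, file = fname)
e3 <- new.env()
load(fname, envir = e3)
old <- get("isot", envir = e3)
```

Under these assumptions, using d[[i]] in place of d[i], and passing envir = e3 to load(), would keep each saved object a plain data frame with its original column names.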


Once I have these daily RData files (e.g. 20110520.RData) I'd like to be
able to grab any number of them and plot them all together. I'm trying
to get this process streamlined as much as possible so I can come into
work each day and plot the data from the last week with 'a click of a
button'.
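
A minimal sketch of that workflow, assuming each daily .RData file holds a plain data frame named isot: the directory, the dates and the random values below are hypothetical, and the three small files are written first only so that the example runs on its own.

```r
# Hypothetical location; tempdir() stands in for the real output directory.
datalocation <- tempdir()

# Build the file dates with Date arithmetic, which stays valid across
# month boundaries (plain integer ranges such as 20110528:20110604 do not).
days <- format(as.Date("2011-05-18") + 0:2, "%Y%m%d")

# Write three small example files so the sketch is self-contained.
for (dd in days) {
  isot <- data.frame(DATE = dd, delta_D_H = rnorm(5, mean = -75))
  save(isot, file = file.path(datalocation, paste0(dd, ".RData")))
}

# Load one daily file into its own environment and return 'isot'.
read_day <- function(day) {
  e <- new.env()
  load(file.path(datalocation, paste0(day, ".RData")), envir = e)
  get("isot", envir = e)
}

# Stack all days into one data frame and plot a single variable.
week <- do.call(rbind, lapply(days, read_day))
plot(week$delta_D_H, type = "l",
     xlab = "observation", ylab = "delta_D_H")
```

For the last seven days, days could be built from Sys.Date() in the same way, for example format(Sys.Date() - 7:1, "%Y%m%d").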

Thanks again!

Mat


On 5/24/2011 7:15 PM, John Kane wrote:
>  Whoa, more data than I needed.  I called the rdata file from your dput results 'stuff', so any command on stuff refers to that file.
>
>  You say "The structure is kind of strange" and I have to agree with you.  As it stands I cannot get it to do anything.  A str(stuff) command shows that it is a data.frame with 8258 obs. of 38 variables. However, it also reports a variable called X2011.05.20.TIME which is a Factor w/ 114230 levels, and this is patently nonsense.
>
>  It is almost a certainty that something about the code you are using to load the data, or the original structure of the file, is causing the problem.
>
>  Simple commands like:
>  names(stuff)
>  stuff[1,1]
>  stuff[,1]
>  dim(stuff)
>
>  are not working or are returning nonsense.
>
>
>  I took the file, wrote it back out of R as a csv file and read it back in, and I seem to have something I can work with but, of course, that does not mean it looks like your original data.  See my code below.
>
>  Some quick questions
>
>  1. What is the format of the original data files?
>
>  2. What commands are you using to read the data into R?
>  Please supply the code.
>
>  3.  Do the files actually have header names? It looks to me as if the reading in command thinks you have variable names at the top of the column but you don't and so it's using the first row of data as the variable names
>
>  My steps
>  #===================================================================
>  #I took the stuff file and did a write.table on it,
>  # storing the file as a text (or csv) file called mystuff
>  #===================================================
>    write.table(stuff, file="c:/rdata/mystuff.csv",
>       row.names = FALSE, sep=",",
>          col.names=FALSE   )
>  #====================================================
>
>  # I, then, read the data back into a new file "new.data"
>  #====================================================
>  new.data<- read.csv("c:/rdata/mystuff.csv",
>            sep=",", header=FALSE)
>  #====================================================
>
>  #now commands like
>    names(new.data)
>       new.data[,1]
>       dim(new.data)
>
>  # are working the way we would expect.
>  #=========================================================
>
>  --- On Tue, 5/24/11, Mathew Brown<mathew.brown at forst.uni-goettingen.de>   wrote:
>
>>  From: Mathew Brown<mathew.brown at forst.uni-goettingen.de>
>>  Subject: Re: [R] plotting single variables common to multiple data frames
>>  To: "John Kane"<jrkrideau at yahoo.ca>
>>  Cc: r-help at r-project.org
>>  Received: Tuesday, May 24, 2011, 10:38 AM
>>  Here is some data. Only one day as
>>  two days were too big. The structure
>>  is kind of strange and I'm not sure how to 'grab' a single
>>  variable from
>>  it to plot. I would be happy if someone could tell me how
>>  to do that.
>>
>>  Cheers
>>
>>
>>  On 5/24/2011 3:55 PM, John Kane wrote:
>>>  None of the files came through.  The R-help list
>>  routinely strips off all attachments to cut down on the
>>  chance of viruses.
>>>  You could provide the data in the email by using
>>  dput.  Type ?dput in the R console to get the help
>>  page.
>>>  It is not clear from what you write whether you want a
>>  line with one set of data consisting of all the data from
>>  the seven files or if you want 7 lines (dots whatever) one
>>  for each day.
>>>  A brute force way for the first approach is to combine
>>  all the data.frames using rbind, and plot from there.
>>>  Example
>>>  =====================================================
>>>  xx<- data.frame(aa=1:10, bb = letters[1:10])
>>>  yy<- data.frame(aa=11:20,  bb =
>>  letters[11:20])
>>>  zz<- data.frame(aa=21:30, bb= sample(letters[1:
>>  26], 10))
>>>  df1<- rbind(xx,yy,zz)
>>>  plot(df1$aa)
>>>
>>>
>>>
>>  ======================================================
>>>  If you want 7 sets someone else probably has a simple
>>  solution. Without your sample data it's a bit hard to
>>  guess.
>>>
>>>
>>>
>>>
>>>
>>>
>>>  --- On Tue, 5/24/11, Mathew Brown<mathew.brown at forst.uni-goettingen.de>
>>  wrote:
>>>>  From: Mathew Brown<mathew.brown at forst.uni-goettingen.de>
>>>>  Subject: [R] plotting single variables common to
>>  multiple data frames
>>>>  To: r-help at r-project.org
>>>>  Received: Tuesday, May 24, 2011, 8:55 AM
>>>>
>>>>
>>>>
>>>>  Hello all,
>>>>
>>>>  I have files (see attached) which are created
>>  daily. I want
>>>>  to load
>>>>  about a week's worth of them (7 daily files) and
>>  plot a
>>>>  week's worth of
>>>>  one variable together. So one variable name is
>>  delta_D_H. I
>>>>  would like
>>>>  to plot this variable from all 7 days on one plot.
>>  I'm
>>>>  having trouble
>>>>  figuring out how to do this.
>>>>
>>>>  I've loaded them all up using this
>>>>
>>>>  time=Sys.time()
>>>>  t1<- as.numeric(format.Date(time, "%Y%m%d"))
>>>>  #date range of data to load
>>>>  paluiso= c(); yy = c();
>>>>  t1=t1-1
>>>>  t0<-t1-7
>>>>  x = t0:t1
>>>>  for (i in seq( length(x) ) ) {
>>>>
>>>>  y=load(paste(datalocation,x[i],".RData", sep=""))
>>>>           e3<- new.env()
>>>>        yy[[i]]<- get('isot',
>>  e3)
>>>>  }
>>>>
>>>>  but I don't know how to grab single variables from
>>  these
>>>>  data frames and
>>>>  plot them.
>>>>
>>>>  Any help is appreciated!
>>>>
>>>>  M


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: HBDS2137-20110517-013430-DataLog_User.dat
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110525/7b62b246/attachment.pl>

