[R] reorganizing a data frame

Jeff Miller jdm at xnet.com
Tue Jul 11 00:58:32 CEST 2000


    David,

    Thanks for this info. David James was kind enough to send me the same solution in a private mail.
    I hadn't realized that you can index into an array with a matrix, but I see 
    now that this is a very useful tool in R and S.  

    Duncan Murdoch sent me a more concise solution:

       tapply(oldstock$close,list(oldstock$date,oldstock$ticker),mean)

    which works well if the data frame "oldstock" has fewer than, say, 10,000 rows,
    but which starts to bog down considerably for larger inputs.

    The solution that you and David James sent is still quite fast for data frames
    with 300,000 rows.
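
As a rough check that the two approaches agree, the sketch below runs both on a tiny made-up data frame. The values and the names via_tapply and via_index are invented for illustration and are not from the thread. It assumes each (date, ticker) pair appears at most once; with duplicates, tapply(..., mean) would average them, whereas the matrix assignment would keep only the last value.

```r
# Toy data (made up): one close per (date, ticker) pair.
oldstock <- data.frame(
  date   = c(1, 1, 2, 2, 3),
  ticker = c("AAA", "BBB", "AAA", "CCC", "BBB"),
  close  = c(10.5, 20.25, 11, 30.75, 21),
  stringsAsFactors = FALSE
)

# Approach 1: tapply over the two grouping variables.
via_tapply <- tapply(oldstock$close,
                     list(oldstock$date, oldstock$ticker), mean)

# Approach 2: preallocate a matrix and fill it by matrix indexing.
dates   <- sort(unique(oldstock$date))
tickers <- sort(unique(oldstock$ticker))
via_index <- matrix(NA_real_, length(dates), length(tickers),
                    dimnames = list(as.character(dates), tickers))
via_index[cbind(match(oldstock$date, dates),
                match(oldstock$ticker, tickers))] <- oldstock$close

# Compare the values only, ignoring dimnames.
print(all.equal(unname(via_tapply), unname(via_index)))
```

Wrapping each approach in system.time() on a larger data frame is one way to see how they scale on a given machine; timings on a current R version may differ from the ones described above.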

    Thanks again to everyone for their insights. 

        Jeff Miller


  ----- Original Message ----- 
  From: Brahm, David 
  To: 'Jeff Miller' ; r-help at stat.math.ethz.ch 
  Sent: Monday, July 10, 2000 11:44 AM
  Subject: RE: [R] reorganizing a data frame


  Jeff Miller wants to turn a data frame (stockdata) containing date, ticker, and close into a matrix (closedata).  Here's how I'd do it in S-Plus (sorry, I haven't tried this in R):
   
  dates <- sort(unique(stockdata$date))
  tickers <- sort(unique(stockdata$ticker))
  closedata <- matrix(NA, length(dates), length(tickers), dimnames=list(as.character(dates), tickers))
  idx <- cbind(match(stockdata$date, dates), match(stockdata$ticker, tickers))
  closedata[idx] <- stockdata$close
   
  The key here is knowing that you can index into a matrix (closedata) with an Nx2 matrix (idx), each row of which represents one element's coordinates.  This method is especially efficient if your matrix "closedata" is sparse.
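
For anyone trying this in R, here is a minimal sketch of the same steps on a small made-up data set, printing idx so the coordinate pairs are visible. The dates, tickers, and prices are invented for illustration; only the variable names follow the code above.

```r
# Toy long-format data (made up): one row per (date, ticker) observation.
stockdata <- data.frame(
  date   = c(20000703, 20000703, 20000705, 20000705, 20000706),
  ticker = c("AAA", "BBB", "AAA", "CCC", "BBB"),
  close  = c(10.5, 20.25, 11, 30.75, 21),
  stringsAsFactors = FALSE
)

dates   <- sort(unique(stockdata$date))
tickers <- sort(unique(stockdata$ticker))

# Start from an all-NA matrix: one row per date, one column per ticker.
closedata <- matrix(NA_real_, length(dates), length(tickers),
                    dimnames = list(as.character(dates), tickers))

# Each row of idx is the (row, column) position of one observation.
idx <- cbind(match(stockdata$date, dates),
             match(stockdata$ticker, tickers))

# Indexing with a two-column matrix assigns one element per row of idx.
closedata[idx] <- stockdata$close

print(idx)
print(closedata)
```

Cells with no matching (date, ticker) row stay NA, which is why only the observed entries need to be touched.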
   
  P.S. The "as.character" is there because S-Plus 5.1 allows for non-character dimnames, which seems foolish to me, and I use numbers for dates.
  -- David Brahm 
      Fidelity Investments 
      (617)563-7438 
