[R] Problems with Merge

Patnaik, Tirthankar tirthankar.patnaik at citi.com
Tue Jun 5 15:02:54 CEST 2007


Prof. Ripley,
	Date is a column in a matrix, which should be ok for the merge.
Actually the IMHO problem was with the default sorting option, I set
sort=FALSE, and the merge happened as it should. 

Best,
-Tir

Tirthankar Patnaik
Analyst, India Strategy
Citigroup Investment Research
+91-22-6631 9887


P.S.  Agree with you on the trailer of the message--I had to post the
message in a hurry, though! 

> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] 
> Sent: Tuesday, June 05, 2007 6:08 PM
> To: Patnaik, Tirthankar [GWM-CIR]
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Problems with Merge
> 
> Take a look at the help for merge(): in all the examples by.x 
> is a character string, not a one-column data frame which is 
> what rhistory["Date"] would appear to be.
> 
> Please note the trailer of this messsage.
> 
> On Tue, 5 Jun 2007, Patnaik, Tirthankar  wrote:
> 
> > Hi,
> > I have a history dataset, a matrix with about 1590 obs, and 
> 242 cols, 
> > and I need to update this matrix with an 'update' matrix that has 
> > about 30 rows, and roughly similar number of columns as the 
> history ds 
> > (but not necessarily equal). The update dataset is read 
> from an Excel 
> > ODBC connection. When I try and merge these datasets, I get 
> > counter-intuitive results.
> >
> > library(RODBC)
> >   chn <- odbcConnectExcel(UpdateFile)
> >   sqlTables(chn)
> >   UpdateData <- sqlFetch(chn,"MCap243")
> >   colnames(UpdateData) <- gsub("#",".",colnames(UpdateData))
> >   close(chn)
> >   # specify just how many rows we need from the Update 
> file. We'd only 
> > read five
> >   # rows at a time.
> >   UpdateRows = 20
> >   UpdateData <- UpdateData[1:UpdateRows,]
> >
> >   # Delete Unwanted stocks.
> >   UpdateData <- UpdateData[,!names(UpdateData) %in% ToBeDeleted]
> >   x <- tail(UpdateData[c("Date","ABAN.BO")],n=50)
> >   print(x)
> >
> >
> > Gives x as:
> >
> >         Date  ABAN.BO
> > 1  2007-04-30 96448.40
> > 2  2007-05-01 96448.40
> > 3  2007-05-02 96448.40
> > 4  2007-05-03 96300.44
> > 5  2007-05-04 93718.52
> > 6  2007-05-05 93718.52
> > 7  2007-05-06 93718.52
> > 8  2007-05-07 92743.82
> > 9  2007-05-08 90374.60
> > 10 2007-05-09 89126.18
> > 11 2007-05-10 87082.47
> > 12 2007-05-11 85493.73
> > 13 2007-05-12 85493.73
> > 14 2007-05-13 85493.73
> > 15 2007-05-14 85033.21
> > 16 2007-05-15 89209.41
> > 17 2007-05-16 89089.19
> > 18 2007-05-17 90472.62
> > 19 2007-05-18 90326.51
> > 20 2007-05-19 90326.51
> >
> > But when I merge this file with the history dataset, I get 
> the dates 
> > misaligned by one row.
> >
> > whistory <- 
> merge(rhistory,UpdateData,by.x=rhistory["Date"],all=TRUE)
> >
> > tail(WHist4[c("Date","ABAN.BO")],n=30)
> >
> >          Date  ABAN.BO
> > 1581 2007-04-19 83632.60
> > 1582 2007-04-20 85942.00
> > 1583 2007-04-23 88244.00
> > 1584 2007-04-24 90309.50
> > 1585 2007-04-25 92048.00
> > 1586 2007-04-26 92051.70
> > 1587 2007-04-27 95863.10
> > 1588 2007-04-29 96448.40
> > 1589 2007-04-30 96448.40
> > 1590 2007-04-30 96343.40
> > 1591 2007-05-01 96448.40
> > 1592 2007-05-02 96300.44
> > 1593 2007-05-03 93718.52
> > 1594 2007-05-03 96195.60
> > 1595 2007-05-04 93718.52
> > 1596 2007-05-04 93616.50
> > 1597 2007-05-05 93718.52
> > 1598 2007-05-06 92743.82
> > 1599 2007-05-07 90374.60
> > 1600 2007-05-08 89126.18
> > 1601 2007-05-09 87082.47
> > 1602 2007-05-10 85493.73
> > 1603 2007-05-11 85493.73
> > 1604 2007-05-12 85493.73
> > 1605 2007-05-13 85033.21
> > 1606 2007-05-14 89209.41
> > 1607 2007-05-15 89089.19
> > 1608 2007-05-16 90472.62
> > 1609 2007-05-17 90326.51
> > 1610 2007-05-18 90326.51
> >
> > Any reasons why the dates are shifted by one date? Am I 
> missing some 
> > parameters in the merge statement?
> >
> > TIA and best,
> > -Tir
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>



More information about the R-help mailing list