[R] Problems with Merge

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Jun 5 14:38:24 CEST 2007


Take a look at the help for merge(): in all the examples by.x is a 
character string, not a one-column data frame which is what 
rhistory["Date"] would appear to be.

Please note the trailer of this messsage.

On Tue, 5 Jun 2007, Patnaik, Tirthankar  wrote:

> Hi,
> I have a history dataset, a matrix with about 1590 obs, and 242 cols,
> and I need to update this matrix with an 'update' matrix that has about
> 30 rows, and roughly similar number of columns as the history ds (but
> not necessarily equal). The update dataset is read from an Excel ODBC
> connection. When I try and merge these datasets, I get counter-intuitive
> results.
>
> library(RODBC)
>   chn <- odbcConnectExcel(UpdateFile)
>   sqlTables(chn)
>   UpdateData <- sqlFetch(chn,"MCap243")
>   colnames(UpdateData) <- gsub("#",".",colnames(UpdateData))
>   close(chn)
>   # specify just how many rows we need from the Update file. We'd only
> read five
>   # rows at a time.
>   UpdateRows = 20
>   UpdateData <- UpdateData[1:UpdateRows,]
>
>   # Delete Unwanted stocks.
>   UpdateData <- UpdateData[,!names(UpdateData) %in% ToBeDeleted]
>   x <- tail(UpdateData[c("Date","ABAN.BO")],n=50)
>   print(x)
>
>
> Gives x as:
>
>         Date  ABAN.BO
> 1  2007-04-30 96448.40
> 2  2007-05-01 96448.40
> 3  2007-05-02 96448.40
> 4  2007-05-03 96300.44
> 5  2007-05-04 93718.52
> 6  2007-05-05 93718.52
> 7  2007-05-06 93718.52
> 8  2007-05-07 92743.82
> 9  2007-05-08 90374.60
> 10 2007-05-09 89126.18
> 11 2007-05-10 87082.47
> 12 2007-05-11 85493.73
> 13 2007-05-12 85493.73
> 14 2007-05-13 85493.73
> 15 2007-05-14 85033.21
> 16 2007-05-15 89209.41
> 17 2007-05-16 89089.19
> 18 2007-05-17 90472.62
> 19 2007-05-18 90326.51
> 20 2007-05-19 90326.51
>
> But when I merge this file with the history dataset, I get the dates
> misaligned by one row.
>
> whistory <- merge(rhistory,UpdateData,by.x=rhistory["Date"],all=TRUE)
>
> tail(WHist4[c("Date","ABAN.BO")],n=30)
>
>          Date  ABAN.BO
> 1581 2007-04-19 83632.60
> 1582 2007-04-20 85942.00
> 1583 2007-04-23 88244.00
> 1584 2007-04-24 90309.50
> 1585 2007-04-25 92048.00
> 1586 2007-04-26 92051.70
> 1587 2007-04-27 95863.10
> 1588 2007-04-29 96448.40
> 1589 2007-04-30 96448.40
> 1590 2007-04-30 96343.40
> 1591 2007-05-01 96448.40
> 1592 2007-05-02 96300.44
> 1593 2007-05-03 93718.52
> 1594 2007-05-03 96195.60
> 1595 2007-05-04 93718.52
> 1596 2007-05-04 93616.50
> 1597 2007-05-05 93718.52
> 1598 2007-05-06 92743.82
> 1599 2007-05-07 90374.60
> 1600 2007-05-08 89126.18
> 1601 2007-05-09 87082.47
> 1602 2007-05-10 85493.73
> 1603 2007-05-11 85493.73
> 1604 2007-05-12 85493.73
> 1605 2007-05-13 85033.21
> 1606 2007-05-14 89209.41
> 1607 2007-05-15 89089.19
> 1608 2007-05-16 90472.62
> 1609 2007-05-17 90326.51
> 1610 2007-05-18 90326.51
>
> Any reasons why the dates are shifted by one date? Am I missing some
> parameters in the merge statement?
>
> TIA and best,
> -Tir
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list