[R] Finding the rows with the duplicated index

Santosh Srinivas santosh.srinivas at gmail.com
Tue Nov 23 17:04:11 CET 2010


Hello Group,

I have a huge time series dataset (sample below). I am trying to read it
into a zoo object, using columns 1:6 as the index. zoo issues a warning
that some of the rows have a duplicated index.

dput(z)
structure(list(TrdTimestamp = structure(list(sec = c(19, 19, 
18, 10, 12, 43, 41, 59, 40, 29), min = c(58L, 57L, 39L, 37L, 
4L, 5L, 26L, 45L, 24L, 16L), hour = c(11L, 12L, 10L, 12L, 14L, 
14L, 15L, 14L, 11L, 11L), mday = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), mon = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    year = c(109L, 109L, 109L, 109L, 109L, 109L, 109L, 109L, 
    109L, 109L), wday = c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    4L), yday = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), isdst = c(0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("sec", "min", 
"hour", "mday", "mon", "year", "wday", "yday", "isdst"), class = c("POSIXt",
"POSIXlt")), Ticker = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = "NIFTY", class = "factor"), InstTyp = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("FUTIDX", "OPTIDX"
), class = "factor"), ExpDt = c(20090129L, 20090129L, 20090129L, 
20090129L, 20090129L, 20090129L, 20090129L, 20090129L, 20090129L, 
20090129L), OptTyp = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("CE", "FF", "PE"), class = "factor"), 
    Strike = c(2700L, 2700L, 2700L, 2700L, 2700L, 2700L, 2700L, 
    2700L, 2700L, 2700L), TrdPrice = c(347.4, 340, 334.95, 335.5, 
    349.95, 353, 380, 378.1, 340.25, 339), TrdQty = c(50L, 50L, 
    50L, 50L, 50L, 50L, 50L, 50L, 50L, 50L)), .Names = c("TrdTimestamp", 
"Ticker", "InstTyp", "ExpDt", "OptTyp", "Strike", "TrdPrice", 
"TrdQty"), row.names = c(NA, 10L), class = "data.frame")



z should ideally have a unique index based on columns 1:6, but it looks like
some index values are duplicated.
I want the count for each unique combination of z[,1:6], i.e. the
number of rows that share that combination.

I can put the unique index sets into a different object and do a row count
using apply, but I was wondering whether there is already an easier way?
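
To make the request concrete, here is a minimal base-R sketch of the kind of
count I am after. It is only an illustration under assumptions: the small
data frame is a hypothetical stand-in for z (it is not the real data), the
timestamp is stored as POSIXct rather than the POSIXlt shown in the dput
output, and key_cols stands in for columns 1:6.

```r
# Hypothetical miniature of z: a few key columns plus a price.
# Rows 1 and 2 deliberately share the same key.
z <- data.frame(
  TrdTimestamp = as.POSIXct(c("2009-01-01 11:58:19", "2009-01-01 11:58:19",
                              "2009-01-01 12:57:19", "2009-01-01 10:39:18"),
                            tz = "UTC"),
  Ticker   = "NIFTY",
  Strike   = c(2700L, 2700L, 2700L, 2800L),
  TrdPrice = c(347.4, 340, 334.95, 335.5)
)

# Assumption: these columns play the role of z[, 1:6] in the real data.
key_cols <- c("TrdTimestamp", "Ticker", "Strike")

# One row per unique key combination, with n = number of rows sharing it.
counts <- aggregate(list(n = rep(1L, nrow(z))), by = z[key_cols], FUN = sum)

# Key combinations that occur more than once.
dup_keys <- counts[counts$n > 1, ]

# All rows of z whose key is duplicated, first occurrences included.
is_dup   <- duplicated(z[key_cols]) |
            duplicated(z[key_cols], fromLast = TRUE)
dup_rows <- z[is_dup, ]
```

With the real data it would probably be safer to convert TrdTimestamp with
as.POSIXct() first, since a POSIXlt column is internally a list of components.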

Thank you.


