[R] R: Securities earning covariance

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jun 6 15:05:56 CEST 2008


Update your version of zoo to the latest one.
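
A minimal sketch for checking which version is installed before
updating. The helper name installed_version is hypothetical, and the
install.packages() call is left commented out because it needs network
access to a CRAN mirror.

```r
# Hypothetical helper: return the installed version of a package as a
# string, or NA if the package is not installed.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

print(installed_version("zoo"))

# To fetch the current release from CRAN, uncomment:
# install.packages("zoo")
```

Restart R after installing so the new version is the one that gets
loaded.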

On Fri, Jun 6, 2008 at 3:18 AM,  <ANGELO.LINARDI at bancaditalia.it> wrote:
> Thank you for your very fast response.
> I just tried to use the zoo package, after having read the vignettes, but I get this error message:
>
> Warning messages:
> 1: In x$DAY : $ operator is invalid for atomic vectors, returning NULL
> 2: In x$EARNINGS :
>  $ operator is invalid for atomic vectors, returning NULL
> 3: In x$DAY : $ operator is invalid for atomic vectors, returning NULL
> 4: In x$EARNINGS :
>  $ operator is invalid for atomic vectors, returning NULL
> 5: In x$DAY : $ operator is invalid for atomic vectors, returning NULL
> 6: In x$EARNINGS :
>  $ operator is invalid for atomic vectors, returning NULL
>
> Am I missing something ?
>
> Thank you again
>
> Angelo Linardi
>
>
> -----Original Message-----
> From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
> Sent: Thursday, June 5, 2008 17:55
> To: LINARDI ANGELO
> Cc: r-help at r-project.org
> Subject: Re: [R] Securities earning covariance
>
> Check out the three vignettes (i.e. pdf documents in the zoo package). e.g.
>
>
> Lines <- "SEC_ID          DAY             EARNING
> IT0000001       20070101        5.467
> IT0000001       20070102        5.456
> IT0000001       20070103        4.954
> IT0000001       20070104        3.456
> IT0000002       20070101        1.456
> IT0000002       20070102        1.345
> IT0000002       20070103        1.233
> IT0000003       20070101        0.345
> IT0000003       20070102        0.367
> IT0000003       20070103        0.319
> "
> DF <- read.table(textConnection(Lines), header = TRUE)
> DFs <- split(DF, DF$SEC_ID)
>
> library(zoo)
> f <- function(DF.) zoo(DF.$EARNING, as.Date(format(DF.$DAY), "%Y%m%d"))
> z <- do.call(merge, lapply(DFs, f))
> cov(z) # uses n-1
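
For comparison, a base-R sketch of the same idea that does not need
zoo: reshape to one column per security, then call cov() once. This is
an alternative under stated assumptions, not the method above. It
assumes at most one row per SEC_ID and DAY, and it uses
use = "pairwise.complete.obs" because the securities in the sample do
not all cover the same days.

```r
# Sample data copied from the thread.
Lines <- "SEC_ID DAY EARNING
IT0000001 20070101 5.467
IT0000001 20070102 5.456
IT0000001 20070103 4.954
IT0000001 20070104 3.456
IT0000002 20070101 1.456
IT0000002 20070102 1.345
IT0000002 20070103 1.233
IT0000003 20070101 0.345
IT0000003 20070102 0.367
IT0000003 20070103 0.319
"
DF <- read.table(text = Lines, header = TRUE)

# One row per day, one column per security. A day with no observation
# for a security becomes NA.
wide <- tapply(DF$EARNING, list(DF$DAY, DF$SEC_ID), mean)

# Covariance of each pair over the days both securities have data
# (divisor n - 1, as with cov(z) above).
cv <- cov(wide, use = "pairwise.complete.obs")
print(cv)
```

With about 800 securities the wide matrix has roughly one row per day
and 800 columns, and the result is an 800 x 800 matrix, far smaller
than a self-merge of the long table.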
>
>
> On Thu, Jun 5, 2008 at 11:41 AM,  <ANGELO.LINARDI at bancaditalia.it> wrote:
>> Good morning,
>>
>> I am a new R user and I am trying to learn how to use it.
>> I am trying to solve this problem.
>> I have a data frame df of daily securities earnings (for one year), as
>> follows:
>>
>> SEC_ID          DAY             EARNING
>> IT0000001       20070101        5.467
>> IT0000001       20070102        5.456
>> IT0000001       20070103        4.954
>> IT0000001       20070104        3.456
>>                    ..........................
>> IT0000002       20070101        1.456
>> IT0000002       20070102        1.345
>> IT0000002       20070103        1.233
>>                ..........................
>> IT0000003       20070101        0.345
>> IT0000003       20070102        0.367
>> IT0000003       20070103        0.319
>>                ..........................
>>
>> And so on: about 800 different SEC_ID and about 180000 rows.
>> I have to calculate the "covariance" for each pair of securities x
>> and y according to the formula:
>>
>> Cov(x,y) = (sum[(x-x')*(y-y')]/N)/(sx*sy)
>>
>> where x' and y' are the means of each security's earnings over the
>> year, N is the number of observations, and sx and sy are the standard
>> deviations of x and y.
>> To do this I could build a df2 data frame like this:
>>
>> DAY       SEC_ID.x   SEC_ID.y   EARNING.x  EARNING.y  x'  y'  sx  sy
>> 20070101  IT0000001  IT0000002  5.467      1.456      a   b   aa  bb
>> 20070101  IT0000001  IT0000003  5.467      0.345      a   c   aa  cc
>> 20070101  IT0000002  IT0000003  1.456      0.345      b   c   bb  cc
>> 20070102  IT0000001  IT0000002  5.456      1.345      a   b   aa  bb
>> 20070102  IT0000001  IT0000003  5.456      0.367      a   c   aa  cc
>> 20070102  IT0000002  IT0000003  1.345      0.367      b   c   bb  cc
>> ..........................
>>
>> (merging df with itself with a condition SEC_ID.x < SEC_ID.y) and then
>> easily calculate the formula; but the dimensions are too big (the
>> process stops with an out-of-memory message).
>> Besides partitioning the input and using a loop, are there any smarter
>> solutions (possibly using split and other ways of "subgroup merging")
>> to solve the problem?
>> Are there any "shortcuts" using statistical built-in functions (e.g.
>> cov, vcov) ?
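
On the built-in question: the formula quoted above divides the
covariance by both standard deviations, so it is a correlation rather
than a covariance. A small check, assuming sx and sy are the
divide-by-N standard deviations (the post does not say which); the
helper pop_sd is hypothetical and the data are the first three days of
two securities from the sample.

```r
x <- c(5.467, 5.456, 4.954)   # IT0000001, first three days
y <- c(1.456, 1.345, 1.233)   # IT0000002, same days
N <- length(x)

# Hypothetical helper: standard deviation with divisor N.
pop_sd <- function(v) sqrt(mean((v - mean(v))^2))

# The formula from the post, written out term by term.
by_hand <- (sum((x - mean(x)) * (y - mean(y))) / N) / (pop_sd(x) * pop_sd(y))

# Built-in Pearson correlation. The N versus N - 1 divisors cancel in
# the ratio, so this matches by_hand.
built_in <- cor(x, y)

# If sx and sy were instead sd() (divisor N - 1), the formula would
# give the correlation scaled by (N - 1) / N.
with_sample_sd <- (sum((x - mean(x)) * (y - mean(y))) / N) / (sd(x) * sd(y))

print(c(by_hand = by_hand, built_in = built_in, with_sample_sd = with_sample_sd))
```

Under that assumption, calling cor() on a matrix with one column per
security returns all pairs at once, so no pairwise merge is needed.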
>> Thank you in advance
>>
>> Angelo Linardi
>>
>>
>>
>> ** E-mails from the Bank of Italy are sent in good faith but they are
>> neither binding on the Bank nor to be understood as creating any
>> obligation on its part except where provided for in a written
>> agreement. This e-mail is confidential. If you have received it by
>> mistake, please inform the sender by reply e-mail and delete it from
>> your system. Please also note that the unauthorized disclosure or use
>> of the message or any attachments could be an offence. Thank you for
>> your cooperation. **
>>
>>
>


