[R] How to use ddply

Amitabh Dugar cleverchap at yahoo.com
Mon Jan 13 22:29:42 CET 2014


I have never posted a question to R-help before; is sending this email the right way to do so?

I am trying to use the ddply function in the plyr package on a data frame, test, of the following form:

     ticker monthend_n wgtdiff    ret
156      AA   19990228  0.7172  -2.58
545    AAPL   19990228 -0.0828 -15.48
925    ABCW   19990228  0.0966  -7.36
1041   ABFS   19990228  0.1320  -8.89
1165    ABI   19990228  0.2355   4.61
1482    ABS   19990228  0.1668  -6.56
1563    ABT   19990228  0.1650  -0.27
1790   ACAT   19990228  0.1540 -13.82
2498    ACN   19990228  0.0000  12.15
2532    ACO   19990228  0.1320   8.48
2857    ACV   19990228  0.1540  -6.54
2942   ACXM   19990228  0.0000  -6.13
3303   ADCT   19990228  0.1035   1.73
3568    ADM   19990228  0.1540   0.33
4072   ADSK   19990228 -0.1035  -9.19
4672    AEH   19990228  0.1650     NA
4673   AEIC   19990228  0.1314  -6.95
4867    AEP   19990228  0.1540  -3.62
157      AA   19990331  0.1932   1.70
546    AAPL   19990331  0.0330   3.23
1005    ABF   19990331  0.1540 -20.51
1166    ABI   19990331  0.2860   8.33
1255    ABK   19990331  0.0966  -3.57
1483    ABS   19990331  0.0000  -4.50
1564    ABT   19990331  0.3955   1.08
1733    ABX   19990331  0.2340  -3.53
2533    ACO   19990331  0.0966   5.26
3304   ADCT   19990331  0.2925  17.75
3418    ADI   19990331  0.2688  18.70
3724    ADP   19990331  0.1540 -38.43
4514    AEE   19990331  0.1540  -1.31
4868    AEP   19990331 -0.0966  -4.65

I am trying to generate quintile cutoff points across the distribution of tickers for every month, using the command:
> result <- ddply(test, .(monthend_n), .fun=cut, test$wgtdiff,5)

I get the message:
Error in cut.default(piece, ...) : 'x' must be numeric

I also tried splitting the data into a list of monthly data frames, extracting the wgtdiff column from each, and passing that to the cut function, but that did not work either:
pieces <- split(test,test$monthend_n)
vectors<- lapply(pieces,"[[","wgtdiff")
quintiles <- lapply(vectors,cut(vectors[1:2],5))
Error in cut.default(vectors[1:2], 5) : 'x' must be numeric

However, the cut function does the job correctly when I pass it only an individual month's data, as below:
first <- pieces[[1]]
quintiles <- cut(first$wgtdiff,5)
levels(quintiles)

What is the correct way to solve this problem?

Thanks for your help, everyone!
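
For reference, one possible way to extend the single-month call above to every month, sketched in base R under stated assumptions. Both errors seem to come from what cut() actually receives: ddply hands .fun each monthly piece as a whole data frame, and in the lapply attempt cut(vectors[1:2], 5) is evaluated on a list before lapply ever runs. The small data frame below is a hypothetical stand-in for test built from a few of the rows shown earlier, and the names bins, cutoffs and bin are illustrative only.

```r
# Hypothetical stand-in for `test`, using a few rows from the listing above.
test <- data.frame(
  ticker     = c("AA", "AAPL", "ABCW", "ABFS", "ABI",
                 "AA", "AAPL", "ABF", "ABI", "ABK"),
  monthend_n = rep(c(19990228, 19990331), each = 5),
  wgtdiff    = c(0.7172, -0.0828, 0.0966, 0.1320, 0.2355,
                 0.1932, 0.0330, 0.1540, 0.2860, 0.0966),
  stringsAsFactors = FALSE
)

# Split the numeric column by month: a named list of numeric vectors.
vectors <- split(test$wgtdiff, test$monthend_n)

# Pass the function itself to lapply; extra arguments come after it.
# cut(x, breaks = 5) builds five equal-width intervals per month.
bins <- lapply(vectors, cut, breaks = 5)

# Equal-width bins are not quintiles. If equal-count cutoffs are wanted,
# quantile() gives the 0%, 20%, ..., 100% points for each month.
cutoffs <- lapply(vectors, quantile, probs = seq(0, 1, by = 0.2), na.rm = TRUE)

# Attach the per-month bin number (1 to 5) to each row of the data frame.
test$bin <- ave(test$wgtdiff, test$monthend_n,
                FUN = function(x) as.integer(cut(x, breaks = 5)))
```

levels(bins[[1]]) then lists the interval labels for the first month, as in the single-month example. With plyr, a call along the lines of ddply(test, .(monthend_n), transform, bin = as.integer(cut(wgtdiff, 5))) should be the per-month equivalent, since transform evaluates wgtdiff inside each piece; that line is an untested suggestion rather than part of the sketch.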



