[R] Best and Worst values

arun smartpink111 at yahoo.com
Fri Sep 27 15:37:38 CEST 2013






Ira,
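## load() returns the names of the restored objects; the first is taken to be the predictions, the second the actual values
obj_name<- load("arun.RData")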
Pred1<- get(obj_name[1])
Actual1<- get(obj_name[2])

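## Stack the wide Pred1/Actual1 tables into one long data frame:
## one row per (S1, variable) pair with its predicted and actual value
dat2<- data.frame(S1=rep(Pred1[,1],ncol(Pred1)-1),
                  variable=rep(colnames(Pred1)[-1],each=nrow(Pred1)),
                  Predict=unlist(Pred1[,-1],use.names=FALSE),
                  Actual=unlist(Actual1[,-1],use.names=FALSE),
                  stringsAsFactors=FALSE)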

dat2New<- dat2[!(is.na(dat2$Predict)|is.na(dat2$Actual)),]
 dat3<- dat2New[order(dat2New$S1,dat2New$Predict),]

library(plyr)

resLow<-ddply(dat3,.(S1),summarize, cbind(head(Predict,5),head(Actual,5)))
resHigh<-ddply(dat3,.(S1),summarize, cbind(head(rev(Predict),5),head(rev(Actual),5)))
 resLow1<-data.frame(Date=resLow[,1],Predict=resLow[,2][,1],Actual=resLow[,2][,2])
 resHigh1<-data.frame(Date=resHigh[,1],Predict=resHigh[,2][,1],Actual=resHigh[,2][,2])
 resHigh1$id<- 1:nrow(resHigh1)
 resLow1$id<- 1:nrow(resLow1)
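resLow2<-resLow1[resLow1[,2]<0,]   ## among the 5 lowest, keep only negative predictions
resHigh2<- resHigh1[resHigh1[,2]>0,]   ## among the 5 highest, keep only positive predictions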
resFinal<- merge(resLow2,resHigh2,by=c("Date","id"),all=TRUE) 


resNew<- as.data.frame(matrix(0,nrow(resFinal)*2,3))
resNew[,1]<-rep(resFinal$Date,each=2)

### The indexing used here is not essential. You could instead use melt() or reshape() (see ?melt, ?reshape)
### to go from wide to long format; ddply() will then arrange the data by group automatically.
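
For the reshape() route mentioned in the comment above, here is a minimal sketch on made-up data. The data frame `wide` and all of its values are invented; only the column layout (Date, id, Predict.x, Actual.x, Predict.y, Actual.y) mirrors what merge() gives for resFinal with its default suffixes. It uses base R only, with split() and sapply() standing in for ddply().

```r
## Toy stand-in for resFinal (values are invented for illustration).
wide <- data.frame(
  Date      = rep(c("2006-01-03", "2006-01-04"), each = 3),
  id        = 1:6,
  Predict.x = c(-3, -2, -1, -6, -5, -4),   # lowest predictions
  Actual.x  = c(-1, -2,  0, -3, -1, -2),
  Predict.y = c( 3,  2,  1,  6,  5,  4),   # highest predictions
  Actual.y  = c( 2,  1,  1,  4,  5,  2),
  stringsAsFactors = FALSE
)

## Wide -> long: the .x and .y pairs are stacked into Predict/Actual.
long <- reshape(wide, direction = "long",
                varying = list(c("Predict.x", "Predict.y"),
                               c("Actual.x",  "Actual.y")),
                v.names = c("Predict", "Actual"),
                timevar = "side", times = c("low", "high"),
                idvar   = c("Date", "id"))

## One correlation per Date over all of its stacked pairs.
corByDate <- sapply(split(long, long$Date),
                    function(d) cor(d$Predict, d$Actual, use = "complete.obs"))
```

With the data in this shape, no index matrices are needed, and the hard-coded row counts go away.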


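## Columns 5 and 3 are the Predict columns in resFinal (Predict.y = highest, Predict.x = lowest).
## The constants 250, 100 and 5 fit this data set: nrow(resFinal) is 250, i.e. 50 dates with 5 rows each.
indx<-cbind(rep(seq_len(nrow(resFinal)),2),rep(c(5,3),each=250))
## indx2 is a sort key that interleaves, date by date, the 5 highest and the 5 lowest pairs
indx2<-c(rep(seq(1,100,by=2),each=5),rep(seq(2,100,by=2),each=5))
indx3<- indx[order(indx2),]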
resNew[,2]<-as.numeric(resFinal[indx3])

indx1<-cbind(rep(seq_len(nrow(resFinal)),2),rep(c(6,4),each=250)) ## columns 6 and 4 are the Actual columns in resFinal (Actual.y, Actual.x)
indx4<- indx1[order(indx2),]
resNew[,3]<-as.numeric(resFinal[indx4])
colnames(resNew)<- c("Date","Predict","Actual")
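
The two-column index matrices used above pick individual cells of resFinal as (row, column) pairs. A minimal sketch on an invented data frame `df` shows the mechanics, and why as.numeric() is needed: when the data frame contains a character column, indexing by a matrix goes through as.matrix(), so the values come back as character strings.

```r
## Invented example data; df stands in for resFinal.
df <- data.frame(Date = c("a", "b"),
                 x    = c(1.5, 2.5),
                 y    = c(10, 20),
                 stringsAsFactors = FALSE)

## Each row of idx is one (row, column) pair to extract.
idx <- cbind(c(1, 2, 1),   # row numbers
             c(2, 3, 3))   # column numbers: 2 = x, 3 = y

vals <- df[idx]            # character, because Date is character
nums <- as.numeric(vals)   # back to numeric: 1.5 20 10
```

Because the numbers pass through a character representation on the way, it may be worth checking on your own data that no precision is lost in the round trip.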


CorRes<-ddply(resNew,.(Date),summarize,Correl=cor(Predict,Actual,use="complete.obs"))

 head(CorRes)
#        Date    Correl
#1 2006-01-03 0.7079585
#2 2006-01-04 0.6537652
#3 2006-01-05 0.6397637
#4 2006-01-06 0.7448979
#5 2006-01-09 0.7325796
#6 2006-01-10 0.6283132
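
If the longs and shorts are later to be correlated separately rather than pooled, one possible sketch is to group by both the date and a side label. The data frame `stacked`, its `side` column, and the mapping of "long" to the highest and "short" to the lowest predictions are assumptions made for illustration; the numbers are invented.

```r
## Invented data: 3 short and 3 long pairs for each of two dates.
stacked <- data.frame(
  Date    = rep(c("2006-01-03", "2006-01-04"), each = 6),
  side    = rep(rep(c("short", "long"), each = 3), times = 2),
  Predict = c(-3, -2, -1, 1, 2, 3, -6, -5, -4, 4, 5, 6),
  Actual  = c(-1, -2,  0, 1, 1, 2, -3, -1, -2, 2, 5, 4),
  stringsAsFactors = FALSE
)

## One group per Date/side combination, then one correlation per group.
groups    <- split(stacked, list(stacked$Date, stacked$side), drop = TRUE)
corBySide <- sapply(groups, function(d) cor(d$Predict, d$Actual,
                                            use = "complete.obs"))
```

Splitting on stacked$Date alone gives the pooled, one-correlation-per-day result again.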




Arun

________________________________
From: Ira Sharenow <irasharenow100 at yahoo.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Thursday, September 26, 2013 11:22 PM
Subject: Re: Update, September 26, 2013



Arun,


I may want to separate the longs and do a correlation and separate the shorts and do a correlation, but the more likely scenario is to have the (possibly) 10 pairs of values per day all as part of a single correlation.

Thanks.

Ira


