[R] LDA Predict - Seems to be predicting on the Training Data

BostonR dpope at capitaliq.com
Tue Oct 20 16:31:32 CEST 2009


When I import a simple dataset, run LDA, and then use the fitted model to
predict on out-of-sample data, I get predictions for the training set rather
than for the out-of-sample set.  Others have posted this question, but I do
not see any answers to their posts.

Here is some sample data:

Date	Names	v1	v2	v3	c1
1/31/2009	Name1	0.714472361	0.902552278	0.783353694	a
1/31/2009	Name2	0.512158919	0.770451596	0.111853346	a
1/31/2009	Name3	0.470693282	0.129200065	0.800973877	a
1/31/2009	Name4	0.24236898	0.472219638	0.486599763	b
1/31/2009	Name5	0.785619735	0.628511593	0.106868172	b
1/31/2009	Name6	0.718718387	0.697257275	0.690326648	b
1/31/2009	Name7	0.327331186	0.01715109	0.861421706	c
1/31/2009	Name8	0.632011743	0.599040196	0.320741634	c
1/31/2009	Name9	0.302804404	0.475166304	0.907143632	c
1/31/2009	Name10	0.545284813	0.967196462	0.945163717	a
1/31/2009	Name11	0.563720418	0.024862018	0.970685281	a
1/31/2009	Name12	0.357614427	0.417490445	0.415162276	a
1/31/2009	Name13	0.154971203	0.425227967	0.856866993	b
1/31/2009	Name14	0.935080173	0.488659307	0.194967973	a
1/31/2009	Name15	0.363069339	0.334206603	0.639795596	b
1/31/2009	Name16	0.862889297	0.821752532	0.549552875	a

Here is the code:

library(MASS)  # provides lda()

myDat <- read.csv(file="f:\\Systematiq\\data\\TestData.csv", header=TRUE, sep=",")
myData <- data.frame(myDat)

length(myDat[,1])

train <- myDat[1:10,]
outOfSample <- myDat[11:16,]
outOfSample <- (cbind(outOfSample$v1,outOfSample$v2,outOfSample$v3))
outOfSample <-data.frame(outOfSample)

length(train[,1])
length(outOfSample[,1])

fit <- lda(train$c1~train$v1+train$v2+train$v3)

forecast <- predict(fit,outOfSample)$class

length(forecast)  ##### I am expecting this to be the same as length(outOfSample[,1]), which is 6

Output:

length(forecast)  ##### I am expecting this to be the same as length(outOfSample[,1]), which is 6
[1] 10
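
A likely cause, offered as a sketch rather than a confirmed diagnosis: the
formula names the predictors as train$v1, train$v2 and train$v3, so predict()
evaluates those expressions against the train object itself and does not use
the data frame passed as newdata. The cbind() step also drops the column names
v1, v2, v3, so the out-of-sample frame no longer has columns the model could
match. The version below uses bare column names with a data= argument and keeps
the original column names in outOfSample. The data is the sample from this post
with the Date and Names columns left out for brevity.

```r
library(MASS)  # provides lda()

# Sample data from the post (Date and Names columns omitted).
myDat <- read.table(header = TRUE, text = "
v1 v2 v3 c1
0.714472361 0.902552278 0.783353694 a
0.512158919 0.770451596 0.111853346 a
0.470693282 0.129200065 0.800973877 a
0.24236898 0.472219638 0.486599763 b
0.785619735 0.628511593 0.106868172 b
0.718718387 0.697257275 0.690326648 b
0.327331186 0.01715109 0.861421706 c
0.632011743 0.599040196 0.320741634 c
0.302804404 0.475166304 0.907143632 c
0.545284813 0.967196462 0.945163717 a
0.563720418 0.024862018 0.970685281 a
0.357614427 0.417490445 0.415162276 a
0.154971203 0.425227967 0.856866993 b
0.935080173 0.488659307 0.194967973 a
0.363069339 0.334206603 0.639795596 b
0.862889297 0.821752532 0.549552875 a
")

train <- myDat[1:10, ]
# Subset by column name so the predictors keep the names v1, v2, v3.
outOfSample <- myDat[11:16, c("v1", "v2", "v3")]

# Bare column names plus data= let predict() look the variables up in newdata.
fit <- lda(c1 ~ v1 + v2 + v3, data = train)

forecast <- predict(fit, newdata = outOfSample)$class
length(forecast)  # one prediction per row of outOfSample
```

With the formula written this way, length(forecast) should match
length(outOfSample[,1]) instead of the number of training rows.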





