[R] regarding Fselector package in R

Ranjana Girish ranjanagirish30 at gmail.com
Sat Aug 13 12:38:42 CEST 2016


I need to calculate information gain using the FSelector package for feature
selection to classify documents.
I executed the code below:

library(tm)
library(NLP)
library(FSelector)

# four toy documents
doc <- c("The sky is blue.",
         "The sun is bright today.",
         "The sun in the sky is bright.",
         "We can see the shining sun, the bright sun.")
doc_corpus <- Corpus(VectorSource(doc))
control_list <- list(removePunctuation = TRUE, stopwords = TRUE, tolower = TRUE)
tdm <- TermDocumentMatrix(doc_corpus, control = control_list)
( tf <- as.matrix(tdm) )
tf
# transpose so that rows are documents and columns are terms
tf1 <- t(tf)
tfdataframe <- data.frame(tf1)
tfdataframe
# use the document number as the class label
tfdataframe$doc <- c("1", "2", "3", "4")
tfdataframe
# information gain based on term frequency
infgain <- information.gain(doc ~ ., tfdataframe)
infgain

and I got this output:

> infgain
        attr_importance
blue          0.0000000
bright        0.0000000
can           0.0000000
see           0.0000000
shining       0.0000000
sky           0.6931472
sun           0.0000000
today         0.0000000
>
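
For comparison, here is a minimal base-R sketch of how I understand information gain to be defined, namely H(class) + H(attribute) - H(class, attribute) with natural logarithms. The helpers entropy and info_gain and the toy vectors are my own illustration; they are not part of FSelector, and I am only assuming that information.gain follows the same definition after its own preprocessing of the attributes.

```r
# Hand-rolled information gain in base R (natural log, so values are in nats).
# `entropy` and `info_gain` are illustrative helpers, not FSelector functions.
entropy <- function(x) {
  p <- table(x) / length(x)   # empirical probability of each distinct value
  -sum(p * log(p))
}

info_gain <- function(attr, class) {
  # H(class) + H(attr) - H(class, attr); the joint entropy is taken over
  # the pasted (attr, class) pairs
  entropy(class) + entropy(attr) - entropy(paste(attr, class))
}

# Toy data, unrelated to the corpus above: term presence per document
term_present <- c(1, 1, 0, 0)
label        <- c("a", "a", "b", "b")

ig <- info_gain(term_present, label)
print(ig)
```

In this toy case the term separates the two labels perfectly, so the gain equals the class entropy, log(2), about 0.693. A term that is spread evenly over both labels, for example c(1, 0, 1, 0), gives a gain of 0.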
Is this output logically correct?

I am totally confused!

Could anyone help me, please?

Thanks in advance.
