> Good evening,
> I'm getting a discretization that differs from the one Liu and Setiono
> describe in their papers when I use the Chi2 algorithm for feature
> selection with discretization.
> As stated in the R documentation (discretization - R (from CRAN)
> <https://cran.r-project.org/web/packages/discretization/discretization.pdf>),
> the R package discretization offers the function chi2, which is based on
> the following papers:
> Liu, H. and Setiono, R. (1995). Chi2: Feature selection and discretization
> of numeric attributes, Tools with Artificial Intelligence, 388–391.
> Liu, H. and Setiono, R. (1997). Feature selection and discretization, IEEE
> transactions on knowledge and data engineering, Vol.9, no.4, 642–645.
> I wrote the following R code, in which I set alpha and delta to the values
> used in the papers above; it prints the discretized data frame. I used the
> iris data set, as in one of the examples in the two papers. The first paper
> above states that alpha = 0.5 and delta = 5%, and that "the originally odd
> numbered data are selected for training (75 patterns) and rest for testing
> (75 patterns)". With this setup, the Sepal attributes should be removed.
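> library(discretization)
> data(iris)
> # Keep the odd-numbered rows (75 training patterns)
> df1 <- iris[FALSE, ]
> for (i in 1:nrow(iris)) {
>   if (i %% 2 != 0) {
>     df1 <- rbind(df1, iris[i, ])
>   }
> }
> # Discretize with alpha = 0.5 and delta = 0.05, then print the result
> chi2(df1, alp = 0.5, del = 0.05)$Disc.data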
> The point is that the data frame printed by the last instruction shows that
> no attribute is removed. The discretized data frame still has all 4
> attributes: if I understood the papers above correctly, the Chi2 algorithm
> should have discretized both Sepal Length and Sepal Width into a single
> interval.
> I have posted a question here:
> http://stats.stackexchange.com/questions/247499/why-does-not-r-chi2-algorithm-discretize-in-the-same-manner-as-in-the-paper-by-l?noredirect=1#comment470974_247499
> Moreover, it's really hard to understand the cut points that the Chi2
> algorithm implemented in R produces. For example, after
> res <- chi2(iris, 0.5, 0.05)
> the result of
> cut(iris$Sepal.Length, res$cutp, labels=FALSE)
> is different from
> res$Disc.data$Sepal.Length
> Please help me understand.
> Best regards
