[Rd] Classification Trees and basic Random Forest pkg using tree structures in C

Liaw, Andy andy_liaw at merck.com
Fri Nov 4 20:00:03 CET 2005


> From: Hin-Tak Leung
> 
> Izmirlian, Grant (NIH/NCI) wrote:
> <snipped>
> > The only interesting feature is that the tree structure has been
> > implemented in C. It's a neater way to carry stuff around and I am
> > guessing would make future implementation easier.
> > 
> > Because of its inherent redundancy from the user's standpoint, it
> > isn't something to send to CRAN. However, I was wondering whether
> > anyone is interested in a copy?
> 
> Hi,
> 
> Hmm, why didn't you just post a URL?

Isn't it a bit too much to assume that everyone has a personal web space
somewhere?

> Incidentally I am actually very
> interested in seeing your code. I am working on a project where
> the data set is extremely large, but the permutation of the states of
> the data is extremely small. Each piece of data consists of only 4
> states, so stuffing it as an R object (which takes up 32 bytes? on
> 32-bit machines) or even a char vector is quite wasteful; so I
> have written a "strange" data.frame where internally it uses only
> 2 bits for storage. (It is still a work in progress, but I have got to
> the point of being able to get and set each 2-bit cell now.)
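
The 2-bit storage described above might be sketched in C roughly as follows. This is only an illustration under stated assumptions, not Hin-Tak's actual code: the type twobit_vec and the functions twobit_init, twobit_get, twobit_set and twobit_free are hypothetical names, and the layout (four cells per byte, lowest bits first) is one arbitrary choice among several.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical packed vector: four 2-bit cells per byte. */
typedef struct {
    size_t   n;     /* number of 2-bit cells */
    uint8_t *bits;  /* packed storage, n / 4 + 1 bytes */
} twobit_vec;

/* Allocate a zero-initialised vector of n cells; returns 0 on success. */
int twobit_init(twobit_vec *v, size_t n)
{
    v->n = n;
    v->bits = calloc(n / 4 + 1, 1);  /* always at least one byte */
    return v->bits != NULL ? 0 : -1;
}

/* Read cell i: pick the byte, shift the cell down, keep two bits. */
unsigned twobit_get(const twobit_vec *v, size_t i)
{
    assert(i < v->n);
    return (v->bits[i / 4] >> (2 * (i % 4))) & 3u;
}

/* Write cell i: clear its two bits, then OR in the new state (0..3). */
void twobit_set(twobit_vec *v, size_t i, unsigned state)
{
    assert(i < v->n && state < 4);
    unsigned shift = 2 * (unsigned)(i % 4);
    uint8_t  mask  = (uint8_t)(3u << shift);
    v->bits[i / 4] = (uint8_t)((v->bits[i / 4] & ~mask) | (state << shift));
}

/* Release the storage. */
void twobit_free(twobit_vec *v)
{
    free(v->bits);
    v->bits = NULL;
    v->n = 0;
}
```

With this layout one million cells need about 250,000 bytes. Exposing such a buffer to R (for example behind an external pointer) is a separate problem that this sketch does not address.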

For some of the data we encounter, all X variables are binary, so each data
point can be encoded into a bitstring.  There are algorithms that take
advantage of that.  The problem is interfacing such code with R.  I know of
no good solutions.  As I told Grant, I thought about what he did, too, but
the difficulty is how to pass such data structures to R.  Actually, some
time down the road I might try to use the dendrogram class that's in R, and
manipulate them in C.  Not sure about efficiency though. 
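
To make the bitstring idea concrete, here is a minimal C sketch, assuming each observation is a row of 0/1 values. It packs a row into 64-bit words and counts the variables on which two rows differ using XOR plus a population count. The names pack_bits, popcount64 and hamming are made up for illustration, and the sketch is not tied to any particular tree or forest algorithm, nor does it solve the R interfacing problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack p binary values (each 0 or 1) into 64-bit words.
   out must have room for (p + 63) / 64 words. */
void pack_bits(const int *x, size_t p, uint64_t *out)
{
    size_t nwords = (p + 63) / 64;
    for (size_t w = 0; w < nwords; w++)
        out[w] = 0;
    for (size_t j = 0; j < p; j++) {
        assert(x[j] == 0 || x[j] == 1);
        if (x[j])
            out[j / 64] |= (uint64_t)1 << (j % 64);
    }
}

/* Portable population count: clear the lowest set bit until none remain. */
unsigned popcount64(uint64_t w)
{
    unsigned c = 0;
    while (w) {
        w &= w - 1;
        c++;
    }
    return c;
}

/* Number of variables on which two packed rows differ (Hamming distance). */
unsigned hamming(const uint64_t *a, const uint64_t *b, size_t p)
{
    size_t nwords = (p + 63) / 64;
    unsigned d = 0;
    for (size_t w = 0; w < nwords; w++)
        d += popcount64(a[w] ^ b[w]);
    return d;
}
```

Comparing two observations then costs one XOR and one population count per 64 variables, instead of one comparison per variable.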

Andy

 
> Hin-Tak Leung
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
>


