[R] Caret Internal Data Representation

Bert Gunter bgunter.4567 at gmail.com
Thu Nov 5 19:10:57 CET 2015


I am not familiar with caret/Cubist, but assuming they follow the
usual R procedures that encode categorical factors for conditional
fitting, you need to do some homework on your own by reading up on the
use of contrasts in regression.

See ?factor and ?contrasts (and other linked Help as necessary) to see
what R's usual procedures are, but you will undoubtedly need to
consult outside statistical references -- the help files will point
you to some -- to fully understand what's going on. It is not trivial.
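
As a rough illustration of those usual procedures, here is a minimal
sketch in base R only. The data frame `d` and its columns `colour` and
`size` are made up for the example, and it assumes the default
contrasts option (treatment contrasts for unordered factors). Whether
caret or Cubist builds its data this way internally is an assumption
you would need to check against their own documentation.

```r
# Hypothetical toy data: one unordered factor and one numeric predictor
d <- data.frame(
  colour = factor(c("red", "green", "blue", "green")),
  size   = c(1.5, 2.0, 3.2, 0.7)
)

# Contrast matrix R associates with the factor; with the default
# options this is treatment (dummy) coding, and the first level in
# alphabetical order ("blue") is the baseline
print(contrasts(d$colour))

# Design matrix that a formula-based fit would build from the data:
# the factor is expanded into one 0/1 column per non-baseline level
X <- model.matrix(~ colour + size, data = d)
print(X)
```

The column names of `X` are formed by pasting the factor name onto the
level name, which is often how dummy variables show up in the output
of a fitted model. See ?model.matrix for the details.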

Cheers,
Bert
Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll


On Thu, Nov 5, 2015 at 9:38 AM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> Dear All,
> I have a data set which contains both categorical and numerical
> variables which I analyze using Cubist+the caret framework.
> Now, from the generated rules, it is clear that cubist does something
> to the categorical variables and probably uses some dummy coding for
> them.
> However, I cannot right now access the data the way it is transformed
> by cubist.
> If caret (or the package) needs to do some dummy coding of the factors,
> how can I access the newly encoded data set?
> I suppose this applies to plenty of other packages.
> Any suggestion is welcome.
> Cheers
>
> Lorenzo
>