[R] difference between createPartition and createfold functions

Steve Lianoglou mailinglist.honeypot at gmail.com
Sun Oct 2 22:00:58 CEST 2011


Hi,

On Sun, Oct 2, 2011 at 3:54 PM,  <bby2103 at columbia.edu> wrote:
> Hi Steve,
>
> Thanks for the note. I did try the example and the result didn't make sense
> to me. For splitting a vector, what you describe is a big difference btw
> them. For splitting a dataframe, I now wonder if these 2 functions are the
> wrong choices. They seem to split the columns, at least in the few things I
> tried.

Sorry, I'm a bit confused now as to what you are after.

You don't pass a data.frame into any of the
createFolds/createDataPartition functions from the caret package.

You pass in a *vector* of labels, and these functions tell you which
indices into the vector to use as examples to hold out (or to keep,
depending on the value you pass in for createFolds' `returnTrain`
argument) in each fold/partition of your learning scenario (e.g. cross
validation with createFolds).
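
As a rough sketch of what comes back (the label vector `y` here is
made up for illustration, it is not from your data):

```r
library(caret)

set.seed(42)  # fold assignment is random, so fix the seed
y <- factor(rep(c("pos", "neg"), each = 6))  # hypothetical labels

# Default (returnTrain = FALSE): indices to hold out in each fold
held.out <- createFolds(y, k = 3)

# returnTrain = TRUE: indices to keep (train on) in each fold
kept <- createFolds(y, k = 3, returnTrain = TRUE)

str(held.out)  # a list of 3 integer vectors of positions in y
str(kept)
```

Either way you get positions in `y`, not the labels themselves. Note
that the two calls above draw their folds independently.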

You would then use these indices to keep (or remove) the rows of a
data.frame, if that is how you are storing your examples.
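
Something along these lines, assuming a toy data.frame `df` with a
`label` column (both names are invented for this example):

```r
library(caret)

set.seed(1)
# Hypothetical data: 20 examples, two classes
df <- data.frame(x = rnorm(20),
                 label = factor(rep(c("a", "b"), each = 10)))

# Pass the label vector, not the data.frame itself
folds <- createFolds(df$label, k = 5)

# Fold 1: index the *rows* with the returned indices
test.set  <- df[folds[[1]], ]
train.set <- df[-folds[[1]], ]

# createDataPartition returns the row indices to train on
in.train <- createDataPartition(df$label, p = 0.8, list = FALSE)
train2 <- df[in.train, ]
test2  <- df[-in.train, ]
```

The comma after the index is what selects rows; `df[idx]` without it
selects columns, which may be what you were seeing.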

Does that make sense?

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


