[R] split data into training and testing sets

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Nov 11 19:30:29 CET 2005


On Fri, 11 Nov 2005, Dhiren DSouza wrote:

> How can I split a dataset randomly into a training and testing set?  I would
> like to have the ability to specify the size of the training set and use the
> remaining data as the testing set.
>
> For example 90% training data and 10% testing data split.  Is there a
> function that will accomplish this?

Yes, see ?sample: use it to sample indices.
There are lots of examples around, e.g. in ?lda.
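
A minimal sketch of the index-sampling idea, assuming the built-in iris
data frame stands in for the dataset. The names n_train, train_idx, train
and test are illustrative and do not come from the original post; the 0.9
fraction matches the 90%/10% split asked about.

```r
# Illustrative data: the built-in iris data frame (150 rows).
set.seed(1)                                # only to make the split reproducible
n <- nrow(iris)
n_train <- round(0.9 * n)                  # size of the training set (90%)
train_idx <- sample(seq_len(n), n_train)   # random row indices, no replacement
train <- iris[train_idx, ]                 # training set
test  <- iris[-train_idx, ]                # remaining rows form the testing set
```

Changing 0.9 changes the training-set size. Note that negative indexing
with an empty index vector selects no rows, so this sketch assumes
n_train is at least 1.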

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



