[R] random sampling inside a dataset

Duncan Murdoch murdoch.duncan at gmail.com
Thu Oct 4 12:17:16 CEST 2012


On 12-10-04 6:06 AM, Gian Maria Niccolò Benucci wrote:
> Hi again to everybody,
>
> I encountered the following error when I tried to take a sample from a
> dataset.
> My code is:
>
>> data_ostrya <- sample(ostrya,200, replace=F)
> Error in `[.data.frame`(x, .Internal(sample(length(x), size, replace,  :
>    cannot take a sample larger than the population when 'replace = FALSE'
>
> Why does it not work?
> The whole dataset is composed of 536 rows and I just want to sample
> randomly 200 of them...

sample() works on vectors, not data frames.  Since a data frame is a
list of its columns, it was trying to sample columns, not rows.

This would give you a sample of rows:

ostrya[sample(1:536, 200, replace=FALSE),]
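
To make the difference concrete, here is a minimal sketch.  The data
frame below is a made-up stand-in for ostrya (its column names are
assumptions, only the 536 rows come from the question), and nrow() is
used in place of the hard-coded 536 so the same line works for any
data frame:

```r
# Hypothetical stand-in for the 'ostrya' data frame: 536 rows, 3 columns.
set.seed(1)
ostrya <- data.frame(id    = 1:536,
                     year  = sample(2009:2011, 536, replace = TRUE),
                     value = rnorm(536))

# A data frame is a list of columns, so length() counts columns, not rows.
length(ostrya)   # number of columns
nrow(ostrya)     # number of rows

# Draw 200 distinct row indices, then use them to index the rows.
idx <- sample(nrow(ostrya), 200, replace = FALSE)
data_ostrya <- ostrya[idx, ]
dim(data_ostrya)
```

With only 3 columns to choose from, asking for 200 without replacement
is what triggered the "larger than the population" error.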

Duncan Murdoch

>
> Thank you in advance,
>
> Gian
>
>
>
>
> On 13 September 2012 14:01, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
>
>> On 12-09-13 7:43 AM, Gian Maria Niccolò Benucci wrote:
>>
>>> Thank you Duncan,
>>>
>>> I got the result of sampling, but it gave me only the row numbers. Is it
>>> possible to have the entire row with variables and other information?
>>> I need to resample whole rows from my matrix in order to
>>> have 20 samples (i.e., rows) for each year.
>>> Thank you for your invaluable help!
>>>
>>
>> Use those row numbers to index the dataframe or matrix, e.g.
>>
>> a[rows,]
>>
>> Duncan Murdoch
>>
>>
>>> Gian
>>>
>>>
>>> On 13 September 2012 13:32, Duncan Murdoch <murdoch.duncan at gmail.com>
>>> wrote:
>>>
>>>> On 12-09-13 7:18 AM, Gian Maria Niccolò Benucci wrote:
>>>>
>>>>> Thank you very much for your help,
>>>>>
>>>>> I was wondering if it is possible to sample randomly within a
>>>>> particular group of data inside the matrix; for example, from only the
>>>>> samples collected in 2011, I would randomly choose 20
>>>>> samples...
>>>>>
>>>>
>>>> You need two steps:  find the rows that meet your condition, then sample
>>>> from those.  For example,
>>>>
>>>> rows <- which( a$year == 2011 )
>>>> sample(rows, 20)
>>>>
>>>> There is one thing to watch out for:  if you have a condition that only
>>>> matches one row, you will get unexpected results here, because sample()
>>>> treats a single number n as 1:n, so the sample will be taken from
>>>> 1:rows.  See the examples in ?sample for the workaround that uses
>>>> sample.int.
>>>>
>>>> Duncan Murdoch
>>>>
>>>>
>>>>> Thanks again,
>>>>>
>>>>> Gian
>>>>>
>>>>> On 13 September 2012 12:26, anna freni sterrantino <annafreni at yahoo.it> wrote:
>>>>>
>>>>>> Hello Gian,
>>>>>>
>>>>>> sure, the sample function
>>>>>> will do your sampling.
>>>>>>
>>>>>> a=as.data.frame(matrix(1:20,4))
>>>>>>
>>>>>> sample(rownames(a),2)
>>>>>>
>>>>>> see ?sample for more details.
>>>>>> Hope it helps
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> Anna
>>>>>>
>>>>>>
>>>>>> Anna Freni Sterrantino
>>>>>> Department of Statistics
>>>>>> University of Bologna, Italy
>>>>>> via Belle Arti 41, 40124 BO.
>>>>>>      ------------------------------
>>>>>> *From:* Gian Maria Niccolò Benucci <gian.benucci at gmail.com>
>>>>>> *To:* r-help at r-project.org
>>>>>> *Sent:* Thursday 13 September 2012 10:42
>>>>>> *Subject:* [R] random sampling inside a dataset
>>>>>>
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am wondering whether there is a function in R that allows me to sample or
>>>>>> randomly choose rows (i.e., samples) inside a given matrix.
>>>>>> Thank you very much in advance.
>>>>>> Cheers,
>>>>>>
>>>>>> --
>>>>>> Gian
>>>>>>
>>>>>>        [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>
>
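
For reference, the two-step approach quoted above (find the rows that
meet the condition, then sample from those) can be sketched as follows.
The data frame a and its columns are made up for illustration; the
resample() helper is the one shown in the examples of ?sample, and the
per-year loop at the end is just one possible way to get up to 20 rows
for each year:

```r
# Safe resampling helper from the examples in ?sample: it draws positions
# within x, so a length-one x is never expanded to 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

# Hypothetical data frame 'a': 50 rows for 2009, 50 for 2010, 1 for 2011.
set.seed(42)
a <- data.frame(year  = rep(2009:2011, times = c(50, 50, 1)),
                value = rnorm(101))

# Step 1: find the rows that meet the condition.
# Step 2: sample from those row numbers and index the data frame.
rows_2010 <- which(a$year == 2010)
picked <- resample(rows_2010, 20)
a[picked, ]

# Only row 101 matches 2011.  sample(rows_2011, 1) would draw from 1:101,
# whereas resample() can only return row 101.
rows_2011 <- which(a$year == 2011)
resample(rows_2011, 1)

# Up to 20 rows per year, for all years at once.
per_year <- lapply(split(seq_len(nrow(a)), a$year),
                   function(r) resample(r, min(20, length(r))))
subsample <- a[unlist(per_year, use.names = FALSE), ]
table(subsample$year)
```

The min(20, length(r)) guard is there only so that years with fewer
than 20 rows do not raise the "larger than the population" error.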
