[R] Random assignment

John Haart another83 at me.com
Fri Oct 15 13:29:45 CEST 2010


Hi Dennis and list,

Thanks for this, and sorry for not providing enough information.

First, let me put the study into a bit more context:

I know the number of species at risk in each family. What I am asking is: "Is risk random with respect to family, or do certain families have a disproportionate number of at-risk species?"

My idea was to randomly allocate risk within each family based on the criterion below (binomial(nspecies, 0.0748)), then compare the result to the "true data" and see whether there is a significant difference.
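
A minimal sketch of that idea in R, not a definitive implementation. The data frame `fam`, its family names, species counts and the observed at-risk counts in `obs` are all invented for illustration; only p = 0.0748 and the 10,000 randomizations come from this thread.

```r
# Illustrative data only: names, species counts and observed
# at-risk counts are made up for this sketch.
set.seed(1)
fam <- data.frame(
  F   = c("FamA", "FamB", "FamC", "FamD"),
  SP  = c(1, 12, 40, 120),   # number of species per family
  obs = c(0, 3, 2, 15)       # observed number at risk (hypothetical)
)
p    <- 0.0748   # assumed per-species chance of being at risk
nsim <- 10000    # number of randomizations

# rbinom() recycles the size vector, so filling the matrix by row
# gives one row per randomization and one column per family.
sim <- matrix(rbinom(nsim * nrow(fam), size = fam$SP, prob = p),
              nrow = nsim, byrow = TRUE)
colnames(sim) <- fam$F

# Monte Carlo upper-tail p-value per family: the share of
# randomizations with at least as many at-risk species as observed.
fam$p_upper <- sapply(seq_len(nrow(fam)),
                      function(j) mean(sim[, j] >= fam$obs[j]))
print(fam)
```

With real data, `fam` could presumably be read in with something like read.table("families.txt", header = TRUE), where the file name is hypothetical and the file holds the F and SP columns plus the observed counts.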

So, in answer to your questions (assuming my method is correct!):

> Is this over all families, or within a particular family? If the former, why
> does a distinction of family matter?

Within a particular family. This is because I am looking to see whether risk in the "observed" data set is random with respect to family, so this will provide the baseline to compare against.

> I guess you've stated the p, but what's the n? The number of species in each
> family?

This varies widely: for instance, I have some families that are monotypic (with 1 species) and other families with 100+ species.


> Assuming you have multiple families, do you want separate simulations per
> family, or do you want to do some sort of weighting (perhaps proportional to
> size) over all families?

I am assuming I want some sort of weighting. This is because I want to calculate the number of species expected to be at risk in EACH family under the random binomial distribution (assuming every species has a 7.48% chance of being at risk).
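
For the expected count itself no simulation is strictly needed, since the mean of a binomial(n, p) is n * p. The sketch below computes that and an exact upper-tail probability with pbinom(). The families and observed counts are again invented, and the column names `expected` and `p_exact` are my own choice.

```r
# Illustrative data only (invented families and counts).
fam <- data.frame(
  F   = c("FamA", "FamB", "FamC"),
  SP  = c(1, 25, 100),   # number of species per family
  obs = c(0, 4, 9)       # observed number at risk (hypothetical)
)
p <- 0.0748   # assumed per-species chance of being at risk

# Expected number at risk per family under binomial(SP, p).
fam$expected <- fam$SP * p

# Exact P(X >= obs) for X ~ binomial(SP, p):
# pbinom(q, ..., lower.tail = FALSE) gives P(X > q), so use q = obs - 1.
fam$p_exact <- pbinom(fam$obs - 1, size = fam$SP, prob = p,
                      lower.tail = FALSE)
print(fam)
```

A small `p_exact` would suggest a family has more at-risk species than the 7.48% assumption predicts for its size.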

Thanks

John




On 15 Oct 2010, at 11:19, Dennis Murphy wrote:

Hi:

I don't believe you've provided quite enough information just yet...

On Fri, Oct 15, 2010 at 2:22 AM, John Haart <another83 at me.com> wrote:

> Dear List,
> 
> I am doing some simulation in R and need basic help!
> 
> I have a list of animal families for which I know the number of species in
> each family.
> 
> I am working under the assumption that a species has a 7.48% chance of
> being at risk.
> 

Is this over all families, or within a particular family? If the former, why
does a distinction of family matter?

> 
> I want to simulate the number of species expected to be at risk under a
> random binomial distribution with 10,000 randomizations.
> 

I guess you've stated the p, but what's the n? The number of species in each
family? If you're simulating on a family by family basis, then it would seem
that a binomial(nspecies, 0.0748) distribution would be the reference.
Assuming you have multiple families, do you want separate simulations per
family, or do you want to do some sort of weighting (perhaps proportional to
size) over all families? The latter is doable, but it would require a
two-stage simulation: one to randomly select a family and then to randomly
select a species.
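
One possible reading of that two-stage scheme, sketched in R under stated assumptions: the data frame `fam` and the number of draws `ndraw` are invented, and species are simply numbered 1..SP within each family.

```r
# Illustrative data only (invented families and sizes).
set.seed(1)
fam <- data.frame(F = c("FamA", "FamB", "FamC"), SP = c(1, 12, 40))
ndraw <- 1000   # hypothetical number of draws

# Stage 1: pick a family with probability proportional to its size.
f_idx <- sample.int(nrow(fam), size = ndraw, replace = TRUE, prob = fam$SP)

# Stage 2: pick one species uniformly within the chosen family.
s_idx <- vapply(fam$SP[f_idx], function(n) sample.int(n, 1), integer(1))

draws <- data.frame(F = fam$F[f_idx], species = s_idx)
head(draws)
table(draws$F)   # larger families are drawn more often
```

Weighting by size in stage 1 makes every species equally likely to be chosen overall, which is the same as drawing a species uniformly from the pooled list.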

Dennis


> 
> I am relatively new to this field and would greatly appreciate an
> "idiot-proof" response, i.e., how should the data be entered into R? I was
> thinking of using read.table with header = T, where the table has F = Family
> Name and SP = Number of species in that family?
> 
> John
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 




