[R] R question: generating data using MASS

Ben Bolker bbolker at gmail.com
Mon Aug 29 20:56:37 CEST 2011


Michael Parent <michael.parent <at> ufl.edu> writes:

> 
> Thanks!
> 
> "This problem isn't uniquely defined.  Are you 
> willing to generate more samples than you need and then throw
> away extreme values?  Or do you want to 'censor'
> extreme values (i.e. set values <= 1 to 1 and values >=7 to 7)?"
> 
> I'd like to retain a normal distribution so I wouldn't want to
> delete the other values or truncate them. Can
> I use the cut command on the data that gets generated and 
> retain a normal(ish, at least) distribution?

  I don't quite understand how 'cut' (which transforms a continuous variable
into a categorical one) is going to help ... by definition,
a normal distribution is continuous (so discretizing the distribution
will make it non-normal) and has the real numbers as its domain
(so in theory you can't have a restricted domain and still have it
be normal).  If your standard deviation is small enough (say
mean=3.5 and sd=0.1) then you will never have to worry about
values beyond (1,7) in the lifetime of the universe, but if
your sd is larger (and you can't allow it to be smaller) then
you have to do *something* with the values that get generated
outside your chosen bounds ...
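
  To make the options concrete, here is a minimal sketch in base R. It
uses univariate rnorm() rather than MASS::mvrnorm() to keep things short,
and the values of n, mu, sigma and the helper name rtrunc_norm() are
illustrative assumptions, not anything from the original question. It
computes how often a draw falls outside the bounds, then shows the two
approaches mentioned above: discarding out-of-range draws (truncation)
and clamping them to the bounds (censoring). Neither result is exactly
normal.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible

# Illustrative values (assumptions, not from the original question)
n <- 1000; mu <- 4; sigma <- 1.5
lo <- 1; hi <- 7

# Probability that a single normal draw falls outside [lo, hi]
p_out <- pnorm(lo, mu, sigma) + pnorm(hi, mu, sigma, lower.tail = FALSE)

# Same calculation for the mean = 3.5, sd = 0.1 case: vanishingly small
p_tiny <- pnorm(1, 3.5, 0.1) + pnorm(7, 3.5, 0.1, lower.tail = FALSE)

# Option 1: truncation by rejection. Keep drawing and throw away
# anything outside [lo, hi] until n values have been collected.
# (rtrunc_norm is a hypothetical helper defined here, not a library function.)
rtrunc_norm <- function(n, mean, sd, lo, hi) {
  out <- numeric(0)
  while (length(out) < n) {
    x <- rnorm(n, mean, sd)
    out <- c(out, x[x >= lo & x <= hi])
  }
  out[seq_len(n)]
}
x_trunc <- rtrunc_norm(n, mu, sigma, lo, hi)

# Option 2: censoring. Draw n values and clamp them to [lo, hi],
# which piles up probability mass exactly at the two bounds.
x_cens <- pmin(pmax(rnorm(n, mu, sigma), lo), hi)
```

With these illustrative settings the bounds sit two standard deviations
from the mean, so roughly 4.6% of raw draws land outside them. Truncation
shrinks the variance a little, and censoring leaves spikes at 1 and 7.
Which is preferable depends on what the simulated data are for.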

 [snip to make Gmane happy]


