[R] Random Cluster Generation Question

David Winsemius dwinsemius at comcast.net
Fri Apr 10 02:56:55 CEST 2009


Try:
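 > library(spatstat)                  # provides rMatClust() and rThomas()
 > clust <- rMatClust(10, 0.05, 50)   # Matern cluster process on the unit square
 > plot(clust)
 > Y <- rThomas(10, 0.05, 50)         # Thomas process with the same parameters
 > plot(Y)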
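
A minimal sketch of how the parameters relate to the total count, assuming that the first argument of rMatClust() is the intensity of parent points per unit area, so that the expected total is roughly kappa * window area * mu. The values of mu and radius, the seed, and the file name "clusters.csv" are illustrative assumptions, not taken from the thread. Under the same assumption, the call with kappa = 1 in a 1000 x 1000 window asks for about a million clusters, which would explain the long run time.

```r
library(spatstat)  # owin(), rMatClust(), npoints(), affine()

set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Target: about 2000 points in a 1000 x 1000 window.
win      <- owin(c(0, 1000), c(0, 1000))
win_area <- 1000 * 1000
target_n <- 2000
mu       <- 50   # assumed mean number of points per cluster
radius   <- 50   # assumed cluster radius, in window units

# kappa is parents per unit AREA, so divide by the window area.
kappa <- target_n / (mu * win_area)   # 4e-05, i.e. about 40 clusters

clust <- rMatClust(kappa, radius, mu, win = win)
npoints(clust)   # random; expected value 2000, can differ by several hundred

# Export the X,Y coordinates ("clusters.csv" is a hypothetical file name).
xy <- as.data.frame(clust)[, c("x", "y")]
write.csv(xy, "clusters.csv", row.names = FALSE)

# Alternative: simulate on the unit square, then rescale points AND window
# together with affine(), so that plot() and other spatstat functions
# still see a consistent object.
small <- rMatClust(20, 0.05, 100)
big   <- affine(small, mat = diag(c(1000, 1000)))
plot(big)
```

The total still cannot be fixed exactly this way, because both the number of clusters and the number of points per cluster are random; only the expected value is controlled.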


On Apr 9, 2009, at 7:28 PM, Jason L. Simms wrote:

> Hello,
>
> Thanks for your note.  I recognize that the points per cluster is
> random, and also that it is possible to set the mean number of points
> per cluster through the function.  What I was hoping was that I could
> specify a maximum number of points overall across all clusters, but
> conceptually I don't know how that could even be implemented.  I ended
> up adjusting the parameters of the function until I produced right
> around 2,000 points in a 10x10 box, and then I just multiplied
> everything by 100.  Not sure whether it's perfect, but I suspect that
> it will work for my needs currently.
>
> I'll look into the rThomas() function, too.  I am much more of an
> applied stats person, so the subtle (or even not-so-subtle)
> differences and advantages/disadvantages between a Thomas Process and
> a Matern Process are unclear to me at the moment.
>
> Jason
>
> On Thu, Apr 9, 2009 at 7:06 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>
>> On Apr 9, 2009, at 5:01 PM, Jason L. Simms wrote:
>>
>>> Hello,
>>>
>>> I am fairly new to R, but I am not new to programming at all.  I want
>>> to generate random clusters in a 1,000x1,000 box such that I end up
>>> with a total of about 2,000 points.  Once done, I need to export the
>>> X,Y coordinates of the points.
>>>
>>> I have looked around, and it seems that the spatstat package has what
>>> I need.  The rMatClust() function can generate random clusters, but I
>>> have run into some problems.
>>>
>>> First, I can't seem to specify that I want x number of points.
>>
>> The number of points per cluster IS random.
>>
>>> So, right now it appears that if I want around 2,000 total points that I
>>> must play around with the parameters of the function (e.g., mean
>>> number of points per cluster, cluster radius, etc.) until I end up
>>> with roughly 2,000 points.
>>>
>>> More problematic, however, is that specifying a 1,000x1,000 box is too
>>> much to handle.  I have been running the following function for over
>>> 24 hours straight on a decent computer and it has not stopped yet:
>>>
>>> clust <- rMatClust(1, 50, 5, win=owin(c(0,1000),c(0,1000)))
>>
>> It might be partly due to the 1000 x 1000 dimensions, but it is mainly
>> because of your parameters. The same parameters took a significant amount
>> of time to yield 4-10 points on a 1 x 1 window, whereas this particular
>> invocation much more quickly produced 2707 points, with a mean of 100
>> points per uniform cluster, within a 1 x 1 square:
>>
>> Y <- rMatClust(20, 0.05, 100)
>>
>> If you wanted the x and y dimensions to be in the range of 0-1000, couldn't
>> you just multiply the x and y values inside Y by 1000?
>>  Y$x <- 1000*Y$x
>>  Y$y <- 1000*Y$y
>>  plot(Y) # cannot see any points, probably because the plot.ppp method is
>> # using the internal ranges (the window) stored inside that Y object.
>> # So you might lose the ability to use other functions in that package.
>>  plot(Y$x, Y$y)  # as expected and took seconds at most.
>>
>> I would think that the most important task would be deciding on the function
>> that controls the intensity process of the "offspring points". The points in
>> this simple example clearly violate my notions of randomness because of the
>> sharp edges at the cluster boundaries. So, you may want to examine
>> rThomas(...) in the same package.
>>
>> There is, of course, a SIG spatial stats mailing list full of people better
>> qualified than I on such questions.
>>>
>>> Clearly, I need to rethink my strategy.  Could I generate the points
>>> in a 10x10 box with a radius of .5 and then multiply out the resulting
>>> point coordinates by 100?  Is there another package that might meet my
>>> needs better than spatstat for easy cluster generation?
>>>
>>> Any suggestions are appreciated.
>>> --
>>> Jason L. Simms, M.A.
>>> USF Graduate Multidisciplinary Scholar
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>>
>
>
>
> -- 
> Jason L. Simms, M.A.
> USF Graduate Multidisciplinary Scholar
> Co-President, Graduate Assistants United
> Ph.D. / M.P.H. Student
> Departments of Anthropology and Environmental and Occupational Health
> University of South Florida

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



