[R] Random resampling of columns in species association matrices

Thu May 10 15:46:15 CEST 2012

Hi David,

Thank you for your suggestions. I am quite a beginner at R and don’t
understand how to actually implement them, so I am hoping for some further
advice, if possible.

This is a subset of my data. Rows are host species, and columns parasite
species. Three of the parasites are generalists, but P4L is a strict
specialist on FORCOL (27 individuals have this parasite). 

        H17L  P25L  P41L  P4L
AUTINF    39     0     0    0
GLYSPI    16     2    15    0
FORCOL     1     0     0   27
HYLPOE     3     0     2    0
HYLNAE     1     4     2    0
MYRMYO     2     5     2    0
THAARD     0     8     0    0

This is a list of host trait values for each of the hosts:

        abundance  weight  survival
AUTINF        488    38.0      0.48
GLYSPI        827    14.1      0.59
FORCOL        156    44.3      0.55
HYLPOE        322    17.5      0.54
HYLNAE        309    14.5      0.73
MYRMYO        475    20.8      0.59
THAARD        429    18.4      0.67

And this is an estimate of host specificity of the parasites, incorporating
prevalence and phylogeny:

      Specificity
H17L         2.08
P25L         1.72
P41L         2.19
P4L          0

I want to determine whether the specificity of the parasites relates to any
of the host traits. For this, I would like to do a multiple regression. To
avoid pseudoreplication, I want to include each host species only once in
the matrix. So, for H17L, I could pick any of its hosts (all except THAARD),
and likewise for the other parasites, but once a host is picked for one
parasite, it cannot be picked for another. For example, if I pick GLYSPI for
H17L, GLYSPI has to be removed as a choice for P25L and P41L. Thus, I also
have to randomize which parasite has its host picked first. In all cases, I
want to lock FORCOL and P4L together, so FORCOL will no longer be an option
for H17L. I’m still uncertain about this last part: I might instead randomly
pick hosts for all parasites and accept the risk of losing the strict host
specialists from some matrices.
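
The picking rule described above could be sketched in R roughly as follows. This is only a sketch under my own assumptions: `assoc` is the subset typed in by hand, and `pick_hosts()` is a hypothetical helper (not an existing R function) that visits the unlocked parasites in random order and gives each one a random host that it actually infects and that has not been used yet.

```r
# Host-by-parasite counts, typed in from the subset shown earlier
assoc <- matrix(
  c(39, 0,  0,  0,
    16, 2, 15,  0,
     1, 0,  0, 27,
     3, 0,  2,  0,
     1, 4,  2,  0,
     2, 5,  2,  0,
     0, 8,  0,  0),
  nrow = 7, byrow = TRUE,
  dimnames = list(
    c("AUTINF", "GLYSPI", "FORCOL", "HYLPOE", "HYLNAE", "MYRMYO", "THAARD"),
    c("H17L", "P25L", "P41L", "P4L")))

# pick_hosts() is a made-up helper for illustration.
# 'locked' is a named character vector: names = parasites, values = hosts.
pick_hosts <- function(assoc, locked = c(P4L = "FORCOL")) {
  picks <- locked
  free  <- setdiff(colnames(assoc), names(locked))
  # Randomise which parasite has its host picked first
  for (p in free[sample.int(length(free))]) {
    hosts <- rownames(assoc)[assoc[, p] > 0]   # hosts this parasite infects
    hosts <- setdiff(hosts, picks)             # drop hosts already used
    if (length(hosts) == 0) {                  # nothing left for this parasite
      picks[p] <- NA
      next
    }
    # sample.int() avoids the surprise sample() gives on a length-1 vector
    picks[p] <- hosts[sample.int(length(hosts), 1)]
  }
  picks[colnames(assoc)]                       # return in column order
}

set.seed(1)   # only so that the draw can be repeated
pick_hosts(assoc)
```

Calling `pick_hosts(assoc, locked = character(0))` would drop the FORCOL/P4L lock and pick hosts for all four parasites at random, in which case P4L gets `NA` whenever FORCOL has already been taken by H17L.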

If I make 2 random selections I might end up with:

      Random1  Random2
H17L  AUTINF   GLYSPI
P25L  GLYSPI   HYLNAE
P41L  HYLPOE   MYRMYO
P4L   FORCOL   FORCOL

For the first random table I would then do a multiple regression on the
dependent specificity variable and independent host trait values:

Specificity  abundance  weight  survival
2.08         488        38      0.48
1.72         827        14.1    0.59
2.19         322        17.5    0.54
0            156        44.3    0.55
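
One way to build such a table in R is to index the trait table by the picked host names and the specificity vector by the parasite names. The sketch below assumes the traits are stored in a data frame with hosts as row names; the object names (`traits`, `spec`, `random1`, `tab1`) are made up for illustration. Note that with only the four parasites in this subset, a regression on all three traits plus an intercept would leave no residual degrees of freedom, so the example fits a single predictor.

```r
# Host traits, with host names as row names
traits <- data.frame(
  abundance = c(488, 827, 156, 322, 309, 475, 429),
  weight    = c(38, 14.1, 44.3, 17.5, 14.5, 20.8, 18.4),
  survival  = c(0.48, 0.59, 0.55, 0.54, 0.73, 0.59, 0.67),
  row.names = c("AUTINF", "GLYSPI", "FORCOL", "HYLPOE",
                "HYLNAE", "MYRMYO", "THAARD"))

# Parasite specificity, named by parasite
spec <- c(H17L = 2.08, P25L = 1.72, P41L = 2.19, P4L = 0)

# The first random selection: names = parasites, values = picked hosts
random1 <- c(H17L = "AUTINF", P25L = "GLYSPI",
             P41L = "HYLPOE", P4L  = "FORCOL")

# Look up specificity by parasite name and traits by host name
tab1 <- data.frame(Specificity = spec[names(random1)],
                   traits[random1, ],
                   row.names = names(random1))
tab1

# One predictor only: 4 rows cannot support 3 traits plus an intercept
fit <- lm(Specificity ~ weight, data = tab1)
coef(fit)
```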

If I generate 1000 randomly selected host-parasite combinations, I would
have 1000 such tables, on which I would have to run 1000 independent
regressions. Since I’m using model selection and multimodel inference to
estimate parameter values, I will end up doing the model selection 1000
times. 
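
The repetition itself could be handled by wrapping one draw in a function and calling it 1000 times with `replicate()`, which returns a list of tables that can then be looped over. This is a sketch under assumptions: `one_draw()` is a hypothetical helper, the data are the subset typed in by hand, and the single-predictor `lm()` call is only a placeholder for the actual model selection step.

```r
hosts <- c("AUTINF", "GLYSPI", "FORCOL", "HYLPOE", "HYLNAE", "MYRMYO", "THAARD")

# Host-by-parasite counts, filled column by column
assoc <- matrix(c(39, 16,  1, 3, 1, 2, 0,    # H17L
                   0,  2,  0, 0, 4, 5, 8,    # P25L
                   0, 15,  0, 2, 2, 2, 0,    # P41L
                   0,  0, 27, 0, 0, 0, 0),   # P4L
                ncol = 4,
                dimnames = list(hosts, c("H17L", "P25L", "P41L", "P4L")))

traits <- data.frame(
  abundance = c(488, 827, 156, 322, 309, 475, 429),
  weight    = c(38, 14.1, 44.3, 17.5, 14.5, 20.8, 18.4),
  survival  = c(0.48, 0.59, 0.55, 0.54, 0.73, 0.59, 0.67),
  row.names = hosts)

spec <- c(H17L = 2.08, P25L = 1.72, P41L = 2.19, P4L = 0)

# one_draw() is a made-up helper: one random host per parasite,
# FORCOL locked to P4L, no host used twice. It assumes every parasite
# still has an unused host left, which holds for this subset.
one_draw <- function() {
  picks <- c(P4L = "FORCOL")
  for (p in sample(c("H17L", "P25L", "P41L"))) {   # random parasite order
    ok <- setdiff(hosts[assoc[, p] > 0], picks)    # infected and unused
    picks[p] <- ok[sample.int(length(ok), 1)]
  }
  picks <- picks[names(spec)]
  data.frame(Specificity = spec, traits[picks, ], host = picks,
             row.names = names(spec))
}

set.seed(123)   # only so that the draws can be repeated
tabs <- replicate(1000, one_draw(), simplify = FALSE)

# Placeholder analysis: one slope per table; swap in the real
# model selection here
slopes <- vapply(tabs, function(d)
  coef(lm(Specificity ~ weight, data = d))[["weight"]], numeric(1))
summary(slopes)
```

Because each element of `tabs` already carries the trait values and specificity for its draw, no separate linking step is needed afterwards.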

Your second suggestion makes most sense to me, but I don’t understand how to
implement it. Would you (or someone else) please give me some advice on
that? Also, once I have the 1000 random host-parasite matrices, how do I
link these to the tables of actual values (host traits and parasite
specificity)?

Thanks so much!
Maria
