[R] CoDA: Count Zeros in Biological Data

Bert Gunter gunter.berton at gene.com
Fri Jun 20 04:37:09 CEST 2014


1. This is a question about statistical methodology, not R. Hence it is
inappropriate here.

2. Replies should therefore be private.

3. Consult the literature.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Thu, Jun 19, 2014 at 6:23 PM, Rich Shepard <rshepard at appl-ecosys.com> wrote:
>   I have several small biological count-based data sets in which one or more
> rows contain a zero proportion. The other proportions in such a row sum to
> 1.000 (or to 0.9999 in the sixth data row below, because of rounding). An
> example is:
>
>    sampdate filter gather  graze predate  shred
>  2000-07-18 0.0550 0.5596 0.0734  0.2294 0.0826
>  2003-07-08 0.0734 0.6147 0.0183  0.2294 0.0642
>  2005-07-13 0.1161 0.5714 0.0357  0.1696 0.1071
>  2006-06-28 0.1000 0.4667 0.1500  0.1333 0.1500
>  2010-09-14 0.0778 0.6111 0.0444  0.1889 0.0778
>  2011-07-13 0.0879 0.5714 0.0659  0.2747 0.0000
>  2012-07-11 0.1042 0.5313 0.0625  0.2396 0.0625
>
>   My concern is that in most field-biological (ecological/environmental)
> data there can be two explanations for zero counts: the organism was not
> present on that date or it was present but not collected. There is no way to
> determine which case holds true in each instance, but the ecological
> interpretations differ.
>
>   The zCompositions package offers several methods of imputing a value to
> replace the zeros. As I'm completely new to compositional data analysis
> (CoDA), I would appreciate advice on how to select the most appropriate
> method for these data sets. The available methods are: geometric Bayesian
> multiplicative (GBM, the default); square-root Bayesian multiplicative (SQ);
> Bayes-Laplace Bayesian multiplicative (BL); count zero multiplicative (CZM);
> and user-specified hyper-parameters (user).
>
>   These biological data seem to me to be different from the geochemical or
> economic data I see in package data sets and in the CoDA references I've
> acquired and read.
>
>   Advice and suggestions (including references to application of CoDA to
> ecological/environmental data) will be appreciated.
>
> Rich
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
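
For readers who want to see what "replacing a zero while keeping the row a composition" means in practice, here is a minimal base-R sketch applied to the proportions posted above. It uses simple multiplicative replacement, which is not one of the Bayesian-multiplicative methods listed in the question; those, as far as I understand, are meant to be applied to the underlying counts, which were not posted. The helper name mult_replace and the replacement value delta = 0.001 are assumptions chosen purely for illustration, not recommendations.

```r
# Proportions copied from the table in the quoted message.
props <- matrix(
  c(0.0550, 0.5596, 0.0734, 0.2294, 0.0826,
    0.0734, 0.6147, 0.0183, 0.2294, 0.0642,
    0.1161, 0.5714, 0.0357, 0.1696, 0.1071,
    0.1000, 0.4667, 0.1500, 0.1333, 0.1500,
    0.0778, 0.6111, 0.0444, 0.1889, 0.0778,
    0.0879, 0.5714, 0.0659, 0.2747, 0.0000,
    0.1042, 0.5313, 0.0625, 0.2396, 0.0625),
  ncol = 5, byrow = TRUE,
  dimnames = list(
    c("2000-07-18", "2003-07-08", "2005-07-13", "2006-06-28",
      "2010-09-14", "2011-07-13", "2012-07-11"),
    c("filter", "gather", "graze", "predate", "shred")
  )
)

# Hypothetical helper: simple multiplicative zero replacement.
# Each zero becomes 'delta'; the non-zero parts are shrunk by a common
# factor so the row still sums to 1 and their ratios are unchanged.
mult_replace <- function(x, delta = 0.001) {
  x <- x / sum(x)                      # re-close the row (absorbs 0.9999)
  zero <- x == 0
  x[!zero] <- x[!zero] * (1 - delta * sum(zero))
  x[zero] <- delta
  x
}

# apply() returns one column per input row, so transpose back.
filled <- t(apply(props, 1, mult_replace))
round(filled, 4)
```

Whatever value is imputed, the result depends on it, so it is worth rerunning any downstream analysis with a few different values of delta to see how sensitive the conclusions are. None of this addresses the ecological question of whether a zero means "absent" or "present but not collected".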
