[R] NADA Package: Referencing Data Frame Columns

MacQueen, Don macqueen1 at llnl.gov
Wed Aug 8 22:08:16 CEST 2012


Hi Rich,

I may not have the complete picture here, but I do see what looks to me
like a problem with your chem.cast.

Specifically, since it has only a single detection indicator column
(ceneq1), it implies that within any single sample either all the analytes
were detected or none were. That is not what I would expect.

If the typo that others pointed out was not the entire answer to your
question, then I would add:

As to your larger question of which layout is appropriate for use with
NADA functions, the answer is that either can be used. The "trick" is to
use the appropriate syntax to extract the values you need and pass them
to a NADA function. The syntax is different for the long vs. the wide
format. At this point it's not really a NADA issue, just a matter of R
syntax. There are multiple ways to do either one. Each has pros and
cons; which is better depends to some extent on what kinds of graphics
or analyses you need to do, and there's plenty of room for personal
preference.


For the long format you subset the rows, then pass the appropriate
columns. Here's one way:

   with(subset(chem, param == 'AgDis'), ros(quant, ceneq1))
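
To run the same long-format idea over every analyte at once, one option is
to split the rows by param and apply ros() to each piece. This is only a
sketch: the toy data frame and its values are invented for illustration,
and the ros() call is skipped if NADA is not installed.

```r
# Toy stand-in for the long-format 'chem' data frame; all values are invented.
chem <- data.frame(
  param  = rep(c("AgDis", "AgTot"), each = 8),
  quant  = c(0.001, 0.001, 0.002, 0.003, 0.005, 0.008, 0.012, 0.020,
             0.002, 0.002, 0.004, 0.006, 0.009, 0.015, 0.022, 0.040),
  ceneq1 = rep(c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE), 2),
  stringsAsFactors = FALSE
)

# One piece per analyte, each carrying its own values and censoring flags.
by_param <- split(chem[c("quant", "ceneq1")], chem$param)

# Fit ROS per analyte; skipped when the NADA package is not installed.
if (requireNamespace("NADA", quietly = TRUE)) {
  fits <- lapply(by_param, function(d) NADA::ros(d$quant, d$ceneq1))
}
```

Each element of by_param is an ordinary data frame, so the same pattern
should carry over to other functions that take a vector of values and a
vector of censoring indicators.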


For the wide format you pass the appropriate columns directly:

   ros( chem.cast$AgDis, chem.cast$AgDis.ceneq1 )

where AgDis.ceneq1 is a name I have invented for a new column (one that
chem.cast does not have yet) holding the censoring indicator specific to
AgDis.
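
One possible way to get a wide layout that has a censoring indicator per
analyte is base R's reshape(), carrying both quant and ceneq1 across. This
is a sketch on invented toy data, not a description of how chem.cast was
built. Note that reshape() names the new columns ceneq1.AgDis rather than
the AgDis.ceneq1 invented above.

```r
# Toy long-format data: two samples, two analytes each; values are invented.
chem <- data.frame(
  site     = c("D-1", "D-1", "D-2", "D-2"),
  sampdate = as.Date("2007-12-12"),
  param    = c("AgDis", "AgTot", "AgDis", "AgTot"),
  quant    = c(0.005, 0.00013, 0.012, 0.004),
  ceneq1   = c(TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Spread both the value and its censoring flag, one pair of columns
# per analyte, with one row per site/sampdate combination.
chem.wide <- reshape(chem, direction = "wide",
                     idvar = c("site", "sampdate"), timevar = "param",
                     v.names = c("quant", "ceneq1"))

names(chem.wide)
```

With this layout the censoring flag can differ between analytes within the
same sample, and a call would look something like
ros(chem.wide$quant.AgDis, chem.wide$ceneq1.AgDis).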

Hope this helps.

-Don

(p.s., I still think you'll be better off in the long run if you store
site, param, and maybe era, as character objects, not factors.)
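
The P.S. does not spell out a reason. One commonly cited one (my
assumption, not something stated above) is that a subset of a factor keeps
levels that no longer occur in it. A small sketch on invented data,
including one way to do the conversion:

```r
# Toy data with factor columns, as in str(chem); values are invented.
chem <- data.frame(
  site  = factor(c("D-1", "D-1", "D-2")),
  era   = factor(c("Post", "Post", "Pre")),
  param = factor(c("AgDis", "AgTot", "AgDis")),
  quant = c(0.005, 0.00013, 0.012)
)

# A subset of a factor still carries the levels that no longer occur.
ag_dis <- subset(chem, param == "AgDis")
levels(ag_dis$param)

# Convert the grouping columns to character in one step.
id_cols <- c("site", "era", "param")
chem[id_cols] <- lapply(chem[id_cols], as.character)
str(chem)
```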

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 8/7/12 9:26 AM, "Rich Shepard" <rshepard at appl-ecosys.com> wrote:

>   The sample data sets that come with the NADA package are limited to one
>or two variables and a censored measurement indicator column. I try to
>mimic examples using my data but keep missing the target.
>
>   My water chemistry data is available in two formats: long (as seen in a
>database table) and wide (as seen in a spreadsheet). The two structures
>are:
>
>str(chem)
>'data.frame':	65349 obs. of  8 variables:
>  $ site    : Factor w/ 64 levels "D-1","D-2","D-3",..: 1 1 1 1 1 1 1 ...
>  $ sampdate: Date, format: "2007-12-12" "2007-12-12" ...
>  $ era     : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ...
>  $ param   : Factor w/ 64 levels "AgDis","AgTot",..: 2 4 5 7 11 15 25 ...
>  $ quant   : num  1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 ...
>  $ ceneq1  : logi  TRUE FALSE FALSE FALSE TRUE FALSE ...
>  $ floor   : num  0 0.106 231 0.0113 0 100 0 1.43 0 0.0239 ...
>  $ ceiling : num  1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 2.39e-02 ...
>
>and
>
>str(chem.cast)
>'data.frame':	56938 obs. of  70 variables:
>  $ site     : Factor w/ 64 levels "D-1","D-2","D-3",..: 1 1 1 1 1 ...
>  $ sampdate : Date, format: "2007-12-12" "2007-12-12" ...
>  $ era      : Factor w/ 2 levels "Post","Pre": 1 1 1 1 1 1 1 1 1 1 ...
>  $ ceneq1   : logi  TRUE FALSE FALSE FALSE TRUE FALSE ...
>  $ floor    : num  0 0.106 231 0.0113 0 100 0 1.43 0 0.0239 ...
>  $ ceiling  : num  1.30e-04 1.06e-01 2.31e+02 1.13e-02 5.00e-03 ...
>  $ AgDis    : num  NA NA NA NA NA NA NA NA NA NA ...
>  $ AgTot    : num  0.00013 NA NA NA NA NA NA NA NA NA ...
>  $ AlDis    : num  NA NA NA NA NA NA NA NA NA NA ...
>  $ AlTot    : num  NA 0.106 NA NA NA NA NA NA NA NA ...
>  $ Alk      : num  NA NA 231 NA NA NA NA NA NA NA ...
>  $ AsDis    : num  NA NA NA NA NA NA NA NA NA NA ...
>   and so on.
>
>   I do not know if the latter is appropriate; that is, that the ceneq1,
>floor, and ceiling values are available for each site, sampdate, and
>chemical.
>
>   Is the appropriate way to use the NADA methods for analyses and plotting
>to subset each chemical separately from the 'chem' data frame? Or, is there
>a syntax other than, for example,
>
>cenboxplot(chem&Vdis, chem$ceneq1, chem$era)
>Error in cenros(obs[group == i], cen[group == i]) :
>   error in evaluating the argument 'obs' in selecting a method for function
>'ros': Error: object 'Vdis' not found
>
>   I get the same error when trying to use the 'chem.cast' data frame.
>
>Rich
>


