Defining structure

In this vignette, we assume that the experimental aim is to find the best wheat variety from a wheat field trial.

library(edibble)

Initialisation

A new design constructed using edibble must start by initialising the design object. An optional title of the design may be provided as input. This information persists as metadata in the object and is displayed in various places (e.g., print output and exported files).

When you have no data, you start by simply initialising the design object.

design("Wheat field trial")

At this point, there is nothing particularly interesting. The design object requires the user to define the experimental factor(s) as described next.

Units

At a minimum, the design requires units to be defined via set_units. In the code below, we initialise a new design object and then set a unit called “site” with 4 levels. The left-hand side (LHS) and the right-hand side (RHS) of each argument correspond to the factor name and its value, respectively. Here, the value is a single integer that denotes the number of levels of the factor. The LHS can be any name, preferably a syntactically valid one. We recommend a name that succinctly describes the factor, avoiding acronyms where reasonable. We assign this design object to the variable called demo.

demo <- design("Demo for defining units") %>% 
  set_units(site = 4)

At this point, the design is in a graph form. The print of this object shows a prettified tree that displays the title of the experiment, the factors, and their corresponding number of levels. Notice the root in this tree output corresponds to the title given in the object initialisation.

demo
#> Demo for defining units
#> └─site (4 levels)

To obtain the design table, you must call serve_table to signal that you wish the object to be transformed into the tabular form. The transformation for demo is shown below. The output is a type of tibble with one column (the “site” factor) and four rows (one for each level of site), and its entries are the levels of the factor (with names derived as “site1”, “site2”, “site3”, and “site4” here). The first line of the print output is decorated with the title of the design object, which acts as a persistent reminder of the initial input. The row just under the header shows the role of the factor, denoted by an upper case letter (here, U = unit), together with the number of levels in that factor. If the number of levels is a thousand or more, it is abbreviated with an SI prefix and rounded to the nearest whole multiple of that prefix (e.g., 1000 is shown as 1k and 1800 is shown as ~2k). The row that follows shows the class of the factor (e.g., character or numeric).

serve_table(demo)
#> # Demo for defining units 
#> # An edibble: 4 x 1
#>     site
#>   <U(4)>
#>    <chr>
#> 1  site1
#> 2  site2
#> 3  site3
#> 4  site4
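
The abbreviation rule for large level counts can be illustrated with a small base R helper. This is only a sketch of the rule as described above: abbreviate_count is a hypothetical name, it handles only the kilo prefix, and the leading “~” for rounded values is inferred from the 1800 example. It is not the function edibble uses internally.

```r
# Hypothetical helper illustrating the rule described in the text;
# this is NOT edibble's internal implementation.
abbreviate_count <- function(n) {
  # counts below a thousand are shown in full
  if (n < 1000) return(as.character(n))
  k <- n / 1000
  rounded <- round(k)
  # assumption: "~" marks a value that was changed by rounding
  prefix <- if (rounded == k) "" else "~"
  paste0(prefix, rounded, "k")
}

abbreviate_count(4)     # "4"
abbreviate_count(1000)  # "1k"
abbreviate_count(1800)  # "~2k"
```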

If particular names are desired for the levels, then the RHS value can be replaced with a character vector, as below, where the levels are named “Narrabri”, “Horsham”, “Parkes” and “Roseworthy”.

design("Character vector input demo") %>% 
  set_units(site = c("Narrabri", "Horsham", "Parkes", "Roseworthy")) %>% 
  serve_table()
#> # Character vector input demo 
#> # An edibble: 4 x 1
#>         site
#>       <U(4)>
#>        <chr>
#> 1   Narrabri
#> 2    Horsham
#> 3     Parkes
#> 4 Roseworthy

The RHS value can, in theory, be any vector. Below, the input is a numeric vector, and the corresponding column in the output table is numeric.

design("Numeric vector input demo") %>% 
  set_units(site = c(1, 2, 3, 4)) %>% 
  serve_table()
#> # Numeric vector input demo 
#> # An edibble: 4 x 1
#>     site
#>   <U(4)>
#>    <dbl>
#> 1      1
#> 2      2
#> 3      3
#> 4      4

Because a single number on the RHS is interpreted as the number of levels, a factor with a single numeric level has to be specified using lvls on the RHS.

design("Single numeric level demo") %>% 
  set_units(site = lvls(4)) %>% 
  serve_table()
#> # Single numeric level demo 
#> # An edibble: 1 x 1
#>     site
#>   <U(1)>
#>    <dbl>
#> 1      4

Multiple units

We can add more unit factors to this study. Suppose that we have 72 plots. We append another call to set_units to encode this information.

demo2 <- demo %>% 
  set_units(plot = 72)

However, we did not define the relationship between site and plot, so the design cannot be converted to the tabular form.

serve_table(demo2)
#> Error in `serve_table()`:
#> ! The graph cannot be converted to a table format.

The relationship between unit factors can be defined concurrently when defining the unit factors using helper functions. One of these helper functions is demonstrated next.

Nested units

Given that we have a wheat trial, we imagine that the site corresponds to the locations, and each location would have its own plots. The experimenter tells you that each site contains 18 plots. This nesting structure can be defined by using the helper function nested_in. With this relationship specified, the graph can be reconciled into a tabular format, as shown below.

demo %>% 
  set_units(plot = nested_in(site, 18)) %>% 
  serve_table()
#> # Demo for defining units 
#> # An edibble: 72 x 2
#>      site    plot
#>    <U(4)> <U(72)>
#>     <chr>   <chr>
#>  1  site1  plot01
#>  2  site1  plot02
#>  3  site1  plot03
#>  4  site1  plot04
#>  5  site1  plot05
#>  6  site1  plot06
#>  7  site1  plot07
#>  8  site1  plot08
#>  9  site1  plot09
#> 10  site1  plot10
#> # ℹ 62 more rows
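
To see what the nesting encodes, the same 72-row layout can be reproduced with base R alone. This is a sketch for intuition only, not how edibble builds the table; the object names (n_site, n_plot_per_site, nested) are made up for illustration.

```r
# Base R sketch (no edibble calls): 4 sites, each containing 18 distinct plots
n_site <- 4
n_plot_per_site <- 18

nested <- data.frame(
  # each site label is repeated once for every plot it contains
  site = rep(paste0("site", seq_len(n_site)), each = n_plot_per_site),
  # plots are labelled distinctly across the whole experiment
  plot = sprintf("plot%02d", seq_len(n_site * n_plot_per_site))
)

nrow(nested)        # 72
table(nested$site)  # 18 plots in every site
```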

In the above situation, the relationship between unit factors has to be known a priori, but there are situations in which the relationship only becomes apparent after the unit factors have been defined. In these situations, users can define the relationships using the functions allot_units and assign_units, which add the edges between the relevant unit nodes in the factor graph and the level graph, respectively.

demo2 %>% 
  allot_units(site ~ plot) %>% 
  assign_units(order = "systematic-fastest") %>% 
  serve_table()
#> # Demo for defining units 
#> # An edibble: 72 x 2
#>      site    plot
#>    <U(4)> <U(72)>
#>     <chr>   <chr>
#>  1  site1  plot01
#>  2  site2  plot02
#>  3  site3  plot03
#>  4  site4  plot04
#>  5  site1  plot05
#>  6  site2  plot06
#>  7  site3  plot07
#>  8  site4  plot08
#>  9  site1  plot09
#> 10  site2  plot10
#> # ℹ 62 more rows

The code above specifies the nested relationship of plot to site, with the assignment of levels performed systematically. Here the site levels are allocated to plots so that they vary the fastest, which is not the same systematic ordering as before. If the same result as before is desired, users can set order = "systematic-slowest", which assigns levels systematically so that the same levels are kept close together.
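
The difference between the two systematic orders can be mimicked with rep in base R. The sketch below does not call edibble; it assumes the number of plots divides evenly among the sites, and the names fastest and slowest are illustrative only.

```r
# Base R sketch of the two systematic orders for 4 sites and 72 plots
sites <- paste0("site", 1:4)
n_plot <- 72

# "fastest": the site label changes with every plot and is recycled
fastest <- rep(sites, length.out = n_plot)

# "slowest": each site label is repeated for a run of consecutive plots
slowest <- rep(sites, each = n_plot / length(sites))

head(fastest, 5)  # "site1" "site2" "site3" "site4" "site1"
head(slowest, 5)  # "site1" "site1" "site1" "site1" "site1"
```

Both vectors assign 18 plots to each site; only the pattern of assignment differs.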

Crossed units

Crop field trials are often laid out in rectangular arrays. The experimenter confirms this by telling us that each site has plots laid out in a rectangular array with 6 rows and 3 columns. We can define crossing structures using crossed_by.

design("Crossed experiment") %>% 
  set_units(row = 6,
            col = 3,
            plot = crossed_by(row, col)) %>% 
  serve_table()
#> # Crossed experiment 
#> # An edibble: 18 x 3
#>       row    col    plot
#>    <U(6)> <U(3)> <U(18)>
#>     <chr>  <chr>   <chr>
#>  1   row1   col1  plot01
#>  2   row2   col1  plot02
#>  3   row3   col1  plot03
#>  4   row4   col1  plot04
#>  5   row5   col1  plot05
#>  6   row6   col1  plot06
#>  7   row1   col2  plot07
#>  8   row2   col2  plot08
#>  9   row3   col2  plot09
#> 10   row4   col2  plot10
#> 11   row5   col2  plot11
#> 12   row6   col2  plot12
#> 13   row1   col3  plot13
#> 14   row2   col3  plot14
#> 15   row3   col3  plot15
#> 16   row4   col3  plot16
#> 17   row5   col3  plot17
#> 18   row6   col3  plot18
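
For intuition, a crossing of 6 rows and 3 columns corresponds to what expand.grid produces in base R, where the first factor varies fastest. This is a sketch only and does not describe how crossed_by is implemented; the object name crossed is made up for illustration.

```r
# Base R sketch (no edibble calls): every row is paired with every column
crossed <- expand.grid(
  row = paste0("row", 1:6),
  col = paste0("col", 1:3),
  stringsAsFactors = FALSE
)
# one plot per row-by-column combination
crossed$plot <- sprintf("plot%02d", seq_len(nrow(crossed)))

nrow(crossed)  # 18
crossed[7, ]   # row1, col2, plot07
```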

The above table does not contain information on the site. For this, we need to combine the nesting and crossing structures, as shown next.

Complex unit structures

Now, suppose that there are four sites (Narrabri, Horsham, Parkes, and Roseworthy), and the 18 plots at each site are laid out in a rectangular array of 3 rows and 6 columns. We begin by specifying the site (the highest level of the hierarchy in this structure). The dimensions of the rows and columns are specified for each site (3 rows and 6 columns). The plot is a result of crossing the row and column within each site.

complex <- design("Complex structure") %>% 
  set_units(site = c("Narrabri", "Horsham", "Parkes", "Roseworthy"),
            col = nested_in(site, 6),
            row = nested_in(site, 3),
            plot = nested_in(site, crossed_by(row, col))) 

serve_table(complex)
#> # Complex structure 
#> # An edibble: 72 x 4
#>        site     col     row    plot
#>      <U(4)> <U(24)> <U(12)> <U(72)>
#>       <chr>   <chr>   <chr>   <chr>
#>  1 Narrabri   col01   row01  plot01
#>  2 Narrabri   col01   row02  plot02
#>  3 Narrabri   col01   row03  plot03
#>  4 Narrabri   col02   row01  plot04
#>  5 Narrabri   col02   row02  plot05
#>  6 Narrabri   col02   row03  plot06
#>  7 Narrabri   col03   row01  plot07
#>  8 Narrabri   col03   row02  plot08
#>  9 Narrabri   col03   row03  plot09
#> 10 Narrabri   col04   row01  plot10
#> 11 Narrabri   col04   row02  plot11
#> 12 Narrabri   col04   row03  plot12
#> 13 Narrabri   col05   row01  plot13
#> 14 Narrabri   col05   row02  plot14
#> 15 Narrabri   col05   row03  plot15
#> 16 Narrabri   col06   row01  plot16
#> 17 Narrabri   col06   row02  plot17
#> 18 Narrabri   col06   row03  plot18
#> 19  Horsham   col07   row04  plot19
#> 20  Horsham   col07   row05  plot20
#> # ℹ 52 more rows

You may notice that the row labels for Horsham do not restart at the first row but continue from “row04”. By default, the output displays distinct labels for unit levels that are actually distinct. This safeguards against instances where the relationship between factors is lost and the analyst would have to guess which units are nested or crossed. However, nested labels may still be desirable. You can choose which factors show nested labels by passing them to the label_nested argument of serve_table. Below, nested labels are shown for row and col, while plot still shows the distinct labels.

serve_table(complex, label_nested = c(row, col))
#> # Complex structure 
#> # An edibble: 72 x 4
#>        site     col     row    plot
#>      <U(4)> <U(24)> <U(12)> <U(72)>
#>       <chr>   <chr>   <chr>   <chr>
#>  1 Narrabri    col1    row1  plot01
#>  2 Narrabri    col1    row2  plot02
#>  3 Narrabri    col1    row3  plot03
#>  4 Narrabri    col2    row1  plot04
#>  5 Narrabri    col2    row2  plot05
#>  6 Narrabri    col2    row3  plot06
#>  7 Narrabri    col3    row1  plot07
#>  8 Narrabri    col3    row2  plot08
#>  9 Narrabri    col3    row3  plot09
#> 10 Narrabri    col4    row1  plot10
#> 11 Narrabri    col4    row2  plot11
#> 12 Narrabri    col4    row3  plot12
#> 13 Narrabri    col5    row1  plot13
#> 14 Narrabri    col5    row2  plot14
#> 15 Narrabri    col5    row3  plot15
#> 16 Narrabri    col6    row1  plot16
#> 17 Narrabri    col6    row2  plot17
#> 18 Narrabri    col6    row3  plot18
#> 19  Horsham    col1    row1  plot19
#> 20  Horsham    col1    row2  plot20
#> # ℹ 52 more rows
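
The two labelling schemes are related by simple arithmetic when every site has the same number of rows. The base R sketch below works this out for the first row at Horsham; the variable names are illustrative and the calculation is not taken from edibble's source.

```r
# Base R sketch: relate distinct labels to within-site (nested) labels,
# assuming every site has the same number of rows
n_row_per_site <- 3

site_index <- 2        # Horsham is the second site
row_within_site <- 1   # first row at that site

# distinct label: rows are numbered consecutively across all sites
distinct_id <- (site_index - 1) * n_row_per_site + row_within_site
distinct_label <- sprintf("row%02d", distinct_id)

# nested label: numbering restarts at every site
nested_label <- paste0("row", row_within_site)

distinct_label  # "row04"
nested_label    # "row1"
```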

You later find that the dimensions of Narrabri and Roseworthy are larger. The experimenter tells you that there are in fact 9 columns available, and therefore 27 plots at Narrabri and Roseworthy. The number of columns can be modified according to each site, as below, where col is defined to have 9 levels at Narrabri and Roseworthy but 6 levels elsewhere.

complexd <- design("Complex structure with different dimensions") %>% 
  set_units(site = c("Narrabri", "Horsham", "Parkes", "Roseworthy"),
             col = nested_in(site, 
                      c("Narrabri", "Roseworthy") ~ 9,
                                                . ~ 6),
             row = nested_in(site, 3),
            plot = nested_in(site, crossed_by(row, col))) 

complextab <- serve_table(complexd, label_nested = everything())
table(complextab$site)
#> 
#>    Horsham   Narrabri     Parkes Roseworthy 
#>         18         27         18         27

You can see above that there are indeed nine additional plots at Narrabri and Roseworthy. The label_nested argument supports the tidyselect approach for selecting factors.
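
The plot counts can be checked by hand, since each site has 3 rows crossed with its own number of columns. A base R sketch of that arithmetic (the names n_row, n_col, and n_plot are illustrative):

```r
# Base R sketch: plots per site = rows per site x columns per site
n_row <- 3
n_col <- c(Narrabri = 9, Horsham = 6, Parkes = 6, Roseworthy = 9)

n_plot <- n_row * n_col
n_plot       # Narrabri 27, Horsham 18, Parkes 18, Roseworthy 27
sum(n_plot)  # 90 plots in total
```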

Treatments

Defining treatment factors is only necessary when designing a comparative experiment. Treatment factors can be set in a similar way to unit factors, using set_trts. Below, we define an experiment with three treatment factors: variety (a or b), fertilizer (A or B), and amount of fertilizer (0.5, 1, or 2 t/ha).

factrt <- design("Factorial treatment") %>% 
  set_trts(variety = c("a", "b"),
           fertilizer = c("A", "B"),
           amount = c(0.5, 1, 2)) 

The links between treatment factors need not be explicitly defined. It is automatically assumed that treatment factors are crossed (i.e., the resulting treatment is the combination of all treatment factors) with the full set of treatments shown via trts_table. For the above experiment, there are a total of 12 treatments with the levels given below.

trts_table(factrt)
#> # A tibble: 12 × 3
#>    variety fertilizer amount
#>    <chr>   <chr>       <dbl>
#>  1 a       A             0.5
#>  2 b       A             0.5
#>  3 a       B             0.5
#>  4 b       B             0.5
#>  5 a       A             1  
#>  6 b       A             1  
#>  7 a       B             1  
#>  8 b       B             1  
#>  9 a       A             2  
#> 10 b       A             2  
#> 11 a       B             2  
#> 12 b       B             2
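
The count of 12 follows from multiplying the number of levels of each factor (2 x 2 x 3). A base R sketch of the same full factorial, using illustrative object names and no edibble calls:

```r
# Base R sketch: a full factorial of the three treatment factors
levels_per_factor <- list(
  variety = c("a", "b"),
  fertilizer = c("A", "B"),
  amount = c(0.5, 1, 2)
)

# number of treatments is the product of the numbers of levels
n_trts <- prod(lengths(levels_per_factor))
n_trts  # 12

# all combinations, with the first factor varying fastest
trts <- expand.grid(levels_per_factor, stringsAsFactors = FALSE)
nrow(trts)  # 12
```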

The factrt object cannot be served as an edbl_table object, since the experiment defines neither the units nor how these treatments are administered to the units.

Conditional treatments

In some experiments, certain treatment factors are dependent on another treatment factor. A common example is when the dose or amount of a treatment factor is also a treatment factor. In the field trial example, we can have a case in which we administer no fertilizer to a plot. In this case, there is no point crossing with different amounts; in fact, the amount of no fertilizer should always be 0. We can specify this conditional treatment structure by describing this relationship using the helper function, conditioned_on, as below. The “.” in the LHS is a shorthand to mean all levels, except for those specified previously.

factrtc <- design("Factorial treatment with control") %>% 
  set_trts(variety = c("a", "b"),
           fertilizer = c("none", "A", "B"),
           amount = conditioned_on(fertilizer,
                                    "none" ~ 0,
                                         . ~ c(0.5, 1, 2)))

We can see below that the variety is crossed with other factors, as expected, but the amount is conditional on the fertilizer.

trts_table(factrtc)
#> # A tibble: 14 × 3
#>    variety fertilizer amount
#>    <chr>   <chr>       <dbl>
#>  1 a       none          0  
#>  2 b       none          0  
#>  3 a       A             0.5
#>  4 b       A             0.5
#>  5 a       A             1  
#>  6 b       A             1  
#>  7 a       A             2  
#>  8 b       A             2  
#>  9 a       B             0.5
#> 10 b       B             0.5
#> 11 a       B             1  
#> 12 b       B             1  
#> 13 a       B             2  
#> 14 b       B             2
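
The count of 14 treatments can be reproduced by crossing variety with only those amounts that are valid for each fertilizer. The base R sketch below is for intuition and does not reflect how conditioned_on is implemented; the names amounts, varieties, pieces, and cond_trts are illustrative.

```r
# Base R sketch: amounts allowed for each fertilizer level
amounts <- list(none = 0, A = c(0.5, 1, 2), B = c(0.5, 1, 2))
varieties <- c("a", "b")

# cross variety with the amounts valid for each fertilizer, then stack
pieces <- lapply(names(amounts), function(f) {
  expand.grid(variety = varieties,
              fertilizer = f,
              amount = amounts[[f]],
              stringsAsFactors = FALSE)
})
cond_trts <- do.call(rbind, pieces)

nrow(cond_trts)  # 14 = 2 varieties x (1 + 3 + 3) fertilizer-amount pairs
```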