Run full workflow in memory

This document provides a template for running the full HydroBOT workflow in one place, retaining everything in memory (no intermediate saving). Given how long the EWR tool (and sometimes aggregation) takes to run, this approach is generally only suited to small sets of input data.

Turning on intermediate saving is a simple switch, demonstrated in its own doc. That doc parallels this one almost exactly, with the exception of specifying an outputType in prep_run_save_ewrs() and a savepath in read_and_agg(). Saving is generally how workflows are run for large projects, so that example is also used for the getting started demonstration.

Load the package

library(HydroBOT)

Set parameters

Directories

Input and output directories

Here, we will use the example hydrographs that come with HydroBOT.

Normally project_dir and hydro_dir would point somewhere external (and typically, having hydro_dir inside project_dir will make life easier).

hydro_dir <- system.file("extdata/testsmall/hydrographs", package = "HydroBOT")

project_dir <- file.path("test_dir")

# Generated data
# EWR outputs (will be created here in controller, read from here in aggregator)
ewr_results <- file.path(project_dir, "module_output", "EWR")

# outputs of aggregator. There may be multiple modules
agg_results <- file.path(project_dir, "aggregator_output")
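
The directory layout above can be sketched with base R alone. This is a minimal illustration, assuming a temporary folder stands in for a real project directory; the `_demo` names are hypothetical and not part of HydroBOT. Nothing is saved in the in-memory workflow, so creating the folders here only makes the layout concrete.

```r
# Hypothetical stand-in for project_dir, placed under the session temp folder
project_dir_demo <- file.path(tempdir(), "test_dir")

# Same relative layout as ewr_results and agg_results above
ewr_results_demo <- file.path(project_dir_demo, "module_output", "EWR")
agg_results_demo <- file.path(project_dir_demo, "aggregator_output")

# Create any missing directories (recursive = TRUE also creates parents)
for (d in c(ewr_results_demo, agg_results_demo)) {
  if (!dir.exists(d)) dir.create(d, recursive = TRUE)
}

# Confirm both folders now exist
dir.exists(c(ewr_results_demo, agg_results_demo))
```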

Controller

Here, we use a simple set of default arguments; see controller for a detailed treatment of the arguments and their meaning.

Control output and return

To determine what to save and what to return to the active session, use outputType and returnType, respectively. Each can take a list of any of the EWR output options (see ?prep_run_save_ewrs). For this demonstration, I save nothing and return the yearly output to the active session.

outputType <- list("none")
returnType <- list("yearly")

Aggregator

To keep this simple, we use one aggregation list and the read_and_agg wrapper to only have to pass paths. See the more detailed documents for the different ways to specify those aggregation lists.

What to aggregate

We need to tell the aggregator which variable to aggregate and any grouping variables other than time, themes, and spatial groups. Typically, scenario will be a grouper, but there may be others.

agg_var <- "ewr_achieved"
agg_groups <- "scenario"
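
To make the roles of the aggregation variable and the grouper concrete, here is a minimal base R sketch. It is not how HydroBOT aggregates internally; the toy data frame, its values, the `code_a`/`code_b` labels, and the `_demo` names are all invented for illustration. It only shows that values of the aggregation column are summarised within each level of the grouping column.

```r
# Toy data: values are invented purely for illustration
toy <- data.frame(
  scenario = c("base", "base", "up4", "up4"),
  ewr_code = c("code_a", "code_b", "code_a", "code_b"),
  ewr_achieved = c(1, 0, 1, 1)
)

# Hypothetical copies of the settings above
agg_var_demo <- "ewr_achieved"
agg_groups_demo <- "scenario"

# Mean of the aggregation variable within each level of the grouper
toy_agg <- aggregate(
  toy[agg_var_demo],
  by = toy[agg_groups_demo],
  FUN = mean
)

toy_agg
```

Without the grouper, the rows for both scenarios would be averaged together, which is why scenario is typically kept as a grouping variable.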

How to aggregate

Fundamentally, the aggregator needs paths and two lists:

sequence of aggregations
sequence of aggregation functions (can be multiple per step)

Here, I’m using an interleaved list of theme and spatial aggregations (see the detailed docs for more explanation), and applying only a single aggregation function at each step for simplicity. Those steps can be specified in a range of different ways; see the spatial and theme docs for more examples.

aggseq <- list(
  all_time = "all_time",
  ewr_code = c("ewr_code_timing", "ewr_code"),
  sdl_units = sdl_units,
  env_obj = c("ewr_code", "env_obj"),
  Target = c("env_obj", "Target"),
  mdb = basin,
  target_5_year_2024 = c("Target", "target_5_year_2024")
)

funseq <- list(
  "ArithmeticMean",
  "CompensatingFactor",
  "ArithmeticMean",
  "ArithmeticMean",
  "ArithmeticMean",
  "SpatialWeightedMean",
  "ArithmeticMean"
)
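
Because the two lists are matched by position, it can help to check that they line up before running anything. The sketch below uses base R only and assumes one funseq entry per aggseq step, as in the lists above. The `_demo` objects are hypothetical copies in which the spatial steps (sdl_units and basin) are replaced by placeholder strings so the example runs without HydroBOT loaded.

```r
# Hypothetical copy of aggseq; spatial objects replaced with placeholder strings
aggseq_demo <- list(
  all_time = "all_time",
  ewr_code = c("ewr_code_timing", "ewr_code"),
  sdl_units = "sdl_units",
  env_obj = c("ewr_code", "env_obj"),
  Target = c("env_obj", "Target"),
  mdb = "basin",
  target_5_year_2024 = c("Target", "target_5_year_2024")
)

# Copy of funseq: one function name per step
funseq_demo <- list(
  "ArithmeticMean",
  "CompensatingFactor",
  "ArithmeticMean",
  "ArithmeticMean",
  "ArithmeticMean",
  "SpatialWeightedMean",
  "ArithmeticMean"
)

# The lists are paired by position, so their lengths should match
stopifnot(length(aggseq_demo) == length(funseq_demo))

# Side-by-side view of each step and the function(s) applied to it
step_table <- data.frame(
  step = names(aggseq_demo),
  fun = vapply(funseq_demo, paste, character(1), collapse = ", ")
)

step_table
```

Reading the table row by row gives the order of operations: for example, the sixth step (mdb) is the one paired with SpatialWeightedMean.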

Run HydroBOT

Controller

To save time, this is not actually run here.

ewr_out <- prep_run_save_ewrs(
  hydro_dir = hydro_dir,
  output_parent_dir = project_dir,
  outputType = outputType,
  returnType = returnType
)

Aggregator

Because the chunk above is not run, the needed EWR outputs are not available, but would be if it were run.

aggout <- read_and_agg(
  datpath = ewr_out,
  type = "achievement",
  geopath = bom_basin_gauges,
  causalpath = causal_ewr,
  groupers = agg_groups,
  aggCols = agg_var,
  auto_ewr_PU = TRUE,
  aggsequence = aggseq,
  funsequence = funseq,
  saveintermediate = TRUE,
  namehistory = FALSE,
  keepAllPolys = FALSE,
  returnList = TRUE,
  add_max = FALSE,
  savepath = NULL
)

ℹ EWR outputs auto-grouped
• Done automatically because `auto_ewr_PU = TRUE`
• EWRs should be grouped by `SWSDLName`, `planning_unit_name`, and `gauge` until aggregated to larger spatial areas.
• Rows will collapse otherwise, silently aggregating over the wrong dimension
• Best to explicitly use `group_until` in `multi_aggregate()` or `read_and_agg()`.

ℹ EWR gauges joined to larger units pseudo-spatially.
• Done automatically because `auto_ewr_PU = TRUE`
• Non-spatial join needed because gauges may inform areas they are not within
• Best to explicitly use `pseudo_spatial = 'sdl_units'` in `multi_aggregate()` or `read_and_agg()`.

! Unmatched links in causal network
• 4 from Target to target_5_year_2024

Quick check

Plotting is considered in detail in the comparer documentation; this is just a quick check to see that there is data. The code is borrowed from the theme x space demo.

This only works if the output was returned to the session. Since the chunks above are not run, we skip this as well.

map_example <- aggout$env_obj |>
  dplyr::filter(env_obj == "NF1") |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "scenario",
    facet_row = "env_obj",
    sceneorder = c("down4", "base", "up4"),
    underlay_list = "basin"
  ) +
  ggplot2::theme(legend.position = "bottom")

map_example
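
Since the plot indexes aggout by step name (aggout$env_obj), a small guard can confirm that the step is present before plotting. This is a sketch only: `aggout_demo` is a hypothetical stand-in for the returned list, with empty data frames in place of real results, and the assumption that the list is named by aggregation step is taken from the use of aggout$env_obj above.

```r
# Hypothetical stand-in for the list returned by read_and_agg();
# real elements would hold aggregated data rather than empty data frames
aggout_demo <- list(
  ewr_code = data.frame(),
  env_obj = data.frame(),
  mdb = data.frame()
)

# Step the quick-check plot depends on
needed_step <- "env_obj"

# Only attempt the plot if that step was returned
has_step <- needed_step %in% names(aggout_demo)

if (has_step) {
  message("Step '", needed_step, "' is available for plotting.")
} else {
  message("Step '", needed_step, "' was not returned; skipping the plot.")
}
```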