Run full toolkit in memory
This document provides a template for running through the toolkit in a single document, retaining everything in-memory (no intermediate saving). Intermediate saving is a very simple flip of a switch, demoed in its own doc.
Load the package
library(werptoolkitr)
library(sf)
Structure
To run the toolkit, we need to provide paths to directories for input data and output data, as well as arguments for the aggregation.
One option is to do that in a parameters file, and then treat this as a parameterised notebook.
The other option is to have this be the parameterising file, so we can have a bit more text around the parameterisations. Not sure which makes more sense, but they’re not mutually exclusive, and the answer likely depends on whether we’re working interactively or want to fire off 1,000 runs.
Parameters
Directories
Input and output directories
Use the scenario_example/ directory, created to capture a very simple demonstration case of 46 gauges in three catchments for 10 years.
Normally scenario_dir should point somewhere external (though keeping it inside or alongside the hydrograph data is a good idea). But here, I’m generating test data, so I’m keeping it in the repo. I will probably change that when I move to Azure.
# Outer directory for scenario
project_dir = file.path('more_scenarios')
# Preexisting data
# Hydrographs (expected to exist already)
hydro_dir = file.path(project_dir, 'hydrographs')
# Generated data
# EWR outputs (will be created here in controller, read from here in aggregator)
ewr_results <- file.path(project_dir, 'module_output', 'EWR')
# outputs of aggregator. There may be multiple modules
agg_results <- file.path(project_dir, 'aggregator_output')
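As a minimal sketch of how these paths fit together, the following checks that the pre-existing hydrograph directory is present before anything is run. It assumes a temporary directory as a stand-in for project_dir so it can run anywhere; the demo_* names are hypothetical and not part of werptoolkitr, and the toolkit may well do its own checking.

```r
# Hypothetical stand-in for project_dir, placed in a temporary location
demo_project <- file.path(tempdir(), 'more_scenarios_demo')
demo_hydro <- file.path(demo_project, 'hydrographs')
demo_ewr <- file.path(demo_project, 'module_output', 'EWR')

# Simulate the hydrograph data that is expected to exist already
dir.create(demo_hydro, recursive = TRUE, showWarnings = FALSE)

# Pre-flight check: the input directory must exist before running
stopifnot(dir.exists(demo_hydro))

# Generated outputs are not expected to exist until the controller has run
ewr_exists <- dir.exists(demo_ewr)
```

Only the input directory is checked strictly here, since the generated directories are described above as being created during the run.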
Controller
We use the default IQQM model format and climate categorisations, though those could be passed here as well (see controller).
Control output and return
To determine what to save and what to return to the active session, use outputType and returnType, respectively. Each of them can take a list of any of 'none', 'summary', 'annual', 'all'. For this demonstration I’ll not save anything and return summary to the active session.
outputType <- list('summary')
returnType <- list('summary') # list('summary', 'all')
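A small sketch of how such a list could be checked against the four values named above before a long run. The helper check_types is hypothetical and not part of werptoolkitr; the controller may already validate its own arguments.

```r
# The four values listed in the text above
valid_types <- c('none', 'summary', 'annual', 'all')

# Hypothetical helper (not a werptoolkitr function): stop on unknown entries
check_types <- function(x) {
  bad <- setdiff(unlist(x), valid_types)
  if (length(bad) > 0) {
    stop('Unknown type(s): ', paste(bad, collapse = ', '))
  }
  invisible(TRUE)
}

# A valid multi-entry list passes silently
check_types(list('summary', 'all'))

# A misspelled entry is caught; capture the error rather than halting
typo_result <- tryCatch(check_types(list('sumary')),
                        error = function(e) 'error')
```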
Aggregator
To keep this simple, we use one aggregation list and the read_and_agg wrapper to only have to pass paths. See the more detailed documents for the different ways to specify those aggregation lists.
What to aggregate
The aggregator needs to know which set of EWR outputs to use (to navigate the directory or list structure). It should accept multiple types, but that’s not well tested, so for now just use one.
aggType <- 'summary'
We need to tell it the variable to aggregate, and any grouping variables other than the themes and spatial groups. Typically, scenario will be a grouper, but there may be others.
agg_groups <- 'scenario'
agg_var <- 'ewr_achieved'
Do we want it to return to the active session? For this demo, I’m keeping everything in the session, so set to TRUE.
aggReturn <- TRUE
How to aggregate
Fundamentally, the aggregator needs paths and two lists:
- sequence of aggregations
- sequence of aggregation functions (can be multiple per step)
Here, I’m using an interleaved list of theme and spatial aggregations (see the detailed docs for more explanation), and applying only a single aggregation function at each step for simplicity. Those steps are specified in a range of different ways to give a small taste of the flexibility here, but see the spatial and theme docs for more examples.
aggseq <- list(ewr_code = c('ewr_code_timing', 'ewr_code'),
               env_obj = c('ewr_code', "env_obj"),
               resource_plan = resource_plan_areas,
               Specific_goal = c('env_obj', "Specific_goal"),
               catchment = cewo_valleys,
               Objective = c('Specific_goal', 'Objective'),
               mdb = basin,
               target_5_year_2024 = c('Objective', 'target_5_year_2024'))
funseq <- list(c('CompensatingFactor'),
               c('ArithmeticMean'),
               c('ArithmeticMean'),
               c('ArithmeticMean'),
               rlang::quo(list(wm = ~weighted.mean(., w = area,
                                                   na.rm = TRUE))),
               c('ArithmeticMean'),
               rlang::quo(list(wm = ~weighted.mean(., w = area,
                                                   na.rm = TRUE))),
               c('ArithmeticMean'))
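To illustrate the two lists, here is a minimal base-R sketch. It assumes the aggregation and function sequences are matched by position, as the two eight-entry lists above suggest. The demo_* objects are simplified stand-ins (a character label replaces the sf polygon layer) and the numbers are made up for illustration. The second half shows how an area-weighted mean, as used for the spatial steps, can differ from a plain mean.

```r
# Simplified stand-ins for aggseq and funseq (hypothetical, not toolkit objects)
demo_aggseq <- list(ewr_code = c('ewr_code_timing', 'ewr_code'),
                    env_obj = c('ewr_code', 'env_obj'),
                    catchment = 'cewo_valleys')
demo_funseq <- list(c('CompensatingFactor'),
                    c('ArithmeticMean'),
                    c('ArithmeticMean'))

# Assumed pairing by position: one function entry per aggregation step
stopifnot(length(demo_aggseq) == length(demo_funseq))
step_table <- data.frame(
  step = names(demo_aggseq),
  fun = vapply(demo_funseq, paste, character(1), collapse = ', ')
)

# Area weighting versus a plain mean, using made-up values
achieved <- c(1, 0, NA)   # one unit has no data
area <- c(10, 30, 5)      # larger units get more weight

# With na.rm = TRUE the missing value and its weight are both dropped
wm <- weighted.mean(achieved, w = area, na.rm = TRUE)
am <- mean(achieved, na.rm = TRUE)
```

In this made-up case the large unit with a value of 0 pulls the weighted mean (0.25) below the arithmetic mean (0.5), which is the sort of difference area weighting is meant to capture when moving to larger spatial units.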
Run the toolkit
Controller
This is not actually run here, for speed. The same thing is done in a notebook for the full toolkit saving steps.
ewr_out <- prep_run_save_ewrs_R(scenario_dir = hydro_dir,
                                output_dir = project_dir,
                                outputType = outputType,
                                returnType = returnType)
Aggregator
Because the chunk above is not run, the needed EWR outputs are not available, but would be if it were run.
aggout <- read_and_agg(datpath = ewr_results,
                       type = aggType,
                       geopath = bom_basin_gauges,
                       causalpath = causal_ewr,
                       groupers = agg_groups,
                       aggCols = agg_var,
                       aggsequence = aggseq,
                       funsequence = funseq,
                       saveintermediate = TRUE,
                       namehistory = FALSE,
                       keepAllPolys = TRUE,
                       returnList = aggReturn,
                       savepath = agg_results)
Quick check
Plotting will be developed in the comparer; this is just a quick check to see that there is data. Code borrowed from theme x space demo.
This only works if we returned the output. Since the chunks above are not run, we skip this as well.
if (aggReturn) {
  # Scenario data
  scenarios <- jsonlite::read_json(file.path(hydro_dir,
                                             'scenario_metadata.json')) |>
    tibble::as_tibble() |>
    tidyr::unnest(cols = everything())
  # plot
  aggout$catchment %>%
    dplyr::filter(Specific_goal == 'All recorded fish species') %>%
    dplyr::left_join(scenarios, by = c('scenario' = 'scenario_name')) %>%
    ggplot2::ggplot() +
    ggplot2::geom_sf(data = basin) +
    ggplot2::geom_sf(ggplot2::aes(fill = ewr_achieved)) +
    ggplot2::facet_grid(.~forcats::fct_reorder(scenario, delta))
}