Maps

library(HydroBOT)
library(ggplot2)
library(dplyr)
library(sf)

Overview

This notebook provides examples of creating maps to display HydroBOT data and especially aggregated outcomes. Typically, these have quantitative fill that is spatially-relevant, whether that is a point (gauge) or polygon (SDL unit, basin, etc). The numerical values can be more difficult to ascertain precisely from maps, but they are often the clearest way to present spatial data, especially when the message is one of spatial variation. Because maps lose two dimensions for data (i.e x and y become position), we often have to reduce the number of categories we look at and rely heavily on facetting and subsetting.

As with all plots in {HydroBOT}, maps have the ability to use colour and different colour palettes to include additional information, including type of response. These settings are dealt with more completely in the bar_plots, with much of the mechanics the same for maps and lines e.g. the use of colorgroups and a list for pal_list.

With maps, we can also plot multiple layers in the foreground overlay and background underlay, though these have fewer options than the ‘primary’ layer (e.g. we cannot plot data as fill in multiple polygon layers).

Demonstration setup

As usual, we need paths to the data. We use the ‘more scenarios’ examples for all the plots, with processing as in the website workflow.

project_dir <- file.path("more_scenarios")
hydro_dir <- file.path(project_dir, "hydrographs")
agg_dir <- file.path(project_dir, "aggregator_output")

Read in the data

We read in the example data we will use for all plots.

agged_data <- readRDS(file.path(agg_dir, "achievement_aggregated.rds"))

That has all the steps in the aggregation, but most of the plots here will only use a subset to demonstrate.

To make visualisation easier, the SDL units data is given a grouping column that puts the many env_obj variables in groups defined by their first two letters, e.g. EF for Ecosystem Function. These correspond to the ‘Target’ level, but it can be useful to have the two groupings together for some examples.

If we had used multiple aggregation functions at any step, we should filter down to the one we want here, but we only used one for this example.

For simplicity here, we will only look at a small selection of the scenarios (multiplicative changes of 0.5,1, and 2). Thus, we make two small dataframes for our primary examples here.

scenarios_to_plot <- c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0")

scenarios <- yaml::read_yaml(file.path(hydro_dir, "scenario_metadata.yml")) |>
  tibble::as_tibble()

basin_to_plot <- agged_data$mdb |>
  dplyr::filter(scenario %in% scenarios_to_plot) |>
  dplyr::left_join(scenarios, by = "scenario")

# Create a grouping variable
obj_sdl_to_plot <- agged_data$sdl_units |>
  dplyr::filter(scenario %in% scenarios_to_plot) |>
  dplyr::mutate(env_group = stringr::str_extract(env_obj, "^[A-Z]+")) |>
  dplyr::arrange(env_group, env_obj) |>
  dplyr::left_join(scenarios, by = "scenario")

Standard scenario appearance

We will typically have a consistent look for the scenarios across the project, with a logical ordering and standard colours. Such standard colours are not included in the {HydroBOT} package because they are project/analysis- specific, but they could be set at project-level, e.g. in the .Rprofile, if desired.

Here, we use the special arguments refvals and refcols to make a colour palette from a standard {paletter} option (“ggsci::nrc_npg”) while setting a specific level to a specified value. We will use the codes defining the scenarios rather than the names to make plots readable.

There is a sceneorder argument to plot_outcomes() that lets us explicitly set the order of the scenarios. However, it is typically easiest to simply make the scenarios a factor, though we use the sceneorder argument here. It operates only on a column named ‘scenario’, though, so if other columns need to be ordered they should be made factors before feeding to plot_outcomes().

sceneorder <- forcats::fct_reorder(
  basin_to_plot$scenario,
  basin_to_plot$flow_multiplier
)

scene_pal <- make_pal(unique(basin_to_plot$climate_code),
  palette = "ggsci::nrc_npg",
  refvals = "E", refcols = "black"
)

scene_pal

<colors>
black #E64B35FF #4DBBD5FF

Make maps

We start with a simple map of polygon aggregations. Because maps lose two dimensions, we subset to the Waterbirds grouping to reduce dimensionality.

obj_sdl_to_plot |>
  dplyr::filter(env_group == "WB") |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    y_lab = "Proportion EWR\nachieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "env_obj",
    facet_row = "climate_code"
  )

Simple background and foreground

We can add a polygon with no mapped aesthetics to the background and foreground using underlay_list and overlay_list. Here, we add the basin shape behind the data and the gauges used originally.

# this can take a while so use a small sheet and pre-filter to make smaller
relevant_gauges <- agged_data$ewr_code |>
  filter(scenario == "climatebaseadapt0") |>
  select(gauge, geometry) |>
  distinct()

obj_sdl_to_plot |>
  dplyr::filter(env_group == "WB") |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    y_lab = "Proportion EWR\nachieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "env_obj",
    facet_row = "climate_code",
    underlay_list = list(
      underlay = basin,
      underlay_pal = "azure"
    ),
    overlay_list = list(
      overlay = relevant_gauges,
      overlay_pal = "firebrick"
    )
  )

Other spatial units

We used the sdl units (an intermediate aggregation level) above, but any geometric levels will work. First, we show with gauges as the focal spatial unit, e.g. before any spatial aggregation has occurred (but there has been both temporal and theme).

Important

If we attempt to run this as-is, it shows the value of plot_outcomes() for catching unintentional overplotting. The filtering to just NF1 is an attempt to only plot one value, but there are hidden and easily missed doubled gauges in this data. Because in the EWR tool some gauges inform several different planning units, there are more than one line for those gauges. plot_outcomes() catches this and forces us to deal with it.

agged_data$env_obj |> # for readability
  dplyr::filter(env_obj == "NF1" &
    scenario %in% scenarios_to_plot) |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    y_lab = "Arithmetic Mean",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "scenario",
    facet_row = "env_obj",
    underlay_list = list(
      underlay = "basin",
      underlay_pal = "azure"
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

Error in `test_overplotting()` at HydroBOT/R/plot_types.R:156:3:
! Trying to plot multiple values
(up to 4) in single polygons.
Something is duplicated-
do you need more facetting or filtering?

The ‘real’ fix of that issue would be to filter to a particular planning unit we wanted or investigate in more granular detail. But here, we just want a simple demonstration of plotting points, so we simply filter to the first record for each gauge.

env_obj_to_plot <- agged_data$env_obj |> # for readability
  dplyr::filter(env_obj == "NF1" &
    scenario %in% scenarios_to_plot) |> # Need to reduce dimensionality
  group_by(scenario, env_obj) |>
  filter(!duplicated(gauge)) |>
  ungroup()

env_obj_to_plot |>
  plot_outcomes(
    outcome_col = "ewr_achieved",
    y_lab = "Arithmetic Mean",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "scenario",
    facet_row = "env_obj",
    underlay_list = list(
      underlay = "basin",
      underlay_pal = "azure"
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

We can also plot the basin-scale outcomes (rather than just the shape as above).

basin_to_plot |>
  plot_outcomes(
    outcome_col = "ewr_achieved",
    y_lab = "Proportion EWR\nachieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "Target",
    facet_row = "climate_code"
  )

Multiple informative layers

In their most simple sense, underlays and overlays can be used to simply add spatial information (the basin or country, the gauge locations). In more complex cases though, information can be plotted in multiple layers. The choice of a ‘primary’ layer from which other layers are underlays or overlays can be a bit arbitrary, since each layer can plot outcome data, but the ‘primary’ layer has access to a few more processing options. It typically makes the most sense to choose a primary layer to address a targeted question and choose the under/overlays to add context.

Underlays

While we can’t have informative fill in multiple layers ({ggplot2} won’t allow multiple fills), we can have informative fill in polygons underlying point (gauge) data, because the points use color, not fill.

First, we include a fill in the underlay for SDL unit name (e.g. not a data fill, but still informative) by simply replacing the single colour in underlay_pal with a {paletteer} palette. It is hard to find palettes for the full set of catchments. We use a qualitative palette here. Using a continuous palette (e.g. grDevices::Purple-Yellow) works OK as well, but the colours aren’t very well spatially-separated.

env_obj_to_plot |>
  plot_outcomes(
    outcome_col = "ewr_achieved",
    y_lab = "All arithmetic mean",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = 1,
    facet_col = "scenario",
    facet_row = "env_obj",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    underlay_list = list(
      underlay = sdl_units,
      underlay_ycol = "SWSDLName",
      underlay_pal = "palettesForR::Muted"
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

We can use a continuous variable on the underlay fill, but have to be careful to choose palettes that don’t mask each other. It is very tempting to put e.g. gauges under sdl units and colour by the same palette to show the effect of the aggregation, but that is often confusing given the similarity in colour (and noting that because here we have to eliminate some planning units to plot the gauges, the full set of aggregated gauge values is not shown).

env_obj_to_plot |>
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::oslo"),
    facet_col = "scenario",
    facet_row = "env_obj",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    underlay_list = list(
      underlay = dplyr::filter(obj_sdl_to_plot, env_obj == "NF1"),
      underlay_ycol = "ewr_achieved",
      underlay_pal = "scico::oslo"
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

While the aggregation isn’t as obvious here, it is easier to see the different levels:

env_obj_to_plot |>
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("ggthemes::Orange-Gold"),
    facet_col = "scenario",
    facet_row = "env_obj",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    underlay_list = list(
      underlay = dplyr::filter(obj_sdl_to_plot, env_obj == "NF1"),
      underlay_ycol = "ewr_achieved",
      underlay_pal = "scico::oslo"
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

We can also have multiple levels of underlay polygons (e.g. if we want the basin under sdl units), and can use a fill palette in one of them (Figure 2) by using a list of lists for underlay_list. A similar approach also works for overlays (Figure 3). And as we’ve seen above (Figure 1), we can have both underlays and overlays.

env_obj_to_plot |>
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("ggthemes::Orange-Gold"),
    facet_col = "scenario",
    facet_row = "env_obj",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    underlay_list = list(
      list(
        underlay = "basin",
        underlay_pal = "cornsilk"
      ),
      list(
        underlay = dplyr::filter(obj_sdl_to_plot, env_obj == "NF1"),
        underlay_ycol = "ewr_achieved",
        underlay_pal = "scico::oslo"
      )
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

Overlays

Above, we used the gauge layer as ‘primary’, and included the sdl units or basin as underlays. We can also overlay, here with the sdl layer as ‘primary’.

We start with a single colour overlay to show where gauges are. This is the full set of BOM gauges, not just those that went into the aggregation (i.e. are in the EWR tool), which are the ones shown above (Figure 1).

Tip

We also show here the use of clip, which auto-clips the under or overlay data to the primary layer, and the use of just an {sf} object name for underlay_list = "basin" as shorthand when we don’t want to specify any visual properties.

obj_sdl_to_plot |>
  dplyr::filter(env_obj == "NF1") |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "scenario",
    facet_row = "env_obj",
    underlay_list = "basin",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    overlay_list = list(
      overlay = "bom_basin_gauges",
      overlay_pal = "grey40",
      clip = TRUE
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

We can give the overlay informative colours- this outcome is similar to what we’ve done above, but the amount of control we have over scaling etc differs depending on whether a layer is ‘primary’. We also use map_outlinecolor to adjust the colour of the primary layer outlines.

obj_sdl_to_plot |>
  dplyr::filter(env_obj == "NF1") |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::lapaz"),
    pal_direction = -1,
    facet_col = "scenario",
    facet_row = "env_obj",
    underlay_list = "basin",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    map_outlinecolor = "forestgreen",
    overlay_list = list(
      overlay = env_obj_to_plot,
      overlay_pal = "scico::oslo",
      overlay_ycol = "ewr_achieved",
      clip = TRUE
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

Like underlays (Figure 2), we can have multiple levels of overlays (Figure 3) by using a list of lists for overlay_list. And as we’ve seen above (Figure 1), we can have both underlays and overlays.

Here, we demonstrate a primary layer at the basin scale, overlay the sdl units with empty fill so we can see their locations, and then put gauges on with informative values.

basin_to_plot |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("scico::oslo"),
    facet_col = "Target",
    facet_row = "scenario",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    overlay_list = list(
      list(overlay = "sdl_units"),
      list(
        overlay = env_obj_to_plot,
        overlay_pal = "ggthemes::Orange-Gold",
        overlay_ycol = "ewr_achieved"
      )
    )
  ) +
  ggplot2::theme(legend.position = "bottom")

Baselining

As with the other plotting functions, we can compare to baseline using arbitrary functions, here difference to get the arithmetic change in outcomes. For both difference and relative, the limits are adjusted internally to center on the reference value (0 for difference, 1 for relative), and so using a diverging palette will make that centering clear.

obj_sdl_to_plot |>
  dplyr::filter(env_group == "WB") |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorset = "ewr_achieved",
    pal_list = list("ggthemes::Orange-Blue-White Diverging"),
    facet_col = "env_obj",
    facet_row = "scenario",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    base_list = list(
      base_lev = "climatebaseadapt0",
      comp_fun = "difference",
      group_cols = c("env_obj", "polyID")
    ),
    underlay_list = list(underlay = basin, underlay_pal = "azure")
  )

And the relative change is likely to be the most informative and appropriate in many cases. We use the add_eps argument to add a small amount (half the minimum value) to zeros, otherwise we end up taking the log of 0.

obj_sdl_to_plot |>
  dplyr::filter(env_group == "WB") |> # Need to reduce dimensionality
  plot_outcomes(
    outcome_col = "ewr_achieved",
    plot_type = "map",
    colorgroups = NULL,
    colorset = "ewr_achieved",
    pal_list = list("ggthemes::Orange-Blue-White Diverging"),
    facet_col = "env_obj",
    facet_row = "scenario",
    sceneorder = c("climatedown2adapt0", "climatebaseadapt0", "climateup2adapt0"),
    base_list = list(
      base_lev = "climatebaseadapt0",
      comp_fun = "relative",
      group_cols = c("env_obj", "polyID"),
      add_eps = "auto"
    ),
    zero_adjust = "auto",
    transoutcome = "log10",
    underlay_list = list(underlay = basin, underlay_pal = "azure")
  )