Credibly Curious

Nick Tierney's (mostly) rstats blog

2024-05-27

{geotargets} 0.1.0

Nicholas Tierney

Categories: geospatial rstats targets packages Tags: geospatial rstats targets packages


I’m very happy to announce {geotargets} version 0.1.0! The {geotargets} package extends {targets} to work with geospatial data formats. Version 0.1.0 supports terra::vect(), terra::rast() and terra::sprc() formats. This R package is only possible due to the great work by Eric Scott and Andrew Brown. While this blog post is on my website, I want to emphasise that this project is very much a team effort.

You can download {geotargets} from the R universe like so:

install.packages("geotargets", repos = c("https://njtierney.r-universe.dev", "https://cran.r-project.org"))

What is targets? Why do I need geotargets?

The targets package is an R package for managing analytic pipelines. It means that you can write out an analysis in a specific manner, and then as you update code, it will only rerun the necessary parts. Essentially it helps you avoid running large pieces of analysis when you don’t need to. To learn more about targets, I’d highly recommend reading the {targets} manual.
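As a minimal sketch of that skipping behaviour, the pipeline below builds two toy targets, runs the pipeline twice, and then asks which targets are out of date. The target names raw_values and total are made up for illustration; tar_dir(), tar_script(), tar_make() and tar_outdated() all come from {targets}.

```r
library(targets)

outdated <- tar_dir({ # tar_dir() runs code from a temporary directory.
  tar_script({
    library(targets)
    list(
      # Hypothetical toy targets, standing in for a real analysis.
      tar_target(raw_values, c(1, 2, 3)),
      tar_target(total, sum(raw_values))
    )
  })
  tar_make() # first run: both targets are built and stored
  tar_make() # second run: nothing has changed, so both targets are skipped
  tar_outdated() # names of targets that would rerun on the next tar_make()
})

# Nothing was edited between runs, so this should be an empty character vector.
outdated
```

If you then changed the code for raw_values, both targets would be rebuilt on the next run, because total depends on raw_values.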

Let’s show an example. Say we want to read in an example raster file that ships with {terra}. We can do the following:

terra_rast_example <- system.file(
  "ex/elev.tif",
  package = "terra"
) |>
  terra::rast()

terra_rast_example
#> class       : SpatRaster 
#> dimensions  : 90, 95, 1  (nrow, ncol, nlyr)
#> resolution  : 0.008333333, 0.008333333  (x, y)
#> extent      : 5.741667, 6.533333, 49.44167, 50.19167  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : elev.tif 
#> name        : elevation 
#> min value   :       141 
#> max value   :       547

The reason we want to use {targets} here is that it saves the results, so we don’t need to compute them again. In this case the example code doesn’t take long to run, but imagine that reading in the raster was hugely expensive in time and compute, and we didn’t want to do it again. The {targets} package stores the result so we can just read it back in later, and if we run the pipeline again it will not rerun the target unless its code or data input has changed. Neat, right? Here is the equivalent code in a {targets} pipeline:

library(targets)
tar_dir({ # tar_dir() runs code from a temporary directory.
  tar_script({
    library(targets)
    list(
      tar_target(
        terra_rast_example,
        system.file("ex/elev.tif", package = "terra") |> terra::rast()
      )
    )
  })
  tar_make()
  x <- tar_read(terra_rast_example)
  x
})
#> ▶ dispatched target terra_rast_example
#> ● completed target terra_rast_example [1.196 seconds]
#> ▶ ended pipeline [1.825 seconds]
#> 
#> class       : SpatRaster
#> Error: external pointer is not valid

We get an error!

Error: external pointer is not valid

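This is a relatively common gotcha when using packages like {terra}. It comes from a limitation of the underlying C++ implementation: a SpatRaster holds a pointer to a C++ object, and that pointer does not survive being saved and read back with R’s usual serialisation, which is how {targets} stores results by default. There are specific ways to write and read these objects, such as terra::wrap() and terra::writeRaster(). See ?terra for details.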
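To make those write and read steps concrete, here is a small sketch of two round trips that {terra} itself supports: packing the raster with wrap() before saving it as an .rds file, and writing it out as a GeoTIFF. This is only an illustration of the general idea, not a description of how {geotargets} is implemented, and the temporary file paths are arbitrary.

```r
library(terra)

elev <- terra::rast(system.file("ex/elev.tif", package = "terra"))

# Option 1: pack the raster into a plain R object before serialising it.
rds_path <- tempfile(fileext = ".rds")
saveRDS(terra::wrap(elev), rds_path)
from_rds <- terra::unwrap(readRDS(rds_path))

# Option 2: write to a geospatial file format and read it back.
tif_path <- tempfile(fileext = ".tif")
terra::writeRaster(elev, tif_path)
from_tif <- terra::rast(tif_path)

# Both round trips should return a usable SpatRaster with the same cell values.
same_rds <- all.equal(terra::values(from_rds), terra::values(elev), check.attributes = FALSE)
same_tif <- all.equal(terra::values(from_tif), terra::values(elev), check.attributes = FALSE)
```

Doing this by hand inside every target quickly gets tedious, which is the gap a helper package can fill.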

But how do we use {geotargets} to help with this? It helps handle these write and read steps, so you don’t have to worry about them and can use targets as you are used to.

So instead of tar_target(), you use tar_terra_rast() to save a {terra} raster:

library(targets)
tar_dir({ # tar_dir() runs code from a temporary directory.
  tar_script({
    library(targets)
    library(geotargets)
    list(
      tar_terra_rast(
        terra_rast_example,
        system.file("ex/elev.tif", package = "terra") |> terra::rast()
      )
    )
  })
  tar_make()
  x <- tar_read(terra_rast_example)
  x
})
#> ▶ dispatched target terra_rast_example
#> ● completed target terra_rast_example [0.006 seconds]
#> ▶ ended pipeline [0.061 seconds]
#> 
#> class       : SpatRaster 
#> dimensions  : 90, 95, 1  (nrow, ncol, nlyr)
#> resolution  : 0.008333333, 0.008333333  (x, y)
#> extent      : 5.741667, 6.533333, 49.44167, 50.19167  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : terra_rast_example 
#> name        : elevation 
#> min value   :       141 
#> max value   :       547

Similarly, there are tar_terra_vect() and tar_terra_sprc() for working with vector data (such as shapefiles) and with collections of rasters created by terra::sprc(). See the README example for more information.
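As a rough sketch of the vector case, the pipeline below mirrors the raster example but swaps in tar_terra_vect(). It assumes the lux.shp example shapefile that ships with {terra}, and the target name terra_vect_example is just an illustrative choice.

```r
library(targets)

vect_example <- tar_dir({ # tar_dir() runs code from a temporary directory.
  tar_script({
    library(targets)
    library(geotargets)
    list(
      tar_terra_vect(
        terra_vect_example, # illustrative target name
        system.file("ex/lux.shp", package = "terra") |> terra::vect()
      )
    )
  })
  tar_make()
  tar_read(terra_vect_example) # returned as the value of tar_dir()
})

# Should print a SpatVector rather than an invalid pointer error.
vect_example
```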

If you’d like to see these functions being used in a more practical context, see the demo-geotargets repository.

What’s next?

We are actively developing {geotargets}, and the next release will focus on adding support for splitting rasters into tiles, preserving SpatRaster metadata, and adding support for {stars}. You can see the full list of issues for more detail on what we are working on.

Thanks

We recently received generous support from the R Consortium for our project, "{geotargets}: Enabling geospatial workflow management with {targets}", and we would like to thank them for it.

I’d also like to thank Michael Sumner, Anthony North, and Miles McBain for their helpful discussions, as well as Will Landau for writing {targets} and for being incredibly responsive and helpful with the issues and questions we raised as we wrote {geotargets}.