2. Creating landscapes (gen3sis input)

02.07.2021

This vignette shows how to create input landscape objects necessary to run gen3sis. These objects are:

To create these files, we use the function create_input_landscape which requires as arguments:

Below, we will generate the R objects required by the function create_input_landscape step by step. For this exercise, we will use the landscape of South America that we used in the introduction vignette and is a small piece of the global landscape used at the case study of the main package manuscript1. The landscape data as well as the input rasters are available at the simulation repository. Download these and define the correct path to it in order to continue.

First, load the necessary packages and set the working directory:

# first we load the required packages
library(gen3sis)
library(raster)

# next we set the working directory to the downloaded data
datapath <- "data_from_project_gen3sis_simulations"
#> Warning: package 'raster' was built under R version 4.0.5
#> Warning: package 'sp' was built under R version 4.0.5

Creating the environmental rasters bricks

Second, create the rasterbricks containing the spatio-temporal distribution of the environmental variables of interest. Here, we will use the temperature, aridity and area rasters of the 1d example of South America that has been used in the introduction vignette. We first load the three corresponding raster bricks from the hard drive.

setwd(datapath)
temperature_brick <- brick("InputRasters/SouthAmerica/temperature_rasters.grd")
aridity_brick <- brick("InputRasters/SouthAmerica/aridity_rasters.grd")
area_brick <- brick("InputRasters/SouthAmerica/area_rasters.grd")

If you want to see an example on how you can create a virtual dynamic landscape from scratch, please refer to the design_landscape vignette.

Creating a list with the environmental raster bricks

To create the input landscape for gen3sis, all environmental variables have to be stored as raster files in a named list. It is important to add the name of the environmental variable you want to use in the list, which will be used when specifying the config file. Alternatively, you can also create a list which contains all of the raster file paths on your hard drive.

We will now create a list that contains all of the layers of our raster bricks for temperature, aridity and area.

landscapes_list <- list(temp = NULL, arid = NULL, area = NULL)
for (i in 1:nlayers(temperature_brick)) {
    landscapes_list$temp <- c(landscapes_list$temp, temperature_brick[[i]])
    landscapes_list$arid <- c(landscapes_list$arid, aridity_brick[[i]])
    landscapes_list$area <- c(landscapes_list$area, area_brick[[i]])
}

Defining a cost function

Additionally, we need to define a cost function which defines the connection cost between sites. The species dispersal ability will determine if sites are reachable (i.e. if the dispersal event is higher than the connection cost).

First, we will look at a very simple case, where no additional cost is added to the geographical distances, meaning that connections costs equal distance in grid cell units multiplied by the raster resolution. For landscapes with a coordinate system (crs) the cost is defined in meters, as defined by the coordinate system. By dividing the cost function by 1000 we get the cost of traveling for 1 km. We can define a simple cost function in which the dispersal is not penalized:

cost_function_null <- function(source, habitable_src, dest, habitable_dest) {
    return(1/1000)
}
This figure shows the connection costs from one site in the middle of South America to all other sites. To travel to the site of Antarctica that is indicated with the arrow, the travelling cost is 3790. The distance matrix was computed using the very simple cost function that has been introduced before and is not adding any penalty.

This figure shows the connection costs from one site in the middle of South America to all other sites. To travel to the site of Antarctica that is indicated with the arrow, the travelling cost is 3790. The distance matrix was computed using the very simple cost function that has been introduced before and is not adding any penalty.

For a slightly more sophisticated cost function, where only sites which have environmental values (no NAs) are considered habitable and suitable terrestrial sites should have a cost penalty of one (x1), but dispersal over water sites should double (x2) the cost of dispersal.

cost_function_water <- function(source, habitable_src, dest, habitable_dest) {
    if (!all(habitable_src, habitable_dest)) {
        return(2/1000)
    } else {
        return(1/1000)
    }
}
By using a cost function which penalises the crossing of water, the cost of travelling from our cell in the middle of South America to Antarctica increases to 5204.

By using a cost function which penalises the crossing of water, the cost of travelling from our cell in the middle of South America to Antarctica increases to 5204.

Whether a specific site can be reached or not depends on the properties of the population or species.

Further useful arguments

Besides the (i) landscape list, (ii) cost function and (iii) experiment folder, we can define further arguments such as: directions, time-steps and calculate_full_distance_matrices.

The argument ‘directions’ requires an integer providing the amount of directions used to calculate distances. It can take 4, 8 or 16 as values to consider the four, eighth or 16 neighboring sites. This is the same argument used by the function transition in the gdistance package (gdistance::transition).

The argument ‘timesteps’ requires a vector of strings and is used to name the files and to name the timesteps in the landscapes.rds landscape object created by the create_input_landscape function. It is not necessary to define this argument, but it helps readability when manipulating the object.

The argument ‘calculate_full_distance_matrices’ defines whether the create_input_landscape function will create one large distance matrix for all suitable sites (calculate_full_distance_matrices=TRUE) or a list of small distance matrices for the neighboring sites specified in ‘directions’ around each suitable site (calculate_full_distance_matrices=FALSE). The full distance matrix confers faster computing speed for the model, but requires a larger storage. The local distance matrices cause a slower computing speed, but require smaller storage. The slower speed emerges form the fact that a full distance matrix is then reconstructed by concatenating local distances matrices each time-step during a simulation run.

The real geographic coordinate system can be provided which allows for the global spherical distance correction. The reference system can be set by using the ‘crs’ argument of the create_input_landscape function. Here we are using the world geodetic system (WGS84) but providing a projected coordinate reference system is also possible. If you want to ensure that all cells have the same area you might consider an equal-area projection, such as the Lambert azimuthal equal-area projection. For real landscapes with valid coordinate systems (e.g. crs=“+proj=longlat +datum=WGS84”), the distance unit is in meters (m). If you would use hypothetical landscapes, such as in the design_landscape vignette, geographic distance would be measured in units of sites multiplied by the grid cell resolution.

Create input files

The create_input_landscape function creates .rds files that will be used as input for gen3sis. The file path where the landscapes and the distance matrices of all time steps should be stored can be specified within the function.

create_input_landscape(landscapes = landscapes_list, cost_function = cost_function_water,
    directions = 8, output_directory = file.path(tempdir(), "SouthAmerica"), timesteps = paste0(seq(65,
        0, by = -1), "Ma"), crs = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs",
    calculate_full_distance_matrices = F)

Finally, do not forget to specify the landscape metadata you just created. We recommend storing it in form of a METADATA.txt file within the landscape folder. A template is automatically created by the functions create_input_landscape at the correct location. Once the input landscape is ready, the run_simulation function can start the simulation using the path of the new landscape.


  1. O. Hagen, B. Flück, F. Fopp, J.S. Cabral, F. Hartig, M. Pontarp, T.F. Rangel, L. Pellissier. (2020). GENƎSIS: the GENeral Engine for Eco-Evolutionary SImulationS on the origins of biodiversity. (in review)↩︎