sdmpredictors quickstart guide

Samuel Bosch

2016-09-26

The goal of sdmpredictors is to make environmental data commonly used for species distribution modelling (SDM), also called ecological niche modelling (ENM) or habitat suitability modelling, easy to use in R. It contains methods for getting metadata about the available environmental data for the current climate as well as for future and paleo climatic conditions. It also provides a way to download the rasters and load them into R, along with some general statistics about the different layers.

Getting the metadata

Several list_* functions are available to find out which datasets and environmental layers can be downloaded.

list_datasets

With the list_datasets function you can view all the available datasets. To list only terrestrial datasets, set the marine parameter to FALSE; to list only marine datasets, set the terrestrial parameter to FALSE.

library(sdmpredictors)

# exploring the marine datasets 
datasets <- list_datasets(terrestrial = FALSE, marine = TRUE)
dataset_code | terrestrial | marine | url | description | citation
--- | --- | --- | --- | --- | ---
Bio-ORACLE | FALSE | TRUE | http://www.oracle.ugent.be/ | Bio-ORACLE is a set of GIS rasters providing marine environmental information for global-scale applications. It offers an array of geophysical, biotic and climate data at a spatial resolution 5 arcmin (9.2 km) in the ESRI ascii format. | Tyberghein L., Verbruggen H., Pauly K., Troupin C., Mineur F. & De Clerck O. Bio-ORACLE: a global environmental dataset for marine species distribution modeling. Global Ecology and Biogeography. http://dx.doi.org/10.1111/j.1466-8238.2011.00656.x
MARSPEC | FALSE | TRUE | http://marspec.org/ | MARSPEC is a set of high resolution climatic and geophysical GIS data layers for the world ocean. Seven geophysical variables were derived from the SRTM30_PLUS high resolution bathymetry dataset. These layers characterize the horizontal orientation (aspect), slope, and curvature of the seafloor and the distance from shore. Ten bioclimatic variables were derived from NOAA’s World Ocean Atlas and NASA’s MODIS satellite imagery and characterize the inter-annual means, extremes, and variances in sea surface temperature and salinity. These variables will be useful to those interested in the spatial ecology of marine shallow-water and surface-associated pelagic organisms across the globe. Note that, in contrary to the original MARSPEC, all layers have unscaled values. | Sbrocco, EJ and Barber, PH (2013) MARSPEC: Ocean climate layers for marine spatial ecology. Ecology 94: 979. http://dx.doi.org/10.1890/12-1358.1

list_layers

Using the list_layers function we can view all layer information, filtered by datasets, terrestrial (TRUE/FALSE), marine (TRUE/FALSE) and/or whether monthly data should be included. The table below shows only the first 4 columns of the first 3 layers.

# exploring the marine layers 
layers <- list_layers(datasets)
dataset_code | layer_code | name | description
--- | --- | --- | ---
Bio-ORACLE | BO_calcite | Calcite (mean) | Calcite concentration indicates the mean concentration of calcite (CaCO3) in oceans.
Bio-ORACLE | BO_chlomax | Chlorophyll A (maximum) | Chlorophyll A concentration indicates the concentration of photosynthetic pigment chlorophyll A (the most common green chlorophyll) in oceans. Please note that in shallow water these values may reflect any kind of autotrophic biomass.
Bio-ORACLE | BO_chlomean | Chlorophyll A (mean) | Chlorophyll A concentration indicates the concentration of photosynthetic pigment chlorophyll A (the most common green chlorophyll) in oceans. Please note that in shallow water these values may reflect any kind of autotrophic biomass.
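
Since list_layers returns a data frame, ordinary data frame indexing can narrow it down before anything is downloaded. The sketch below uses a small hand-built data frame (layers_demo, a hypothetical stand-in holding only three of the columns, with values copied from the table above) so that it runs without the package. The pattern "^BO_chl" is just an illustrative choice.

```r
# Stand-in for the result of list_layers(); the real data frame has more
# columns and rows. Values are copied from the table above.
layers_demo <- data.frame(
  dataset_code = c("Bio-ORACLE", "Bio-ORACLE", "Bio-ORACLE"),
  layer_code = c("BO_calcite", "BO_chlomax", "BO_chlomean"),
  name = c("Calcite (mean)", "Chlorophyll A (maximum)", "Chlorophyll A (mean)"),
  stringsAsFactors = FALSE
)

# keep only the chlorophyll layers by matching on the layer code
chl <- layers_demo[grepl("^BO_chl", layers_demo$layer_code), ]
chl$layer_code
## [1] "BO_chlomax"  "BO_chlomean"
```

Rows selected this way from the real list_layers result can be passed on to load_layers, which is covered in the next section.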

Loading the data

load_layers

To be able to use the layers you want in R, call the load_layers function with either rows of the data frame returned by list_layers or a character vector of layer codes:

# download first two layers (BO_calcite, BO_chlomax) 
load_layers(layers[1:2,])
## class       : RasterStack 
## dimensions  : 2160, 4320, 9331200, 2  (nrow, ncol, ncell, nlayers)
## resolution  : 0.08333333, 0.08333333  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## names       : BO_calcite, BO_chlomax 
## min values  :      5e-05,      1e-02 
## max values  :   0.055976,  64.565000
# (down)load specific layers 
specific <- load_layers(c("BO_calcite", "BO_chlomax", "MS_bathy_5m"))

# to load equal area data (Behrmann equal area projection) 
equalarea <- load_layers("BO_sstmean", equalarea = TRUE)

# get the default on disc storage location
# use the datadir parameter in load_layers 
# or set the sdmpredictors_datadir option to change it 
sdmpredictors:::get_datadir(NULL) 
## [1] "D:/Swbosch/R/sdmpredictors"
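
As the comments above note, the storage location can be changed through the sdmpredictors_datadir option or the datadir parameter of load_layers. A minimal sketch, assuming a directory under tempdir() purely for illustration (my_datadir is a hypothetical name, and a real project would normally point to a persistent path):

```r
# illustrative location only; tempdir() is cleared when the R session ends
my_datadir <- file.path(tempdir(), "sdmpredictors")
dir.create(my_datadir, showWarnings = FALSE, recursive = TRUE)

# option 1: set the option once for the session
options(sdmpredictors_datadir = my_datadir)
getOption("sdmpredictors_datadir")

# option 2: pass the directory for a single call
# (not run here because it downloads data)
# load_layers("BO_calcite", datadir = my_datadir)
```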

Loading future and paleo data

Similarly to the current climate layers, the future and paleo layers can be listed with list_layers_future and list_layers_paleo:

# exploring the available future marine layers 
future <- list_layers_future(terrestrial = FALSE) 
# available scenarios 
unique(future$scenario) 
## [1] "A1B" "A2"  "B1"
unique(future$year)
## [1] 2100 2200
paleo <- list_layers_paleo(terrestrial = FALSE)
unique(paleo$epoch) 
## [1] "Last Glacial Maximum" "mid-Holocene"
unique(paleo$model_name) 
## [1] "21kya_geophysical"      "21kya_ensemble_noCCSM" 
## [3] "21kya_ensemble_adjCCSM" "6kya_Ensemble"
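
The scenario and year columns shown above can be used to narrow the future layers down with ordinary data frame subsetting. The sketch below uses a hand-built stand-in (future_demo, hypothetical) for the result of list_layers_future, with only three columns. The two B1/2100 layer codes appear elsewhere in this guide; the other two rows are made up for illustration and may not correspond to real layers.

```r
# Stand-in for the result of list_layers_future(); the last two rows are
# invented for this example.
future_demo <- data.frame(
  layer_code = c("BO_B1_2100_sstmax", "BO_B1_2100_salinity",
                 "BO_A2_2100_sstmax", "BO_B1_2200_sstmax"),
  scenario = c("B1", "B1", "A2", "B1"),
  year = c(2100, 2100, 2100, 2200),
  stringsAsFactors = FALSE
)

# keep only the layers for scenario B1 in the year 2100
b1_2100 <- subset(future_demo, scenario == "B1" & year == 2100)
b1_2100$layer_code
```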

Other functions related to layer metadata and to future and paleo layers are:

get_layers_info(c("BO_calcite","BO_B1_2100_sstmax","MS_bathy_21kya"))$common[,1:4]
##       time dataset_code        layer_code
## 69 current   Bio-ORACLE        BO_calcite
## 17  future   Bio-ORACLE BO_B1_2100_sstmax
## 3    paleo      MARSPEC    MS_bathy_21kya
##                                 name
## 69                    Calcite (mean)
## 17 Sea surface temperature (maximum)
## 3                         Bathymetry
# function to get the equivalent future layer for a current climate layer
get_future_layers(c("BO_sstmax", "BO_salinity"), 
                  scenario = "B1", 
                  year = 2100)$layer_code 
## [1] "BO_B1_2100_salinity" "BO_B1_2100_sstmax"
# function to get the equivalent paleo layer for a current climate layer
get_paleo_layers(c("MS_bathy_5m", "MS_biogeo13_sst_mean_5m"), 
                 model_name = c("21kya_geophysical", "21kya_ensemble_adjCCSM"), 
                 years_ago = 21000)$layer_code 
## [1] "MS_bathy_21kya"                    
## [2] "MS_biogeo13_sst_mean_21kya_adjCCSM"

Statistics

Two types of statistics are available for the current climate layers: per-layer summary statistics (layer_stats) and correlations between layers (layers_correlation).

# looking up statistics and correlations for marine annual layers
datasets <- list_datasets(terrestrial = FALSE, marine = TRUE)
layers <- list_layers(datasets)

# filter out monthly layers
layers <- layers[is.na(layers$month),]

layer_stats(layers)[1:2,]
##     layer_code minimum    q1 median    q3 maximum      mad      mean
## 1  BO_bathymax   -9906 -4748  -3948 -2784    2361 1361.027 -3525.239
## 2 BO_bathymean  -10494 -4876  -4098 -3005    1721 1307.653 -3675.429
##         sd     moran      geary
## 1 1651.386 0.9670128 0.01461685
## 2 1645.478 0.9706400 0.01073758
correlations <- layers_correlation(layers)

# create groups of layers where no layers in one group 
# have a correlation > 0.7 with a layer from another group 
groups <- correlation_groups(correlations, max_correlation=0.7)

# group lengths
sapply(groups, length)
##  [1]  1  7 15  1  6  3  4  1  1  3  1  1
for(group in groups) {
  if(length(group) > 1) {
    cat(paste(group, collapse =", "))
    cat("\n")
  }
}
## BO_chlomax, BO_chlomean, BO_chlomin, BO_chlorange, BO_damax, BO_damean, BO_damin
## BO_cloudmax, BO_cloudmean, BO_cloudmin, BO_parmax, BO_parmean, BO_dissox, BO_nitrate, BO_phosphate, BO_sstmax, BO_sstmean, BO_sstmin, MS_biogeo13_sst_mean_5m, MS_biogeo14_sst_min_5m, MS_biogeo15_sst_max_5m, BO_silicate
## BO_salinity, MS_biogeo08_sss_mean_5m, MS_biogeo09_sss_min_5m, MS_biogeo10_sss_max_5m, MS_biogeo11_sss_range_5m, MS_biogeo12_sss_variance_5m
## BO_sstrange, MS_biogeo16_sst_range_5m, MS_biogeo17_sst_variance_5m
## BO_bathymin, BO_bathymax, BO_bathymean, MS_bathy_5m
## MS_biogeo03_plan_curvature_5m, MS_biogeo07_concavity_5m, MS_biogeo04_profile_curvature_5m
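
The grouping rule described in the comments above (no layer in one group has a correlation above the threshold with a layer from another group) can be illustrated with base R alone. The sketch below is not the package's implementation of correlation_groups. It is one way to obtain groups with that property: treat layers as nodes, link any pair whose absolute correlation exceeds the threshold, and collect the connected components. The function name group_by_threshold, the 4 x 4 correlation matrix and its layer names a to d are all made up, and using the absolute value of the correlation is an assumption of this sketch.

```r
# made-up correlation matrix for four hypothetical layers
cors <- matrix(c(1.0, 0.9,  0.1,  0.2,
                 0.9, 1.0,  0.3,  0.1,
                 0.1, 0.3,  1.0, -0.8,
                 0.2, 0.1, -0.8,  1.0),
               nrow = 4, byrow = TRUE,
               dimnames = list(letters[1:4], letters[1:4]))

group_by_threshold <- function(cors, max_correlation = 0.7) {
  # TRUE where two layers are "linked" (assumption: absolute correlation)
  linked <- abs(cors) > max_correlation
  unassigned <- rownames(cors)
  groups <- list()
  while (length(unassigned) > 0) {
    group <- unassigned[1]
    frontier <- group
    # grow the group until no new linked layers are found
    while (length(frontier) > 0) {
      neighbours <- colnames(linked)[colSums(linked[frontier, , drop = FALSE]) > 0]
      frontier <- setdiff(neighbours, group)
      group <- c(group, frontier)
    }
    groups[[length(groups) + 1]] <- group
    unassigned <- setdiff(unassigned, group)
  }
  groups
}

demo_groups <- group_by_threshold(cors)
sapply(demo_groups, length)
## [1] 2 2
```

Here a and b end up together because their correlation is 0.9, and c and d end up together because the absolute value of their correlation is 0.8. All correlations between the two groups stay at or below 0.7.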