# GUILDS Vignette

## Guilds Vignette

Welcome to the vignette of GUILDS. GUILDS is an R package aimed at providing the user with easy to use functions that help dealing with the Unified Neutral Theory of Biodiversity and Biogeography. More specifically, the package bundles together the sampling functions for the Neutral Model including multiple guilds that differ in dispersal ability (Janzen et al. 2015), and the Etienne sampling formula for the standard neutral model. Furthermore, the package contains functions to generate artificial datasets, and to calculate the expected abundances.

## The standard neutral model

To generate data with the standard neutral model, we can make use of the function generate.ESF, where ESF stands for the Etienne Sampling Formula. generate.ESF makes use of the urn scheme described in Etienne (2005), where the avid reader can also find more details on the Etienne Sampling Formula. The ESF combines the universal biodiversity number theta, and the universal dispersal number I. Typically, the migration rate is a bit easier to interpret, so we will make use of that first, then convert it to I, and then generate the data:

library(GUILDS)
set.seed(42)
theta = 100
m <- 0.1
J <- 10000
I <- m * (J-1) / (1 - m)
abund <- generate.ESF(theta, I, J)
abund
##   [1] 875 238 212 177 176 172 159 158 149 145 144 126 124 122 117 112 111
##  [18] 110 110 109 108 108 106 101  99  95  93  92  86  80  80  79  78  77
##  [35]  76  75  70  68  66  63  62  60  60  59  59  58  57  57  56  56  55
##  [52]  54  54  53  52  51  51  50  48  47  46  46  45  44  44  43  43  43
##  [69]  43  42  42  41  40  40  39  39  39  37  37  37  36  36  35  35  34
##  [86]  34  34  33  33  31  31  30  30  30  30  30  29  29  28  28  28  27
## [103]  26  26  26  26  26  26  26  25  25  24  24  24  24  24  24  23  22
## [120]  22  22  21  21  21  21  20  20  20  20  20  20  20  20  19  19  19
## [137]  19  18  18  18  18  17  17  17  17  17  17  16  16  16  16  16  16
## [154]  15  15  15  15  15  14  14  14  13  13  13  12  12  12  12  12  12
## [171]  12  11  11  11  11  11  11  10  10  10  10  10   9   9   9   9   9
## [188]   9   9   9   8   8   8   8   8   8   8   8   8   8   8   8   8   7
## [205]   7   7   7   7   7   7   7   7   7   7   7   6   6   6   6   6   6
## [222]   6   6   5   5   5   5   5   5   5   5   5   5   5   5   5   5   5
## [239]   5   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4   4
## [256]   4   4   4   3   3   3   3   3   3   3   3   3   3   3   3   3   3
## [273]   3   3   3   3   3   3   3   3   2   2   2   2   2   2   2   2   2
## [290]   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2   2
## [307]   2   2   2   2   1   1   1   1   1   1   1   1   1   1   1   1   1
## [324]   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
## [341]   1   1   1

The package has now generated a nice species abundance distribution. Using the function preston_plot we can visualize our abundance distribution

preston_plot(abund)

We can even do better, using the function expected.SAD we can calculate the expected number of individuals per species. Furthermore, we can easily add this to our plot:

abund.expect <- expected.SAD(theta, m, J)
preston_plot(abund, abund.expect)

We can clearly see that the simulated abundance distribution closely matches the expected distribution (the black line), although the stochastic nature of the simulation has caused some small deviations. Using the Etienne Sampling Formula we can now estimate theta and m, using maximum likelihood. Because we already know the real values (those are the values that we used to simulate the data), we can nicely crosscheck whether the Etienne Sampling Formula yields good results. We start the maximum likelihood at three different starting points, to compare convergence.

LL1 <- maxLikelihood.ESF(init_vals = c(theta, m),
abund, verbose = FALSE)
LL2 <- maxLikelihood.ESF(init_vals = c(1000, 0.001),
abund, verbose = FALSE)
LL3 <- maxLikelihood.ESF(init_vals = c(10, 0.9),
abund, verbose = FALSE)
LL1$par ## [1] 102.0694382 0.1158251 LL2$par
## [1] 102.0695157   0.1158248
LL3$par ## [1] 102.0695036 0.1158248 Regardless of the starting point, we recover the same parameter estimates that are close to the real values theta = 100 and m = 0.1 . ## Dispersal Guilds The guilds model, as described in Janzen et al. (2015), asks a slightly different question from the standard neutral model. How would species abundances look if there would be two guilds of species, where the guilds would only differ in dispersal ability? Imagine for instance a forest, where some tree species disperse using wind-aided seed dispersal, and other tree species disperse using animal-aided seed dispersal. The model assumes that the two guilds share a local community, and the metacommunity, and share the same speciation dynamics (e.g. have the same theta). Dispersal is different however, and in contrast to the standard neutral model, where we would infer migration rates, here we asses the dispersal ability: the guilds model postulates that migration (m) is the product of the dispersal ability of the species (alpha) and the species’ frequency in the metacommunity (p). ### Conditioning on Guild size In the paper, we show that in order for parameter estimation to work reliably, we have to condition our sampling formula’s on guild size. The package also contains sampling formula’s without guild size, in order for the user to reproduce our train of thought, but here we will focus solely on situations where we have conditioned our sampling formula’s on guild size, and hence we assume that the user knows the number of individuals per guild. So, without verder ado, let’s simulate some data and then apply the relevant sampling formula’s. We start by simulating data using the “D0” model, which assumes no differences in dispersal ability between the guilds (e.g. the guilds model then reduces to the standard neutral model) set.seed(666) theta <- 200 alpha_x <- 0.1 J <- 10000 J_x <- 8000 J_y <- J - J_x abund <- generate.Guilds.Cond(theta, alpha_x, alpha_x, J_x, J_y) Again, we can calculate the expected distributions, and visualize our distributions. This time however, a small part of the calculation of the expected distribution can not be treated analytically, and has to be done via the means of simulation, so a larger number of replicates means a more accurate expected distribution. Here we will use a low number of replicates for computational reasons. abund.expected <- expected.SAD.Guilds.Conditional(theta, alpha_x, alpha_x, J_x, J_y, n_replicates = 5) par(mfrow = c(1, 2)) preston_plot(abund$guildX, abund.expected$guildX, main = "Guild X") preston_plot(abund$guildY, abund.expected$guildY, main = "Guild Y") Before, we set the size of Guild X to 8000 individuals, and the size of Guild Y to 2000 individuals. From the Preston plots we notice that indeed Guild X contains more species, but that apart from that, there are not many striking differences (we assumed equal dispersal of course!) Now that we have our data, we can apply the guilds’ sampling formula, and see whether it can accruately infer parameter values. LL <- maxLikelihood.Guilds.Conditional(init_vals = c(theta, alpha_x), model = "D0", method = "simplex", sadx = abund$guildX, sady = abund$guildY, verbose = TRUE) ## 1 200 0.1 -264.720114133117 initial ## 2 200 0.1 -264.720114133117 contract outside ## 3 200 0.1 -264.720114133117 contract outside ## 4 177.386363636364 0.101958955223881 -262.581552717838 contract inside ## 5 177.386363636364 0.101958955223881 -262.581552717838 reflect ## 6 177.386363636364 0.101958955223881 -262.581552717838 contract inside ## 7 177.386363636364 0.101958955223881 -262.581552717838 contract inside ## 8 177.386363636364 0.101958955223881 -262.581552717838 contract inside ## 9 180.56640625 0.100179279384328 -262.572663265651 contract inside ## 10 180.56640625 0.100179279384328 -262.572663265651 contract outside ## 11 178.600963245739 0.101527974142957 -262.57242464638 contract inside ## 12 178.600963245739 0.101527974142957 -262.57242464638 reflect ## 13 179.959106445312 0.10039476992479 -262.571312324591 contract inside ## 14 179.959106445312 0.10039476992479 -262.571312324591 reflect ## 15 179.959106445312 0.10039476992479 -262.571312324591 contract inside ## 16 179.959106445312 0.10039476992479 -262.571312324591 contract outside ## 17 179.959106445312 0.10039476992479 -262.571312324591 contract inside ## 18 179.959106445312 0.10039476992479 -262.571312324591 reflect ## 19 179.566173119978 0.100788605746938 -262.571293180934 contract inside ## 20 179.566173119978 0.100788605746938 -262.571293180934 contract inside ## 21 179.77416144176 0.100595867878466 -262.571287136587 contract inside ## 22 179.699121043086 0.100635692260381 -262.571286344549 contract outside ## 23 179.651407181201 0.100702192908181 -262.571285235653 contract inside ## 24 179.724712776952 0.100632405231373 -262.57128363581 contract inside ## 25 179.724712776952 0.100632405231373 -262.57128363581 reflect ## 26 179.724712776952 0.100632405231373 -262.57128363581 contract inside ## 27 179.724712776952 0.100632405231373 -262.57128363581 contract inside ## 28 179.691421583583 0.100669697250047 -262.571283509207 contract inside ## 29 179.69838885522 0.100664793273036 -262.571283502699 contract inside ## 30 179.709808998177 0.100649825246457 -262.571283473881 contract inside ## 31 179.697760255141 0.100663503254897 -262.571283469832 contract inside ## 32 179.697760255141 0.100663503254897 -262.571283469832 reflect ## 33 179.70030099084 0.100659116239183 -262.571283458415 contract outside ## 34 179.704105510544 0.100654922487679 -262.571283454763 contract inside ## 35 179.704105510544 0.100654922487679 -262.571283454763 contract inside ## 36 179.70378627262 0.10065606755766 -262.571283454129 reflect ## 37 179.70378627262 0.10065606755766 -262.571283454129 contract inside ## 38 179.703490278989 0.100655947674734 -262.571283453629 contract inside ## 39 179.703490278989 0.100655947674734 -262.571283453629 contract inside ## 40 179.703490278989 0.100655947674734 -262.571283453629 contract inside ## 41 179.703139586339 0.100656522469281 -262.571283453559 contract inside ## 42 179.703139586339 0.100656522469281 -262.571283453559 contract inside ## 43 179.703377648702 0.100656165891223 -262.571283453557 contract inside ## 44 179.703126784552 0.100656442614362 -262.571283453548 reflect ## 45 179.703195901483 0.100656413361037 -262.571283453542 contract inside ## 46 179.703195901483 0.100656413361037 -262.571283453542 contract inside ## Optimization has terminated successfully. LL$par
## [1] 179.7031959   0.1006564

We see that the maximum likelihood algorithm accurately infers the value of alpha, and gets relatively close to inferring the right value of theta. If we would want to infer the rate of migration, we could, in this case, also apply the standard neutral model (after all, there were no differences in dispersal, rendering the guilds construction obsolete):

LL1 <- maxLikelihood.ESF(init_vals = c(theta, alpha_x),
abund = c(abund$guildX, abund$guildY),
verbose = FALSE)
LL1$par ## [1] 246.0581528 0.2132826 Using the ESF, we find a slightly higher estimate for theta (but notice that the mean of the two estimates becomes really close to the value we put in), and we find a higher value for m. ### Guilds model with differences in dispersal So, what happens if we assume differences in dispersal? Let’s, for fun, assume that there are two guilds (guild X and Y), where guild X is much bigger, but where species from guild X have a much lower dispersal ability: set.seed(666+42) theta <- 200 alpha_x <- 0.001 alpha_y <- 0.01 J <- 10000 J_x <- 8000 J_y <- J - J_x abund <- generate.Guilds.Cond(theta, alpha_x, alpha_y, J_x, J_y) This generates the following dataset: abund.expected <- expected.SAD.Guilds.Conditional(theta, alpha_x, alpha_y, J_x, J_y, n_replicates = 5) par(mfrow = c(1, 2)) preston_plot(abund$guildX, abund.expected$guildX, main = "Guild X") preston_plot(abund$guildY, abund.expected$guildY, main = "Guild Y") Clearly, there are now strong differences in the community distributions: the larger guild has a much broader distribution, with much less singletons than the smaller guild. Again, we can use maximum likelihood to infer the parameters from the data: LL1 <- maxLikelihood.Guilds.Conditional(init_vals = c(theta, alpha_x, alpha_y), model = "D1", method = "simplex", sadx = abund$guildX, sady = abund$guildY, verbose = TRUE) ## 1 200 0.001 0.01 -215.576379473007 initial ## 2 200 0.001 0.0104947526236882 -215.177354840684 reflect ## 3 320.606060606061 0.000972251387430628 0.0105497251374313 -214.657621684671 reflect ## 4 250.252525252525 0.00100092495375231 0.0102290521405964 -214.432218905359 contract inside ## 5 250.252525252525 0.00100092495375231 0.0102290521405964 -214.432218905359 contract inside ## 6 248.298260381594 0.000997559149820287 0.0104400346123235 -214.388555929939 contract inside ## 7 248.298260381594 0.000997559149820287 0.0104400346123235 -214.388555929939 reflect ## 8 281.257092530241 0.000980591672576865 0.0104589384834332 -214.338277175635 contract inside ## 9 281.257092530241 0.000980591672576865 0.0104589384834332 -214.338277175635 contract outside ## 10 281.257092530241 0.000980591672576865 0.0104589384834332 -214.338277175635 contract outside ## 11 281.257092530241 0.000980591672576865 0.0104589384834332 -214.338277175635 contract inside ## 12 269.812622755331 0.000956124112197168 0.0107133224475725 -214.268426548817 expand ## 13 258.829576630503 0.000949100423065266 0.0105949275876465 -214.225758741046 expand ## 14 258.829576630503 0.000949100423065266 0.0105949275876465 -214.225758741046 reflect ## 15 242.681041380888 0.000872926151789119 0.0111295828851692 -214.154232446708 expand ## 16 242.681041380888 0.000872926151789119 0.0111295828851692 -214.154232446708 reflect ## 17 242.681041380888 0.000872926151789119 0.0111295828851692 -214.154232446708 contract inside ## 18 242.681041380888 0.000872926151789119 0.0111295828851692 -214.154232446708 contract outside ## 19 242.681041380888 0.000872926151789119 0.0111295828851692 -214.154232446708 contract inside ## 20 242.681041380888 0.000872926151789119 0.0111295828851692 -214.154232446708 contract outside ## 21 242.681041380888 0.000872926151789119 0.0111295828851692 -214.154232446708 contract inside ## 22 245.065666940604 0.000896169670215293 0.0111685099674622 -214.152657693258 expand ## 23 240.213046371479 0.000870897403973666 0.0114059371060566 -214.147303053613 expand ## 24 226.078764697484 0.000866955714338705 0.0117691436919721 -214.133636669629 expand ## 25 226.078764697484 0.000866955714338705 0.0117691436919721 -214.133636669629 reflect ## 26 220.167058633476 0.000851125102368059 0.0121256400924401 -214.132619275623 reflect ## 27 211.655687377266 0.000863215834205947 0.012368014944768 -214.126865327495 reflect ## 28 211.655687377266 0.000863215834205947 0.012368014944768 -214.126865327495 contract inside ## 29 205.094107046556 0.000852190136663376 0.012882239587373 -214.120851371611 expand ## 30 205.094107046556 0.000852190136663376 0.012882239587373 -214.120851371611 reflect ## 31 205.094107046556 0.000852190136663376 0.012882239587373 -214.120851371611 contract inside ## 32 208.262708202289 0.000865606064928349 0.0128473277279127 -214.119624422635 reflect ## 33 208.262708202289 0.000865606064928349 0.0128473277279127 -214.119624422635 contract inside ## 34 202.766399329288 0.000859048408061096 0.0130563321816033 -214.119152795254 contract outside ## 35 202.766399329288 0.000859048408061096 0.0130563321816033 -214.119152795254 contract inside ## 36 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 reflect ## 37 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 contract inside ## 38 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 reflect ## 39 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 contract outside ## 40 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 reflect ## 41 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 contract outside ## 42 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 contract inside ## 43 202.269648091563 0.000854180683794376 0.0131838059391843 -214.118220613235 contract inside ## 44 203.803284693549 0.00085720103719176 0.0130877765754275 -214.118216046192 reflect ## 45 203.803284693549 0.00085720103719176 0.0130877765754275 -214.118216046192 reflect ## 46 202.982010927649 0.000855815817097422 0.0131570884034905 -214.118209343679 contract inside ## 47 203.691479638795 0.000855829863995936 0.0131018903183343 -214.118205212709 contract outside ## 48 202.880953255781 0.000855231461611374 0.0131496955191342 -214.118190261055 contract inside ## 49 202.880953255781 0.000855231461611374 0.0131496955191342 -214.118190261055 contract inside ## 50 202.880953255781 0.000855231461611374 0.0131496955191342 -214.118190261055 contract inside ## 51 202.880953255781 0.000855231461611374 0.0131496955191342 -214.118190261055 reflect ## 52 202.880953255781 0.000855231461611374 0.0131496955191342 -214.118190261055 contract inside ## 53 202.880953255781 0.000855231461611374 0.0131496955191342 -214.118190261055 contract inside ## 54 202.880953255781 0.000855231461611374 0.0131496955191342 -214.118190261055 reflect ## 55 203.02721301981 0.000855758235862728 0.013141304845095 -214.11818864164 contract inside ## 56 203.02721301981 0.000855758235862728 0.013141304845095 -214.11818864164 contract inside ## 57 203.02721301981 0.000855758235862728 0.013141304845095 -214.11818864164 contract inside ## 58 203.02721301981 0.000855758235862728 0.013141304845095 -214.11818864164 contract inside ## 59 202.905429669316 0.000855578284573962 0.0131477585249655 -214.118188506017 contract inside ## 60 202.991575525591 0.000855551446082467 0.0131417826646082 -214.11818838707 reflect ## 61 203.009016016079 0.000855724686974044 0.0131413209510214 -214.11818834411 contract outside ## 62 202.997943378403 0.000855688187536443 0.0131424627793133 -214.118188314277 contract inside ## 63 203.046552625379 0.000855693018009496 0.0131389039349887 -214.118188248518 contract outside ## 64 203.004706432772 0.000855626705127897 0.0131413392765247 -214.1181882135 contract inside ## 65 203.004706432772 0.000855626705127897 0.0131413392765247 -214.1181882135 reflect ## 66 203.004706432772 0.000855626705127897 0.0131413392765247 -214.1181882135 contract inside ## 67 203.004706432772 0.000855626705127897 0.0131413392765247 -214.1181882135 contract inside ## 68 203.004706432772 0.000855626705127897 0.0131413392765247 -214.1181882135 contract inside ## 69 203.014777017188 0.000855653890826074 0.0131410107756224 -214.118188210491 contract inside ## 70 203.004562684612 0.000855624725261397 0.0131415279738647 -214.118188210271 contract outside ## 71 203.004562684612 0.000855624725261397 0.0131415279738647 -214.118188210271 reflect ## 72 203.005541965506 0.000855644464688305 0.01314154891766 -214.118188208012 contract outside ## 73 203.002372275543 0.000855639022671147 0.0131416304590367 -214.11818820745 contract inside ## 74 203.009467996204 0.000855644980849845 0.0131412899462381 -214.11818820699 contract inside ## 75 203.009467996204 0.000855644980849845 0.0131412899462381 -214.11818820699 contract inside ## 76 203.009467996204 0.000855644980849845 0.0131412899462381 -214.11818820699 reflect ## 77 203.004594501373 0.000855638312753417 0.0131415156887808 -214.118188206594 contract inside ## 78 203.006108715017 0.000855636538096179 0.0131414210524755 -214.1181882065 contract inside ## 79 203.006108715017 0.000855636538096179 0.0131414210524755 -214.1181882065 reflect ## 80 203.007896049703 0.000855642651199974 0.0131413525829242 -214.118188206363 contract inside ## 81 203.007896049703 0.000855642651199974 0.0131413525829242 -214.118188206363 reflect ## 82 203.008177796935 0.000855643792223798 0.0131413177335543 -214.118188206266 contract inside ## 83 203.007442454823 0.000855640213392801 0.0131413565811845 -214.11818820616 contract inside ## 84 203.007442454823 0.000855640213392801 0.0131413565811845 -214.11818820616 contract outside ## 85 203.006937972873 0.000855640497268451 0.0131413705883887 -214.118188206155 reflect ## 86 203.006937972873 0.000855640497268451 0.0131413705883887 -214.118188206155 contract inside ## 87 203.008709630252 0.000855641410558953 0.0131412547538567 -214.11818820602 expand ## 88 203.008709630252 0.000855641410558953 0.0131412547538567 -214.11818820602 contract outside ## 89 203.008709630252 0.000855641410558953 0.0131412547538567 -214.11818820602 reflect ## 90 203.008709630252 0.000855641410558953 0.0131412547538567 -214.11818820602 reflect ## 91 203.008709630252 0.000855641410558953 0.0131412547538567 -214.11818820602 reflect ## 92 203.008310605236 0.000855642108676991 0.0131412672163526 -214.118188206015 reflect ## 93 203.009730175428 0.000855642899073701 0.0131411842944577 -214.118188205987 reflect ## 94 203.009730175428 0.000855642899073701 0.0131411842944577 -214.118188205987 contract inside ## 95 203.009559356114 0.000855643444649698 0.0131411786169648 -214.118188205986 reflect ## 96 203.009559356114 0.000855643444649698 0.0131411786169648 -214.118188205986 contract inside ## 97 203.009450771712 0.000855643614621832 0.0131411948517376 -214.118188205983 reflect ## 98 203.009450771712 0.000855643614621832 0.0131411948517376 -214.118188205983 contract inside ## 99 203.009450771712 0.000855643614621832 0.0131411948517376 -214.118188205983 contract inside ## 100 203.009450771712 0.000855643614621832 0.0131411948517376 -214.118188205983 contract inside ## 101 203.009450771712 0.000855643614621832 0.0131411948517376 -214.118188205983 contract inside ## 102 203.009311762459 0.000855643368402712 0.0131412001328795 -214.118188205982 reflect ## 103 203.009311762459 0.000855643368402712 0.0131412001328795 -214.118188205982 reflect ## 104 203.009381267086 0.000855643491512272 0.0131411974923085 -214.118188205978 shrink ## Optimization has terminated successfully. Maximum likelihood estimation shows that we can accurately infer the underlying parameters. Furthermore, we can test whether the model assuming differences in dispersal between guilds better explains the data, than the model without. We do this by also fitting the model without differences to the data, and compare the estimated likelihoods: LL2 <- maxLikelihood.Guilds.Conditional(init_vals = c(theta, alpha_x), model = "D0", method = "simplex", sadx = abund$guildX, sady = abund$guildY, verbose = FALSE) To do an honest comparison, we have to correct for the extra parameter (alpha_y) in the “D1” model, we do this by calculating the AIC value: AIC_D1 = 2 * 3 - 2 * LL1$fvalues
AIC_D0 = 2 * 2 - 2 * LL2\$fvalues
AIC_D1
## [1] 434.2364
AIC_D0
## [1] 538.7096`

Clearly, the AIC of the D0 model is much lower, indicating a much better fit - as expected.

This concludes the vignette. Go out there, find abundance data and check whether the data has guild structure! If not, use the ESF, but if you can come up with a potential guild structure, apply the conditional guilds sampling formula! Don’t forget to apply both the model with, and without differences in dispersal ability, in order to verify that the guilds really differ in dispersal ability!