---
title: "Computing Annual Trends with fasstr"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Computing Annual Trends with fasstr}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r options, include=FALSE}
knitr::opts_chunk$set(eval = nzchar(Sys.getenv("hydat_eval")))
```
```{r, include=FALSE}
library(fasstr)
```
`fasstr`, the Flow Analysis Summary Statistics Tool for R, is a set of [R](https://www.r-project.org/) functions to tidy, summarize, analyze, trend, and visualize streamflow data. This package summarizes continuous daily mean streamflow data into various daily, monthly, annual, and long-term statistics, completes trending and frequency analyses, with outputs in both table and plot formats.
This vignette documents the usage of the `compute_annual_trends()` function in `fasstr`. This vignette is a high-level adjunct to the details found in the function documentation (see `?compute_annual_trends()`). Youâ€™ll learn what arguments to provide to the function to customize your analysis, what analyses are computed, and what outputs are produced.
## Overview
Determining trends in streamflow data can provide information on potential changes in hydrological processes over time. The annual trending analysis with `fasstr` allows for customization of both the inputs and outputs. This function takes up to 107 annual streamflow metrics (calculated using various annual `fasstr` functions) and calculates prewhitened, non-parametric trends using the Mann-Kendall test performed using the [`zyp`](https://CRAN.R-project.org/package=zyp) R package. See the `zyp` [documentation](https://CRAN.R-project.org/package=zyp/zyp.pdf) for more information on the methods.
Each annual metric/time-series is analyzed for trends using trend-free prewhitening to remove lag-1 correlation (may artificially detect a significant trend). The slope of each metric over time is then estimated using the Theil-Sen approach. If the estimated slope is different from zero, then the data are detrended by the slope and the AR(1) 1s calculated for the detrended time series. The residuals and the trend are combined and then tested for significance using the Mann-Kendall trend test.
The trending function results in a list containing two tibble data frame outputs and, if selected, plot for each metric trended.
1) **Annual_Trends_Data** - data used for analysis calculated from various annual `fasstr` functions
2) **Annual_Trends_Results** - results of the `zyp` trending analysis, and includes various other statistics
3) **'Sep_Maximum'** - an example of each of 107 plots produced (one for each metric)
## Function and Data Inputs
- `compute_annual_trends()`
To determine annual trends from a daily streamflow data set, the `compute_annual_trends()` function will take daily data, either from HYDAT using the `station_number` argument or your own data frame of data using the `data` argument to complete the analysis. To complete a custom trends analysis of data please see the `zyp` functions for more information.
## Usage, Options, and Outputs
### Analysis Data
This function is provided to calculate trends on a multitude of annual metrics, as calculate by various annual and monthly `fasstr` functions. The functions will calculate metrics from each of the following functions:
- `calc_annual_stats()` - calculate annual summary statistics
- `calc_annual_cumulative_stats()` - calculate annual and seasonal cumulative flows, both volume and yield
- `calc_annual_flow_timing()` - calculate annual flow timing
- `calc_annual_lowflows()` - calculate annual lowflows
- `calc_annual_normal_days()` - calculate annual days above and below normal
- `calc_monthly_stats()` - calculate annual monthly summary statistics
While each of the different metrics have default variables for their arguments, many of them can be customized. The following table shows which arguments are used for which statistics and what the defaults are. See the documentation for more information.
Argument | Corresponding Function | Default
---------------------|----------------------|----------------------------------------------
`annual_percentiles` | `calc_annual_stats()` | `c(10,90)`
`monthly_percentiles` | `calc_monthly_stats()` | `c(10,20)`
`stats_days` | `calc_annual_stats()` & `calc_monthly_stats()` | `1`
`stats_align` | `calc_annual_stats()` & `calc_monthly_stats()` | `"right"`
`lowflow_days` | `calc_annual_lowflows()` | `c(1,3,7,30)`
`lowflow_align` | `calc_annual_lowflows()` | `"right"`
`timing_percent` | `calc_annual_flow_timing()` | `c(25,33.3,50,75)`
`normal_percentiles` | `calc_annual_normal_days()` | `c(25,75)`
With fasstr version 0.4.0, the `months` argument is now included in the trending function to specify which months of the year to include for trending. For example, selecting `months = 7:9` means that all annual and monthly statistics will be calculated just from July through September to be tested for trends. This gives the user more flexibility to trend more statistics. Since selecting months may complicate seasonal totals, seasonal yields and seasonal volumes are not included in the results if not all 12 months are selected.
#### Examples
Example with default arguments:
```{r, eval=FALSE}
compute_annual_trends(station_number = "08NM116",
zyp_method = "zhang",
start_year = 1973, end_year = 2013)
```
Example with custom arguments:
```{r, eval=FALSE}
compute_annual_trends(station_number = "08NM116",
zyp_method = "zhang",
start_year = 1973, end_year = 2013,
annual_percentiles = c(10,90),
monthly_percentiles = c(10,20),
stats_days = 1,
stats_align = "right",
lowflow_days = c(1,3,7,30),
lowflow_align = "right",
timing_percent = c(25,33,50,75),
normal_percentiles = c(25,75))
```
Example with custom months arguments that will trend data only from May through September:
```{r, eval=FALSE}
compute_annual_trends(station_number = "08NM116",
zyp_method = "zhang",
start_year = 1973, end_year = 2013,
months = 5:9)
```
This annual data is provided in the **Annual_Trends_Data** tibble objects. The following is an example of the output, including all the annual metrics and a first few years of data used for the `zyp` trends analysis:
```{r, comment=NA, echo=FALSE}
trends <- compute_annual_trends(station_number = "08NM116",
zyp_method = "zhang",
start_year = 1973, end_year = 2013)
data <- as.data.frame(trends[[1]])[,2:5]
data[2] <- round(data[2],3)
data[3] <- round(data[3],3)
data[4] <- round(data[4],3)
data
```
To provide examples of the outputs, an analysis will be completed on a Mission Creek HYDAT station from 1973 to 2013. The argument `zyp_method` is described below in the Analysis Results section:
```{r, echo=TRUE, include=TRUE}
trends_analysis <- compute_annual_trends(station_number = "08NM116",
zyp_method = "zhang",
start_year = 1973, end_year = 2013)
```
The following is an example of the outputted **Annual_Trends_Data** tibble:
```{r, echo=TRUE, comment=NA,eval=FALSE}
trends_analysis$Annual_Trends_Data
```
```{r, comment=NA, echo=FALSE}
data.frame(head(
trends_analysis$Annual_Trends_Data
))
```
### Analysis Results
To complete a trends analysis, a variable to the `zyp_method` argument must be provided, either `"zhang"` or `"yuepilon"`, designating the two different approaches to analyzing data for trends. The `zhang` method is recommended for hydrologic applications over `yuepilon` (see `zyp` documentation for more information on the differences between the two methods). After running the function, the results of the trending analysis will be outputted in the **Annual_Trends_Results** tibble data frame. See the `zyp` documentation for how to interpret the results. The results tibble contains the following columns:
Column Name | Description
--------------|---------------------------------------------------------------------
Statistic | the annual statistic used for trending
lbound | the lower bound of the trend's confidence interval (`zyp`)
trend | the Sens' slope (trend) per unit time (`zyp`)
trendp | the Sen's slope (trend) over the time period (`zyp`)
ubound | the upper bound of the trend's confidence interval (`zyp`)
tau | Kendall's tau statistic computed on the final detrended timeseries (`zyp`)
sig | Kendall's P-value computed for the final detrended timeseries (`zyp`)
nruns | the number of runs required to converge upon a trend (`zyp`)
autocor | the autocorrelation of the final detrended timeseries (`zyp`)
valid_frac | the fraction of the data which is valid (not NA) once autocorrelation is removed (`zyp`)
linear | the least squares fit trend on the same data (`zyp`)
intercept | the intercept of the Sen's slope (trend) (`zyp`)
min_year | the minimum year used in the trending
max_year | the maximum year used in the trending
n_years | the number of years with data for trending
mean | the mean of all values used for trending
median | the median of all values used for trending
min | the minimum of all values used for trending
max | the maximum of all values used for trending
The following is an example of the outputted **Annual_Trends_Results** tibble from the Mission Creek HYDAT station from 1973 to 2013:
```{r, echo=TRUE, comment=NA, eval=FALSE}
trends_analysis$Annual_Trends_Results
```
```{r, comment=NA, echo=FALSE}
data.frame(head(
trends_analysis$Annual_Trends_Results
))
```
### Annual Trends Plots
To provide the ability to visualize the trends, a time-series plot for each metric is provided when `include_plots = TRUE` (default; set it to `FALSE` to produce no plots). Each plot will show the annual value for all years and if a numerical `zyp_alpha` value, a significance level indicating the trend exists, is provided (typically `0.05`) then a trend line of the calculated Sen's Slope will also be plotted through the data. To plot no slopes, set `zyp_alpha = NA` (default) and to plot the lines regardless of significance, set `zyp_alpha = 1`. The metric name along with the significance level will be included as the title of the plot.
The following plots demonstrate examples of where the `zyp_alpha` value is set to 0.05 and Sen's Slopes trends are not and are significant, respectively.
```{r, echo = FALSE, include=FALSE}
trends <- compute_annual_trends(station_number = "08NM116",
zyp_method = "zhang", zyp_alpha = 0.05,
start_year = 1973, end_year = 2013)
```
```{r, echo=FALSE, comment=NA, fig.height = 3, fig.width = 7}
trends[[51]]
```
```{r, echo=FALSE, comment=NA, fig.height = 3, fig.width = 7}
trends$`Sep_Maximum`
```