Assessment methodology for contaminants in biota

Overview

Time series of contaminant concentrations in biota are assessed in two stages:

1. The concentrations are log transformed and changes in the log concentrations over time are modelled using linear mixed models. The type of temporal change that is considered depends on the number of years of data:
• 1-2 years: no model is fitted because there are insufficient data
• 3-4 years: concentrations are assumed to be stable over time and the mean log concentration is estimated
• 5-6 years: a linear trend in log concentration is fitted
• 7+ years: more complex (smooth) patterns of change over time are modelled
2. The fitted models are used to assess:
• environmental status and human health status against available assessment criteria
• evidence of temporal change in contaminant levels in the last twenty years

These stages are described in more detail below. Other help files describe how the methodology is adapted when there are ‘less-than’ measurements, i.e. some concentrations are reported as below the detection limit, and missing uncertainties, i.e. the analytical variability associated with some of the concentration measurements was not reported. Changes to the methodology since the 2014 assessment can be found here.

Modelling changes in log concentration over time

The log concentrations are modelled by a linear mixed model of the form:

• response: log concentration
• fixed: f(year)
• random: year + sample + analytical

The fixed effects model describes how log concentrations change over time (year), where the form of f(year) depends on the number of years of data (described in the next paragraph). The random effects model has three components:

• year: random variation in log concentration between years. Here, year is treated as a categorical variable
• sample: random variation in log concentration between samples within years. When there is only one sample each year, this term is omitted and implicitly subsumed into the between-year variation
• analytical: random variation inherent in the chemical measurement process. This is assumed known and derived from the the ‘uncertainties’ reported with the data. Specifically, if $$u_i$$, $$i = 1...n$$, are the uncertainties associated with concentrations $$c_i$$ (expressed as the standard deviations of the concentration measurements), then the standard deviations of the log concentration measurements $$\text{log } c_i$$ are taken to be $$u_i/c_i$$. Measurements with $$u_i > c_i$$ (i.e. an analytical coefficient of variation of more than 100%) are omitted from the time series.

The model is fitted by maximum likelihood assuming each of the random effects are independent and normally distributed (on the log concentration scale)1.

The form of f(year) depends on the number of years of data:

1-2 years
no model is fitted as there are too few years for formal statistical analysis
3-4 years
mean model $$\text{f(year)} = \mu$$
there are too few years for a formal trend assessment, but the mean level is summarised by $$\mu$$ and is used to assess status
5-6 years
linear model $$\text{f(year)} = \mu + \beta\text{year}$$
log concentrations are assumed to vary linearly with time; the fitted model is used to assess status and evidence of temporal change
7+ years
smooth model $$\text{f(year) = s(year)}$$
log concentrations are assumed to vary smoothly over time; the fitted model is used to assess status and evidence of temporal change

The last case requires more explanation. When there are 7-9 years of data, both a linear model and a smoother (thin plate regression spline) on 2 degrees of freedom (df) are fitted to the data. Of these, the model chosen to make inferences about status and temporal trends is the one with the lower Akaike’s Information Criterion corrected for small sample size (AICc)2. When there are 10-14 years of data, a linear model and smoothers on 2 and 3 df are fitted, with the chosen model that with the lowest AICc. And when there are 15+ years of data, a linear model and smoothers on 2, 3, and 4 df are fitted, with model selection again based on AICc. Effectively, the data determine the amount of smoothing, with AICc providing an appropriate balance between model fit and model parsimony3.

Assessing status and temporal trends

Environmental status, human health status and temporal trends are assessed using the model fitted to the concentration data.

Environmental status and human health status are assessed by:

• calculating the upper one-sided 95% confidence limit on the fitted mean log concentration in the most recent monitoring year4
• back-transforming this to the concentration scale
• comparing the back-transformed upper confidence limit to the available assessment criteria

For example, if the back-transformed upper confidence limit is below the Background Assessment Concentration (BAC), then the median concentration in the most recent monitoring year is significantly below the BAC and concentrations are said to be ‘at background’. For an example, see Fryer & Nicholson (1999).

No formal assessment of status is made when there are only 1 or 2 years of data. However, an ad-hoc assessment is made by:

• calculating the median of the log concentration measurements in each year
• back-transforming these to the concentration scale
• comparing the back-transformed median log concentration (1 year) or the larger of the two back-transformed median log concentrations (2 years) to the assessment criteria.

Temporal trends are assessed for all time series with at least five years of data. When a linear model has been fitted (i.e. when there are 5-6 years of data, or if there are 7+ years of data and no evidence of nonlinearity), the statistical significance of the temporal trend is obtained from a likelihood ratio test5 that compares the fits of the linear model $$f(\text{year}) = \mu + \beta\text{year}$$ and the mean model $$f(\text{year}) = \mu$$. The summary maps show a downward or upward trend if the trend is significant at the 5% significance level.

When a smooth model has been fitted, a plot of the fitted model is needed to understand the overall pattern of change. (This is available in the html report that is generated by clicking on the station in the summary map.) The summary map focusses on just one aspect of the change over time: the change in concentration in the most recent twenty monitoring years; i.e. between 2002 and 2021 (the assessment only includes data up to 2021). For this, the fitted value of the smoother in 2021 is compared to the fitted value in 2002 using a t-test, with significance assessed at the 5% level. The correlation between the two fitted values is accounted for by the t-test. If the time series does not extend to 2021, then the fitted value in the last monitoring year is used instead. Similarly, if the time series starts after 2002, the fitted value in the first monitoring year is used.

Fryer RJ & Nicholson MD, 1999. Using smoothers for comprehensive assessments of contaminant time series in marine biota. ICES Journal of Marine Science 56: 779-790.

1 Such models cannot be readily fitted in the R statistical environment because the analytical variance is assumed know. Instead, the likelihood is maximised directly using the optim function. Ideally, the models should be fitted by restricted maximum likelihood (apart from when being used for likelihood ratio tests), but this has not been implemented yet. ↩︎

2 AICc is a model selection criterion that gives greater protection against overfitting than AIC when the sample size is small. For contaminant time series, small sample sizes correspond to few years of data. AICc is not formally defined for mixed models, but the usual definition is adapted to give a sensible criterion for the models considered here. The usual definition of AICc is

$\text{AICc = - 2 log likelihood} + 2 k n / (n - k - 1)$

where $$n$$ is the sample size and $$k$$ is the number of parameters in the model. For a contaminant time series, the natural definition of the sample size is the number of years of data, $$N$$, say. The number of parameters in the number of fixed effects parameters, $$k_\text{fixed}$$, plus the number of (unknown) variance parameters, $$k_\text{random}$$. For example, the linear model has $$k_\text{fixed}$$ = 2 and $$k_\text{random}$$ = 2 (or 1 if the sample variance component is subsumed into the year variance component). This suggests using

$\text{AICc = - 2 log likelihood} + 2 (k_\text{fixed} + k_\text{random}) N / (N - k_\text{fixed} - k_\text{random} - 1)$

However, the denominator now overly penalises models because the ‘sample size’ is the number of years and, whilst subtracting $$k_\text{random}$$ correctly corrects for the year variance component, it also corrects for the sample variance component which measures within-year variation. (Indeed, the denominator = 0 if $$N$$ = 5 and the linear model is fitted, or $$N$$ = 3 or 4 and the mean model is fitted). It therefore makes sense to take $$k_\text{random}$$ in the denominator to be 1, corresponding to the year variance component, giving

$\text{AICc = - 2 log likelihood} + 2 (k_\text{fixed} + k_\text{random}) N / (N - k_\text{fixed} - 2)$

The denominator is now analogous to that used in a linear model with a single normally distributed error term. The AICc is still undefined when $$N$$ = 3 and the mean model is fitted, but this doesn’t matter in practice. ↩︎

3 Methods for estimating the smoothing degrees of freedom as part of the fitting process, for example by treating the amount of smoothing as an extra variance component, are available for several classes of models. However, such methods are not implemented in R for the case when the residual variance (the $$\text{analytical}$$ variance) is known. This is a topic for future development. ↩︎

4 Approximate standard errors on the fixed effects parameter estimates are obtained from the Hessian matrix. These are used to estimate standard errors on the fitted values, with confidence intervals based on a t-distribution with $$N$$ - $$k_\text{fixed}$$ - 1 degrees of freedom. One-sided t-tests of whether the fitted value in the last monitoring year is below the assessment criteria can be found on the Statistical analysis page on the right hand side of the summary map under Graphics. The standard errors can be computed analytically (i.e. without using the Hessian), but this hasn’t been implemented yet. The degrees of freedom for the t-tests is a sensible approximation because, for time series models, the natural definition of the ‘sample size’ is $$N$$, the number of years of data (see discussion on AICc above). However, if the year variance is small compared to the other variances, the degrees of freedom might be too small leading to a loss of statistical power. This is a topic for future development. ↩︎

5 These tests have a type 1 error that is larger than the nominal value. For example, tests conducted at the 5% significance level will find ‘significant’ trends in more than 5% of time series, even when there are no trends. Using the standard error of the estimate of $$\beta$$ from a restricted maximum likelihood fit of the linear model would be one way to improve the situation. Better still would be to use the Kenward Roger modification of F tests for linear mixed models (Kenward MG & Roger JH, 1997; Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood, Biometrics 53: 983-997). ↩︎