Ideally, imposex data are submitted as individual measurements; for example, as the Vas Deferens Sequence (VDS) of each female snail. This provides information about variation between individuals, and allows efficient statistical models to be fitted to assess trends and status. However, sometimes the data are submitted as an annual index; for example, as the Vas Deferens Sequence Index (VDSI), the arithmetic mean of the individual VDS measurements. A more ad-hoc modelling approach is then all that is possible. This help file describes the methodology for assessing time series of individual VDS measurements. Other help files describe the approach when VDS data are submitted as annual indices, or as a mixture of individual measurements and annual indices.

For some species, imposex stage or intersex stage are reported rather than VDS, again either as individuals or annual indices^{1}. These data are assessed using the same methods as for VDS.

Changes to the methodology since the 2014 assessment can be found here.

The individual VDS measurements are modelled with a proportional odds model (McCullagh & Nelder, 1989). Let \(y_{ij}\) be the VDS measurement of the \(j\)th female snail in year \(t_i, i = 1...N\), with \(y_{ij} \in \left \{ 0, ..., K \right \}\) where \(K\) is the highest possible VDS class^{2}. It is assumed that

\[\text {logit} \left( \text {Prob} (y_{ij} \le k) \right) = f(t_i) + \theta_k\]

for \(0 \le k \le K-1\), with \(\text {Prob} (y_{ij} \le K) = 1\). Here, \(f(t)\) is a function that describes how imposex levels change over time (year). Various forms of \(f(t)\) are considered and these are discussed in the next section. The \(\theta_k\) are cut points that measure the odds of being in a particular VDS class or below. Since the classes are ordered, the cut points are subject to the constraints \(\theta_0 < \theta_1 < ... < \theta_{K-1}\)^{3}.

The model is fitted by maximum likelihood. However, there are rarely sufficient data in a single time series to estimate the cut points precisely, so the cut points are first estimated from a saturated model fitted to multiple time series (described later) and are then assumed fixed and known^{4}. The only parameters estimated when fitting the proportional odds model to a single time series are thus the parameters of \(f(t)\). Parameter standard errors are estimated from the Hessian matrix^{5}.

McCullagh P & Nelder JA, 1989. Generalized Linear Models (second edition). Chapman & Hall, London.

Several different candidate forms of \(f(t)\) are considered, depending on the length of the time series. As well as linear logistic and smooth trends, change-point models are considered. These are motivated by the patterns seen in many time series, where there are steep declines in VDS levels starting in the mid 2000s. These changes coincide with the introduction of EC Regulation 782/2003, which implemented the provisions of the International Maritime Organisation’s Antifouling Systems Convention (IMO, 2001) prohibiting application of TBT surface coatings to all vessels by 2003, and the global ban on TBT which came into force in September 2008. The steep declines usually cannot be described adequately by linear logisitic models, or even smoothers. However, change-point models provide a way of capturing the steep decline with relatively few parameters. The years 2004, 2005, 2006, 2007 and 2008 are regarded as potential change-years, since the environmental response to the TBT measures is likely to have started in this period.

Intuitively, the complexity of the candidate forms of \(f(t)\) should be based on the number of years of data, \(N\). For example, with 8 years of data, one might consider a linear model \(f(t) = \mu + \beta t\) and a smoother \(f(t) = s(t)\) on 2 degrees of freedom (df) (analogous to the models fitted to contaminant time series). However, this runs into numerical difficulties when a time series starts with a series of years in which all VDS measurements equal the maximum value \(K\), or ends with a series of years in which all VDS measurements equal 0, as the amount of information in the data for estimating \(f(t)\) is then reduced^{6}. Instead, the candidate forms of \(f(t)\) are based on \(N_\text{mid}\), an approximate measure of the number of years of data that contain information about changes in VDS levels. Loosely, \(N_\text{mid}\) is the number of years in the ‘middle’ of the timeseries, where intermediate VDS levels are observed. Formally, \(N_\text{mid}\) is defined as follows. Let \(I_i\) = 1 if all the VDS measurements in year \(t_i\) equal \(K\), -1 if all the VDS measurements in year \(t_i\) equal 0, and 0 otherwise. Let

\[i_1 = \begin{cases} 1, & \text{if } I_1 < 1 \\ N, & \text{if } I_i = 1 \forall i \\ \text {min} \left \{ i: I_{i+1} < 1 \right \}, & \text{otherwise} \end{cases}\]

Similarly, let

\[i_2 = \begin{cases} N, & \text{if } I_N > -1 \\ 1, & \text{if } I_i = -1 \forall i \\ \text {max} \left \{ i: I_{i-1} > -1 \right \}, & \text{otherwise} \end{cases}\]

Then \(\{ t_i, i = i_1, ..., i_2 \}\) are the ‘middle’ years of the time series and \(N_\text{mid} = i_2 - i_1 + 1\).

The linear and smooth candidate forms of \(f(t)\) are then

- \(N \leq 2\)
- no model is fitted
- \(N \ge 3\) and \(N_\text{mid} = 1\)
- mean model \(\text f(t) = \mu\)
- The VDS measurements in the entire time series either all equal \(K\) or all equal 0, so there is no trend.
- \(N \ge 3\) and \(N_\text{mid} = 2, 3 \text{ or } 4\)
- linear model \(\text f(t) = \mu + \beta t\)
- \(N_\text{mid} \geq 5\)
- linear model \(\text f(t) = \mu + \beta t\) and smooth model \(\text f(t) = \text s(t)\)
- Smoothers on 2 degrees of freedom (df) are considered when \(5 \leq N_\text{mid} \leq 7\), on 2 and 3 df when \(8 \leq N_\text{mid} \leq 10\) and on 2, 3, and 4 df when \(N_\text{mid} \geq 11\).

Change-point models are also considered provided that the time series starts before 2008 and that \(N \ge 3\) and \(N_\text{mid} > 1\). Each change point model is of the form

\[f(t) = \begin{cases} \mu, & \text{if } t < t_\text{change} \\ \mu + g(t), & \text{if } t \ge t_\text{change} \\ \end{cases}\]

where \(t_\text{change}\) is the change year and \(g(t_\text{change}) = 0\) to ensure \(f(t)\) is continuous. Let \(N_\text{mid}^*\) be the number of ‘middle’ years from \(t_\text{change}\) onwards; i.e. \(|\{t_i : t_i \ge t_\text{change} \text { and } i \ge = i_1 \} |\). Then, similar to above, the form of \(g(t)\) depends on \(N_\text{mid}^*\).

- \(N_\text{mid}^* = 2, 3 \text{ or } 4\)
- linear change-point model \(\text g(t) = \beta (t - t_\text{change})\)
- \(N_\text{mid}^* \geq 5\)
- linear change-point model \(\text g(t) = \beta (t - t_\text{change})\) and smooth change-point model \(\text g(t) = \text s(t)\), with \(s(t_\text{change})\) = 0
- Smoothers on 2 degrees of freedom (df) are considered when \(5 \leq N_\text{mid}^* \leq 7\), on 2 and 3 df when \(8 \leq N_\text{mid}^* \leq 10\) and on 2, 3, and 4 df when \(N_\text{mid}^* \geq 11\).

The change-point models are fitted for each change-year \(t_\text{change}\) = 2004, 2005, 2006, 2007, 2008 in turn, provided that \(t_1 < t_\text{change}\); i.e. the time series started before the change-year.

All the candidate models are fitted by maximum likelihood, with the final model chosen using AICc^{7}. For some time series, there are many candidate models and there is a danger of over-fitting the data. However, this is mitigated somewhat by the fact that the models have been tailored to patterns of change seen in so many time series. It is also preferable to overfit rather than underfit for the purposes of assessing environmental status. Linear or smooth models often overpredict VDS levels in the final monitoring year, so if these are the only models considered, status will appear to be poorer than it should be.

The cut-points \(\theta_k, k = 0, ..., K-1\) determine the probability of being in each VDS class given the underlying level of TBT contamination (represented by \(f(t)\) above). The cut-points can be thought of as measuring the eco-toxicological response of a species to TBT contamination and might reasonably be assumed to be constant over a wide area. The cut-points can therefore be estimated with good precision by fitting a ‘full’ model to the data from many time series collected over a wide area.

Suppose that, for a particular species and area, there are VDS time series at \(M\) stations. With some abuse of notation, let \(y_{mij}\) be the VDS measurement of the \(j\)th female snail in year \(t_{mi}\) from station \(m\). Then the full model is

\[\text {logit} \left( \text {Prob} (y_{mij} \le k) \right) = \mu _{mi} + \theta_k\]

where \(\mu _{mi}\) represents the underlying level of TBT contamination in year \(t_{mi}\) at station \(m\). The species area combinations used for estimating the cut-points are here.

The mean VDS class in year \(t\), denoted \(v(t)\), is

\[v(t) = \sum_{k = 0}^K k \text { Prob} \left ( Y_t = k \right )\]

where \(Y_t\) is a random variable describing the VDS class of individual snails in year \(t\). The probabilities are expressed in terms of \(f(t)\) and the cut-points through the relationships

\[\begin{align} \text { Prob} \left ( Y_t = k \right ) &= \text { Prob} \left ( Y_t \le k \right ) - \text { Prob} \left ( Y_t \le k-1 \right ) \\ &= \frac {\text {exp} (f(t) + \theta_k)} {1 + \text {exp} (f(t) + \theta_k)} - \frac {\text {exp} (f(t) + \theta_{k-1})} {1 + \text {exp} (f(t) + \theta_{k-1})} \end{align}\]

with \(\text {Prob} (Y_t \le K) = 1\) as before. The mean VDS class \(v(t)\) is then estimated by plugging the estimates of \(f(t)\) and the cut-points into these formulae.

Approximate confidence intervals on \(v(t)\) are obtained by simulating the distribution of the estimates of \(f(t)\) and the cut-points and hence the distribution of \(\hat {v(t)}\)^{8}. In particular, an upper one-sided 95% confidence limit on \(v(t)\) is the 95% ordered value of the simulated distribution of \(\hat {v(t)}\).

Environmental status and temporal trends are assessed using the model fitted to the VDS data

Environmental status is assessed by comparing the upper one-sided 95% confidence limit on the mean VDS class in the most recent monitoring year (see previous section) to the available assessment criteria. For example, if the upper confidence limit is below the Background Assessment Concentration (BAC), then the mean VDS class in the most recent monitoring year is significantly below the BAC and VDS levels are said to be ‘at background’.

No formal assessment of status is made when there are only 1 or 2 years of data. However, an ad-hoc assessment is made by computing an upper one-sided 95% confidence limit on the mean VDS class in the final monitoring year from the full model used to estimate the cut-points. This confidence limit is then compared to the assessment criteria.

Temporal trends are assessed for all time series with at least three years of data. When a linear or a linear change-point model has been fitted, the statistical significance of the temporal trend is obtained from an F test^{9} that compares the fits of the linear (change-point) model and the mean model \(f(\text{year}) = \mu\). The summary maps show a downward or upward trend if the trend is significant at the 5% significance level.

When a smooth or a smooth change-point model has been fitted, a plot of the fitted model is needed to understand the overall pattern of change. (This is available on the Raw data with assessment and Assessment pages on the right side of the summary map under Graphics.) The summary map focusses on just one aspect of the change over time: the change in \(f(t)\) in the most recent twenty monitoring years; i.e. between 2001 and 2020 (the assessment only includes data up to 2020). For this, the fitted value of the smoother in 2020 is compared to the fitted value in 2001 using a t-test, with significance assessed at the 5% level^{10}. The correlation between the two fitted values is accounted for by the t-test. If the time series does not extend to 2020, then the fitted value in the last monitoring year is used instead. Similarly, if the time series starts after 2001, the fitted value in the first monitoring year is used.

^{1} Imposex in Buccinum undatum, Neptunea antiqua, Nucella lapillus, Ocenebra erinaceus and Tritia nitida / reticulata is assessed using VDS. Imposex in Littorina littorea is assessed using intersex stage. ↩︎

^{2} \(K\) = 6 for Nucella lapillus and Ocenebra erinaceus and \(K\) = 4 for Buccinum undatum, Littorina littorea, Neptunea antiqua and Tritia nitida / reticulata. ↩︎

^{3} An additional constraint is necessary for identifiability since \(f(t)\) is also in the linear predictor. Typically, one of the intermediate cut-points is set to zero. ↩︎

^{4} Even with multiple time series, there are sometimes very few snails with VDS measurements in the highest class \(K\). This can lead to difficulties estimating the highest cut-point, so the pragmatic decision is taken to combine the upper two classes with e.g. \(K\) reducing from 6 to 5. If there are still few snails in the (new) highest class, the process is repeated with e.g. \(K\) reducing to 4. ↩︎

^{5} The variance of the parameter estimates is obtained from the Hessian matrix in the usual way. If there is any evidence of over-dispersion (see footnote 7), the variance matrix is then multiplied by the estimate of the dispersion parameter. ↩︎

^{6} If all the VDS measurements for a series of years are equal to \(K\), then there is negligible information with which to ‘anchor’ the estimates of \(f(t)\). All we know is that, on the logistic scale, \(f(t)\) could be anywhere between ‘large’ and infinite. ↩︎

^{7} AICc is a model selection criterion that gives greater protection against overfitting than AIC when the sample size is small. The usual definition of AICc is

\[\text{AICc = - 2 log likelihood} + \frac {2 p n} {n - p - 1}\]

where \(n\) is the sample size and \(p\) is the number of parameters in the model. For a VDS time series, the natural definition of the sample size is the number of years of data, \(N\). (One might consider using \(N_\text{mid}\) but things are complicated enough as it is.) Further, \(p\) is the number of parameters associated with \(f(t)\) (with the cut-points ignored). For example, the linear model has \(p\) = 2. However, there is often evidence of over-dispersion and, although AICc is then formally undefined, a sensible adjustment can be made by dividing the log likelihood by an estimate of the dispersion parameter, and extending the second term to account for the additional (dispersion) parameter. This gives

\[\text{AICc = - } \frac {\text {2 log likelihood}} {\phi} + \frac {2 (p + 1) N} {(N - p - 2)}\]

where \(\phi\) is the dispersion parameter. The value of \(\phi\) is common to all the candidate models and is estimated by fitting all the candidate models in turn and comparing each to the fit of a full model. Let \(d_i\) be the deviance (- 2 log likelihood) of candidate model \(i\) and let \(p_i\) be the corresponding number of model parameters. Further, let \(d_\text{full}\) be the deviance of a full model in which \(f(t) = \mu_t\); i.e. there is a separate parameter estimated for each year. Then the dispersion parameter for model \(i\) is estimated to be

\[\phi_i = \text{max} \left(1, \frac {d_i - d_\text{full}} {N - p_i} \right)\]

and the dispersion parameter used in the AICc calculations is \(\phi = \text{min} (\phi_i)\). If \(N \le 4\), then AICc is undefined, and AIC is used instead where

\[\text{AIC = - } \frac {\text {2 log likelihood}} {\phi} + 2p\]

^{8} The estimates of the parameters of \(f(t)\) are assumed to be normally distributed with variance obtained from the Hessian matrix of the likelihood of the individual time series data (and multiplied by the over-dispersion parameter). The estimates of the cut-points are also assumed to be normally distributed with variance obtained from the Hessian matrix of the likelihood of the multiple time series data used to estimate the cut-points. For simplicity, the two sets of estimates are assumed to be independent. Typically, 1000 realisations are simulated. ↩︎

^{9} Let \(d_\text{final}\) be the deviance (-2 log likelihood) of the linear (change-point) model, \(d_\text{mean}\) be the deviance of the mean model, and \(\phi\) be the dispersion parameter. Then \(F = (d_\text{final} - d_\text{mean}) / \phi\) is referred to an F distribution on 1 and \(N - 2\) degrees of freedom. ↩︎

^{10} The t test has \(N - p\) degrees of freedom, where \(p\) is the number of parameters in \(f(t)\). ↩︎