Chapter 4 M1 model specification

This section describes the implementation of the modules in FIMS in milestone 1. For the first milestone, we implemented enough complexity to adequately test a very standard population model. For this reason, we implemented the minimum structure that can run the model described in Li et al. 2021. The FIMS at the end of milestone 1 is an age-structured integrated assessment model with two fleets (one survey, one fishery) and one sex.

4.1 Inherited functors from `TMB`

4.1.1 Atomic functions

Wherever possible, we avoid reinventing atomic functions with extant definitions in TMB. If there is a need for a new atomic function the development team can add it to TMB using the TMB_ATOMIC_VECTOR_FUNCTION() macro following the instructions here. Prototype atomic functions under development for future FIMS milestones are stored in the fims_math.hpp file in the m1-prototypes repository.

4.1.2 Statistical distributions

All of the statistical distributions needed for the first milestone of FIMS are implemented in TMB and need not be replicated. Code can be found here.

Distribution	Name	FIMS wrapper
Normal	dnorm	FIMS code
Multinomial	dmultinom	FIMS code
Lognormal	uses dnorm	FIMS code

4.1.2.1 Normal Distribution

\[f(x) = \frac{1}{\sigma\sqrt{2\pi}}\mathrm{exp}\Bigg(-\frac{(x-\mu)^2}{2\sigma^2} \Bigg),\]

where \(\mu\) is the mean of the distribution and \(\sigma^2\) is the variance.

4.1.2.2 Multinomial Distribution

For \(k\) categories and sample size, \(n\),

\[f(\underline{y}) = \frac{n!}{y_{1}!... y_{k}!}p^{y_{1}}_{1}...p^{y_{k}}_{k},\]

where \(\sum^{k}_{i=1}y_{i} = n\), \(p_{i} > 0\), and \(\sum^{k}_{i=1}p_{i} = 1\).

The mean and variance of \(y_{i}\) are respectively:

\(\mu_{i} = np_{i}\),

\(\sigma^{2}_{i} = np_{i}(1-p_{i})\)

4.1.2.3 Lognormal Distribution

\[f(x) = \frac{1.0}{ x\sigma\sqrt{2\pi} }\mathrm{exp}\Bigg(-\frac{(\mathrm{ln}(x) - \mu)^{2}}{2\sigma^{2}}\Bigg),\]

where \(\mu\) is the mean of the distribution of \(\mathrm{ln(x)}\) and \(\sigma^2\) is the variance of \(\mathrm{ln}(x)\).

4.2 Beverton-Holt recruitment function

For parity with existing stock assessment models, the first recruitment option in FIMS is the steepness parameterization of the Beverton-Holt model (Beverton and Holt, 1957),

\[R_t(S_{t-1}) =\frac{0.8R_0hS_{t-1}}{0.2R_0\phi_0(1-h) + S_{t-1}(h-0.2)}\] where \(R_t\) and \(S_t\) are mean recruitment and spawning biomass at time \(t\), \(h\) is steepness, and \(\phi_0\) is the unfished spawning biomass per recruit. The initial FIMS model implements a static spawning biomass-per-recruit function, with the ability to overload the method in the future to allow for time-variation in spawning biomass per recruit that results from variation in life-history characteristics (e.g., natural mortality, maturity, or weight-at-age). Recruitment deviations (\(r_t\)) are assumed to be normally distributed in log space with standard deviation \(\sigma_R\), \[r_t \sim N(0,\sigma_R^2)\]

Because \(r_t\) are applied as multiplicative, lognormal deviations, predictions of realized recruitment include a term for bias correction (\(\sigma^2_R/2\)). However, true \(r_t\) values are not known, but rather estimated (\(\hat{r}_t\)), and thus the bias correction applies an adjustment factor, \(b_t=\frac{E[SD(\hat{r}_{t})]^2}{\sigma_R^2}\) (Methot and Taylor, 2011). The adjusted bias correction, mean recruitment, and recruitment deviations are then used to compute realized recruitment (\(R^*_t\)), \[R^*_t=R_t\cdot\mathrm{exp}\Bigg(\hat{r}_{t}-b_t\frac{\sigma_R^2}{2}\Bigg)\] The recruitment function should take as input the values of \(S_t\), \(h\), \(R_0\), \(\phi_0\), \(\sigma_R\), and \(\hat{r}_{t}\), and return mean-unbiased (\(R_t\)) and realized (\(R^*_t\)) recruitment.

4.3 Logistic function with extensions

\[y_i=\frac{1}{1+\mathrm{exp}(-s \cdot(x_i-\nu))}\]

Where \(y_i\) is the quantity of interest (proportion mature, selected, etc.), \(x_i\) is the index (can be age or size or any other quantity), \(\nu\) is the median (inflection point), and \(s\) is the slope parameter from an alternative parameterization. Logistic functions for maturity and selectivity should inherit and extend upon the base logistic function implementation.

The parameterization for the double logistic curve is specified as

\[y_i=\frac{1.0}{ 1.0 + \mathrm{exp}(-1.0 \cdot s_1(x_i - \nu_1))} \left(1-\frac{1.0}{ 1.0 + \mathrm{exp}(-1.0 \cdot s_2 (x_i - \nu_2))} \right)\] Where \(s_1\) and and \(\nu_1\) are the slope and median (50%) parameters for the ascending limb of the curve, and \(s_2\) and and \(\nu_2\) are the slope and median parameters for the descending limb of the curve. This is currently only implemented for the selectivity module.

4.4 Catch and fishing mortality

The Baranov catch equation relates catch to instantaneous fishing and natural mortality.

\[ C_{f,a,t}=\frac{F_{f,a,t}}{F_{f,a,t}+M}\Bigg[1-\mathrm{exp}(-F_{f,a,t}-M)\Bigg]N_{a,t}\]

Where \(C_{f,a,t}\) is the catch at age \(a\) at time \(t\) for fleet \(f\), \(F_t\) is instantaneous fishing mortality, \(M\) is assumed constant over ages and time in the minimum viable assessment model, \(N_{a,t}\) is the number of age \(a\) fish at time \(t\).

\[F_{f,a,t}=\sum_{a=0}^A s_{f,a}F_t\]

\(s_{f,a}\) is selectivity at age \(a\) for fleet \(f\). Selectivity-at-age is constant over time.

Catch is in metric tons and survey is in number, so calculating catch weight (\(CW_t\)) is done as follows: \[ CW_t=\sum_{a=0}^A C_{a,t}w_a \]

Survey numbers are calculated as follows

\[I_t=q_t\sum_{a=0}^AN_{a,t}\] Where \(I_t\) is the survey index and \(q_t\) is survey catchability at time \(t\).

4.5 Modeling loops

This tier associates the expected values for each population section associated with a data source to that data source using a likelihood function. These likelihood functions are then combined into an objective function that is passed to TMB.

The population loop is initialized at a user-specified age, time increment, and seasonal structure, rather than assuming ages, years, or seasons follow any pre-defined structure. Population categories will be described flexibly, such that subpopulations such as unique sexes, stocks, species, or areas can be handled identically to reduce duplication. Each subpopulation will have a unique set of attributes assigned to it, such that each subpopulation can share or have a different functional process (e.g., recruitment function, size-at-age) than a different category.

Spawning time and recruitment time are user-specified and can occur more than once per year. For the purposes of replicating model comparison project outputs, in milestone 1, all processes including spawning and recruitment occur January 1, but these should be specified via the spawn_time and recruit_time inputs into FIMS to allow for future flexibility. Spawning and recruitment timing can be input as a scalar or vector to account for multiple options.

Within the population loop, matrices denoting population properties at different partitions (age, season, sex) are translated into a single, dimension-folded index. A lookup table is computed at model start so that the dimension-folded index can be mapped to its corresponding population partition or time partition (e.g., population(sex, area, age, species, time, …)) so the programmer can understand what is happening. The model steps through each specified timestep to match the data to expected values, and population processes occur in the closest specified timestep to the user-input process timing (e.g., recruitment) across a small timestep that is a predefined constant.

4.6 Expected numbers and quantities

The expected values are calculated as follows in the population.hpp file: \[ B_t=\sum_{a=0}^AN_{a,t}w_a\] where \(B_t\) is total biomass in time \(t\), \(N\) is total numbers, \(w_a\) is weight-at-age \(a\) in kilograms.

\[N_t=\sum_{a=0}^AN_{a,t}\] where \(N_at\) is the total number of fish at age \(a\) in time \(t\).

\[UnfishedNAA_{t,0} = R0_{t}\]

Annual unfished numbers at age and unfished spawning biomass are tracked in the model assuming annual recruitment at rzero and only natural mortality. This provides a dynamic reference point that accounts for time varying rzero and M. This does not currently include recruitment deviations. \[UnfishedNAA_{t,0} = R0_{t}\] \[UnfishedNAA_{t,a} = UnfishedNAA_{t-1,a-1}exp(-M_{t-1,a-1})\] for all t>0 and numbers at age at the start of the first year are model parameter inputs.

\[ UnfishedSSB_t=\sum_{a=0}^AUnfishedNAA_{a,t}w_aFracFemale_aFracMature_a\] All spawning stock biomass values are current calculated at January 1 each year. This will be updated in future milestones.

4.7 Initial values

The initial equilibrium recruitment (\(R_{eq}\)) is calculated as follows:

\[R_{eq} = \frac{R_{0}(4h\phi_{F} - (1-h)\phi_{0})}{(5h-1)\phi_{F}} \] where \(\phi_{F}\) is the initial spawning biomass per recruitment given the initial fishing mortality \(F\).

The initial population structure at the start of the first model year is input as an estimated parameter vector of numbers at age. This allows maximum flexibility for the model to estimate non-equilibrium starting conditions. Future milestones could add an option to input a single F value used to calculate an equilibrium starting structure.

4.8 Likelihood calculations

Age composition likelihood links proportions at age from data to model using a multinomial likelihood function. The multinomial and lognormal distributions, including atomic functions are provided within TMB.

Survey index likelihood links estimated CPUE to input data CPUE in biomass using a lognormal distribution. (model.hpp)

Catch index likelihood links estimated catch to input data catch in biomass using a lognormal distribution. (model.hpp)

Age composition likelihoods link catch-at-age to expected catch-at-age using a multinomial distribution.

Recruitment deviations control the difference between expected and realized recruitment, and they follow a lognormal distribution. (recruitment_base.hpp)

4.9 Statistical Inference:

TODO: Add description detailing the statistical inference used in M1