AFSC Case Study Gulf of Alaska Walleye Pollock

The setup

Code
# Names of required packages
packages <- c("dplyr", "tidyr", "ggplot2", "TMB", "reshape2", "here", "remotes", "lubridate")

# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
  install.packages(packages[!installed_packages], repos = "http://cran.us.r-project.org")
}

The downloaded binary packages are in
    /var/folders/hn/5bx1f4_d4ds5vhwhkxc7vdcr0000gn/T//Rtmp9VrHzZ/downloaded_packages
Code
remotes::install_github("kaskr/TMB_contrib_R/TMBhelper")
RcppParallel (NA    -> 5.1.8     ) [CRAN]
colorspace   (2.1-0 -> 2.1-1     ) [CRAN]
numDeriv     (NA    -> 2016.8-1.1) [CRAN]
ps           (NA    -> 1.7.7     ) [CRAN]
distribut... (NA    -> 0.4.0     ) [CRAN]
tensorA      (NA    -> 0.36.2.1  ) [CRAN]
abind        (NA    -> 1.4-5     ) [CRAN]
backports    (NA    -> 1.5.0     ) [CRAN]
processx     (NA    -> 3.8.4     ) [CRAN]
desc         (NA    -> 1.4.3     ) [CRAN]
callr        (NA    -> 3.7.6     ) [CRAN]
posterior    (NA    -> 1.6.0     ) [CRAN]
matrixStats  (NA    -> 1.3.0     ) [CRAN]
checkmate    (NA    -> 2.3.1     ) [CRAN]
BH           (NA    -> 1.84.0-0  ) [CRAN]
QuickJSR     (NA    -> 1.3.1     ) [CRAN]
pkgbuild     (NA    -> 1.4.4     ) [CRAN]
loo          (NA    -> 2.8.0     ) [CRAN]
gridExtra    (NA    -> 2.3       ) [CRAN]
inline       (NA    -> 0.3.19    ) [CRAN]
StanHeaders  (NA    -> 2.32.10   ) [CRAN]
rstan        (NA    -> 2.32.6    ) [CRAN]
tmbstan      (NA    -> 1.0.91    ) [CRAN]

The downloaded binary packages are in
    /var/folders/hn/5bx1f4_d4ds5vhwhkxc7vdcr0000gn/T//Rtmp9VrHzZ/downloaded_packages
── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘/private/var/folders/hn/5bx1f4_d4ds5vhwhkxc7vdcr0000gn/T/Rtmp9VrHzZ/remotes119e6ee04f16/kaskr-TMB_contrib_R-d275e52/TMBhelper/DESCRIPTION’ ... OK
* preparing ‘TMBhelper’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘TMBhelper_1.4.0.tar.gz’
Code
remotes::install_github("NOAA-FIMS/FIMS")
colorspace (2.1-0 -> 2.1-1) [CRAN]

The downloaded binary packages are in
    /var/folders/hn/5bx1f4_d4ds5vhwhkxc7vdcr0000gn/T//Rtmp9VrHzZ/downloaded_packages
── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘/private/var/folders/hn/5bx1f4_d4ds5vhwhkxc7vdcr0000gn/T/Rtmp9VrHzZ/remotes119e5bff994d/NOAA-FIMS-FIMS-081972c/DESCRIPTION’ ... OK
* preparing ‘FIMS’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘FIMS_0.2.0.0.tar.gz’
Code
remotes::install_github("r4ss/r4ss")
systemfonts (NA    -> 1.1.0   ) [CRAN]
colorspace  (2.1-0 -> 2.1-1   ) [CRAN]
yaml        (2.3.9 -> 2.3.10  ) [CRAN]
sys         (NA    -> 3.4.2   ) [CRAN]
askpass     (NA    -> 1.2.0   ) [CRAN]
openssl     (NA    -> 2.2.0   ) [CRAN]
curl        (NA    -> 5.2.1   ) [CRAN]
parallelly  (NA    -> 1.37.1  ) [CRAN]
listenv     (NA    -> 0.9.1   ) [CRAN]
globals     (NA    -> 0.16.3  ) [CRAN]
svglite     (NA    -> 2.1.3   ) [CRAN]
rstudioapi  (NA    -> 0.16.0  ) [CRAN]
xml2        (NA    -> 1.3.6   ) [CRAN]
ini         (NA    -> 0.3.1   ) [CRAN]
httr2       (NA    -> 1.0.2   ) [CRAN]
gitcreds    (NA    -> 0.1.2   ) [CRAN]
future      (NA    -> 1.33.2  ) [CRAN]
kableExtra  (NA    -> 1.4.0   ) [CRAN]
gh          (NA    -> 1.4.1   ) [CRAN]
furrr       (NA    -> 0.3.1   ) [CRAN]
forcats     (NA    -> 1.0.0   ) [CRAN]
corpcor     (NA    -> 1.6.10  ) [CRAN]
coda        (NA    -> 0.19-4.1) [CRAN]

The downloaded binary packages are in
    /var/folders/hn/5bx1f4_d4ds5vhwhkxc7vdcr0000gn/T//Rtmp9VrHzZ/downloaded_packages
── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘/private/var/folders/hn/5bx1f4_d4ds5vhwhkxc7vdcr0000gn/T/Rtmp9VrHzZ/remotes119e7a917174/r4ss-r4ss-5be028c/DESCRIPTION’ ... OK
* preparing ‘r4ss’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘r4ss_1.49.3.tar.gz’
Code
# Load packages
invisible(lapply(packages, library, character.only = TRUE))

library(FIMS)
library(TMBhelper)

R_version <- version$version.string
TMB_version <- packageDescription("TMB")$Version
FIMS_commit <- substr(packageDescription("FIMS")$GithubSHA1, 1, 7)
Code
theme_set(theme_bw())


  • R version: R version 4.4.1 (2024-06-14)
  • TMB version: 1.9.14
  • FIMS commit: 081972c
  • Stock name: Gulf of Alaska (GOA) Walleye Pollock
  • Region: AFSC
  • Analyst: Cole Monnahan

Simplifications to the original assessment

The model presented in this case study was changed substantially from the operational version and should not be considered reflective of the pollock stock. This is intended as a demonstration and nothing more.

To make the operational model match FIMS more closely, I:

  • Dropped surveys 1, 4, and 5
  • Removed ageing error
  • Removed length compositions
  • Removed bias correction in log-normal index likelihoods
  • Simplified catchability of survey 3 to be constant in time (removed random walk)
  • Updated maturity to be parametric rather than empirical
  • Used constant weight at age for all sources: spawning, fishery, surveys, and biomass calculations. The same matrix was used throughout.
  • Changed timing to be Jan 1 for spawning and all surveys
  • Removed prior on catchability for survey 2
  • Removed time-varying fisheries selectivity (constant double logistic)
  • Removed normalization of selectivity
  • Removed age accumulation for fishery age compositions

Script to prepare data for building FIMS object

Code
## define the dimensions and global variables
years <- 1970:2023
nyears <- length(years)
nseasons <- 1
nages <- 10
ages <- 1:nages
## nfleets <- 1
## This will fit the models bridging to FIMS (simplifying)
## source("fit_bridge_models.R")
## compare changes to model
pkfitfinal <- readRDS("data_files/pkfitfinal.RDS")
pkfit0 <- readRDS("data_files/pkfit0.RDS")
parfinal <- pkfitfinal$obj$env$parList()
pkinput0 <- readRDS('data_files/pkinput0.RDS')
fimsdat <- pkdat0 <- pkinput0$dat
pkinput <- readRDS('data_files/pkinput.RDS')

Run FIMS model

Code
## set up FIMS data objects
clear()
clear_logs()
estimate_fish_selex <- TRUE
estimate_survey_selex <- TRUE
estimate_q2 <- TRUE
estimate_q3 <- TRUE
estimate_q6 <- TRUE
estimate_F <- TRUE
estimate_recdevs <- TRUE
source("R/pk_prepare_FIMS_inputs.R")
## make FIMS model
success <- CreateTMBModel()
parameters <- list(p = get_fixed())
obj <- MakeADFun(data = list(), parameters, DLL = "FIMS", silent = TRUE)
## report values for the two models
rep0 <- pkfitfinal$rep
rep1 <- obj$report() # FIMS initial values
## try fitting the model
opt <- TMBhelper::fit_tmb(obj, getsd=FALSE, newtonsteps=0, control=list(trace=100))
## opt <- with(obj, nlminb(start=par, objective=fn, gradient=gr))
max(abs(obj$gr())) # from Cole, can use TMBhelper::fit_tmb to get val to <1e-10
rep2 <- obj$report(obj$env$last.par.best) ## FIMS after estimation

## Output plotting
out1 <- get_long_outputs(rep1, rep0) %>%
  mutate(platform=ifelse(platform=='FIMS', 'FIMS init', 'TMB'))
out2 <- get_long_outputs(rep2, rep0) %>% filter(platform=='FIMS') %>%
  mutate(platform='FIMS opt')
out <- rbind(out1,out2)
g <- ggplot(out, aes(year, value, color=platform)) + geom_line() +
  facet_wrap('name', scales='free') + ylim(0,NA) +
  labs(x=NULL, y=NULL)
ggsave('figures/AFSC_PK_ts_comparison.png', g, width=9, height=5, units='in')
g <- ggplot(filter(out, platform!='TMB'), aes(year, relerror, color=platform)) + geom_line() +
  facet_wrap('name', scales='free') +
  labs(x=NULL, y='Relative difference') + coord_cartesian(ylim=c(-.5,.5))
ggsave('figures/AFSC_PK_ts_comparison_relerror.png', g, width=9, height=5, units='in')

## Quick check on age comp fits
p1 <- get_acomp_fits(rep0, rep1, rep2, fleet=1, years=pkdat0$fshyrs)
g <- ggplot(p1, aes(age, paa, color=platform)) + facet_wrap('year') + geom_line()
ggsave('figures/AFSC_PK_age_comp_fits_1.png', g, width=9, height=8, units='in')
p2 <- get_acomp_fits(rep0, rep1, rep2, fleet=2, years=pkdat0$srv_acyrs2)
g <- ggplot(p2, aes(age, paa, color=platform)) + facet_wrap('year') + geom_line()
ggsave('figures/AFSC_PK_age_comp_fits_2.png', g, width=9, height=8, units='in')
## p3 <- get_acomp_fits(rep0, rep1, rep2, fleet=3, years=pkdat0$srv_acyrs3)
## g <- ggplot(p3, aes(age, paa, color=platform)) + facet_wrap('year') + geom_line()
## p6 <- get_acomp_fits(rep0, rep1, rep2, fleet=4, years=pkdat0$srv_acyrs6)
## g <- ggplot(p6, aes(age, paa, color=platform)) + facet_wrap('year') + geom_line()

## index fits
addsegs <- function(yrs, obs, CV){
  getlwr <- function(obs, CV) qlnorm(p=.025, meanlog=log(obs), sdlog=sqrt(log(1+CV^2)))
  getupr <- function(obs, CV) qlnorm(p=.975, meanlog=log(obs), sdlog=sqrt(log(1+CV^2)))
  segments(yrs, y0=getlwr(obs,CV), y1=getupr(obs,CV))
  points(yrs, obs, pch=22, bg='white')
}
png('figures/AFSC_PK_index_fits.png', res=300, width=6, height=7, units='in')
par(mfrow=c(3,1), mar=c(3,3,.5,.5), mgp=c(1.5,.5,0), tck=-0.02)
plot(years, rep0$Eindxsurv2, type='l',
     ylim=c(0,2), lwd=5.5,
      xlab=NA, ylab='Biomass (million t)')
x1 <- out1 %>% filter(name=='Index2' & platform=='FIMS init')
x2 <- out2 %>% filter(name=='Index2' & platform=='FIMS opt')
lines(years,x1$value, col=2, lwd=1.5)
lines(years,x2$value, col=3, lwd=1.5)
addsegs(yrs=pkdat0$srvyrs2, obs=pkdat0$indxsurv2, CV=pkdat0$indxsurv_log_sd2)
legend('topright', legend=c('TMB', 'FIMS init', 'FIMS opt'), lty=1, col=1:3)
mtext('Survey 2', line=-1.5)
plot(years, rep0$Eindxsurv3, type='l',
     ylim=c(0,.6), lwd=5.5,
      xlab=NA, ylab='Biomass (million t)')
x1 <- out1 %>% filter(name=='Index3' & platform=='FIMS init')
x2 <- out2 %>% filter(name=='Index3' & platform=='FIMS opt')
lines(years,x1$value, col=2, lwd=1.5)
lines(years,x2$value, col=3, lwd=1.5)
addsegs(yrs=pkdat0$srvyrs3, obs=pkdat0$indxsurv3, CV=pkdat0$indxsurv_log_sd3)
mtext('Survey 3', line=-1.5)
legend('topright', legend=c('TMB', 'FIMS init', 'FIMS opt'), lty=1, col=1:3)
plot(years, rep0$Eindxsurv6, type='l',
     ylim=c(0,2.6), lwd=5.5,
      xlab=NA, ylab='Biomass (million t)')
x1 <- out1 %>% filter(name=='Index6' & platform=='FIMS init')
x2 <- out2 %>% filter(name=='Index6' & platform=='FIMS opt')
lines(years,x1$value, col=2, lwd=1.5)
lines(years,x2$value, col=3, lwd=1.5)
addsegs(yrs=pkdat0$srvyrs6, obs=pkdat0$indxsurv6, CV=pkdat0$indxsurv_log_sd6)
mtext('Survey 6', line=-1.5)
legend('topright', legend=c('TMB', 'FIMS init', 'FIMS opt'), lty=1, col=1:3)
dev.off()
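
The error bars drawn by addsegs() above are 95% intervals from a log-normal distribution whose median is the observation, with the CV converted to a log-scale standard deviation. A minimal standalone sketch of that calculation, using made-up numbers rather than the pollock data:

```r
# Made-up observation and CV, for illustration only
obs <- 1.2
cv <- 0.25

# Convert a CV on the natural scale to a standard deviation on the log scale
log_sd <- sqrt(log(1 + cv^2))

# 95% interval, treating the observation as the median of a log-normal
lwr <- qlnorm(0.025, meanlog = log(obs), sdlog = log_sd)
upr <- qlnorm(0.975, meanlog = log(obs), sdlog = log_sd)

# The same interval written out by hand
by_hand <- obs * exp(c(-1, 1) * qnorm(0.975) * log_sd)
```

The interval is symmetric on the log scale, so it is asymmetric around the observation on the natural scale.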

Comparison figures for basic model

Figures: Time Series; Time Series (relative error); Indices; Fishery Age Composition Fits; Survey 2 Age Composition Fits

Extra analyses

Two extra analyses are demonstrated. The first is a likelihood profile over lnR0, which shows how each likelihood component contributes and tests for data conflict (a Piner plot). The second runs the model through the ‘Stan’ software using the ‘tmbstan’ R package. This draws samples from the posterior, and each draw is passed back through the model to get the posterior distribution of spawning stock biomass. Because of the long run time, the results were saved to a file and are read in for post-hoc plotting.
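
The idea behind a likelihood profile can be sketched on a toy problem before applying it to the full model. The example below is not the FIMS model: it uses simulated normal data, and the names nll, mu_grid, and profile_nll are made up for illustration. The parameter of interest is fixed at each value on a grid, the remaining parameter is re-estimated, and the minimized negative log-likelihood is recorded.

```r
# Toy likelihood profile on simulated data (not the pollock model)
set.seed(1)
y <- rnorm(50, mean = 3, sd = 2)

# Joint negative log-likelihood for a normal model with parameters (mu, log_sd)
nll <- function(mu, log_sd) {
  -sum(dnorm(y, mean = mu, sd = exp(log_sd), log = TRUE))
}

# Fix mu at each grid value and re-estimate the remaining parameter
mu_grid <- seq(2, 4, length.out = 21)
profile_nll <- sapply(mu_grid, function(mu) {
  optimize(function(log_sd) nll(mu, log_sd), interval = c(-5, 5))$objective
})

# Delta NLL relative to the best grid point, as plotted in a profile
delta_nll <- profile_nll - min(profile_nll)
best_mu <- mu_grid[which.min(profile_nll)]
```

In the full model the same loop is done by mapping off the profiled parameter and refitting, and the delta NLL is computed separately for each likelihood component.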

Code
## Try a likelihood profile
i <- 68 # this will break if model is changed at all
map <- parameters
map$p[i] <- NA # map off R0 specified below
map$p <- as.factor(map$p)
xseq <- as.numeric(c(opt$par[i],  seq(22,24, len=30)))
res <- list()
for(j in 1:length(xseq)){
  print(j)
  parameters$p[i] <- xseq[j]
  obj2 <- MakeADFun(data = list(), parameters, DLL = "FIMS", silent = TRUE, map=map)
  opt2 <- TMBhelper::fit_tmb(obj2, getsd=FALSE, newtonsteps=0, control=list(trace=0))
  out <- obj2$report(obj2$env$last.par.best)
  ## record NLL components and the max gradient of the refitted (mapped) model
  res[[j]] <- data.frame(j=j, lnR0=xseq[j], total=out$jnll, index=out$index_nll,
                         age=out$age_comp_nll,recruit=out$rec_nll, maxgrad=max(abs(obj2$gr(opt2$par))))
}
res <- bind_rows(res) %>%
  pivot_longer( cols=c(total, index, age, recruit)) %>%
  group_by(name) %>% mutate(deltaNLL=value-min(value))
g <- ggplot(res, aes(lnR0, deltaNLL, color=name)) + geom_line()
g <- g+ geom_point(data=filter(res, deltaNLL==0), size=2) +
  labs(y='Delta NLL', color='NLL component')
ggsave('figures/AFSC_PK_like_profile_R0.png', g, width=7, height=3.5)



## ## Try Bayesian
## library(tmbstan)
## library(shinystan)
## ## Some parameters wandering off to Inf so fix those (need
## ## priors). Needs a ton of work but is proof of concept. Major
## ## problem is parallel fails.
## map <- parameters
## ## parameters$p[65:66]
## map$p[65:66] <- NA # map off R0 specified below
## map$p <- as.factor(map$p)
## obj3 <- MakeADFun(data = list(), parameters, DLL = "FIMS", silent=TRUE, map=map)
## opt3 <- TMBhelper::fit_tmb(obj3, getsd=FALSE, newtonsteps=0, control=list(trace=0))
## ## Fails when trying to do this in parallel unfortunately
## fit <- tmbstan(obj3, chains=1, cores=1, open_progress=FALSE,
##                init='last.par.best', control=list(max_treedepth=10))
## launch_shinystan(fit)
## df <- as.data.frame(fit)
## df <- df[,-ncol(df)] # drop lp__
## ## for each posterior draw, report to get SSB
## postreps <- list()
## for(ii in 1:nrow(df)){
##   if(ii %% 10==0) print(ii)
##   postreps[[ii]] <- obj3$rep(df[ii,])
## }
## ssbpost <- lapply(postreps, function(x) data.frame(year=years, ssb=x$ssb[[1]][-55]))%>%
##   bind_rows %>% mutate(rep=rep(1:nrow(df), each=54))
## saveRDS(ssbpost, file='pk_SSB_posteriors.RDS')
ssbpost <- readRDS('data_files/pk_pollock_SSB_posteriors.RDS')
g <- ggplot(ssbpost, aes(year, ssb/1e9, group=rep)) + geom_line(alpha=.1) +
  ylim(0,NA) + labs(x=NULL, y='SSB (M t)', title='Posterior demo (unconverged!)')
ggsave('figures/AFSC_PK_ssb_posterior.png', g, width=7, height=4, units='in')

Plots for extra analyses

Figures: Likelihood Profile; SSB Posterior

Comparison table

The likelihood components from the TMB model do not include constants, so their absolute values are not directly comparable with those from FIMS. This will be fixed later. Relative differences between the modified TMB model and the FIMS implementation are given in the figure above.
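
A small standalone illustration of why dropped constants matter for absolute values but not for differences is given below. It uses simulated data and a normal likelihood with a fixed standard deviation. Which constants the TMB model actually omits is not stated here, so the 0.5 * log(2 * pi) term is only an assumed example.

```r
# Illustration with simulated data; not output from either model
set.seed(42)
y <- rnorm(20, mean = 10, sd = 1.5)
sd_fixed <- 1.5

# Full normal NLL, including the 0.5 * log(2 * pi) constant per observation
nll_full <- function(mu) -sum(dnorm(y, mean = mu, sd = sd_fixed, log = TRUE))
# Same NLL with that constant dropped (assumed example of an omitted constant)
nll_noconst <- function(mu) sum(log(sd_fixed) + 0.5 * ((y - mu) / sd_fixed)^2)

mu_grid <- seq(9, 11, by = 0.25)
full <- sapply(mu_grid, nll_full)
noconst <- sapply(mu_grid, nll_noconst)

# Absolute values differ by a fixed offset ...
offset <- full - noconst
# ... but differences relative to the minimum agree
delta_full <- full - min(full)
delta_noconst <- noconst - min(noconst)
```

Under these assumptions the offset is the same at every parameter value, so estimates and delta NLL values match even though the raw NLL totals do not.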

What was your experience using FIMS? What could we do to improve usability?

To do

List any issues that you ran into or found

  • Output more derived quantities like selectivity, maturity, etc.
  • NLL components need to be separated by fleet. For example, the age composition NLLs for fleets 1 and 2 must be reported separately to show each fleet in the likelihood profile plot above.
  • Need more ADREPORTed values like SSB

What features are most important to add based on this case study?

  • More sophisticated control over selectivity so that ages 1 and 2 can be zeroed out for a double-logistic form, overriding the selectivity curve.
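
The requested feature could look something like the sketch below. This is hypothetical and not the FIMS API: the function double_logistic, its parameter names, and all values are made up, and the double-logistic form (an ascending logistic multiplied by one minus a descending logistic) is an assumption about the parameterization.

```r
# Hypothetical sketch: not the FIMS API; names and values are made up.
# Assumed form: ascending logistic times (1 - descending logistic).
double_logistic <- function(age, infl_asc, slope_asc, infl_desc, slope_desc) {
  asc <- 1 / (1 + exp(-slope_asc * (age - infl_asc)))
  desc <- 1 - 1 / (1 + exp(-slope_desc * (age - infl_desc)))
  asc * desc
}

ages <- 1:10
sel <- double_logistic(ages, infl_asc = 3, slope_asc = 2,
                       infl_desc = 8, slope_desc = 1)

# Override: force selectivity to zero for chosen ages, regardless of the curve
zero_ages <- c(1, 2)
sel_override <- ifelse(ages %in% zero_ages, 0, sel)
```

The curve itself never reaches exactly zero, which is why an explicit override for the youngest ages would be needed.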
Code
# Clear C++ objects from memory
clear()
NULL