NEFSC Case Study Southern New England-Mid Atlantic Yellowtail Flounder

The setup

Code
# Names of required packages
packages <- c(
  "dplyr",
  "ggridges",
  "ggplot2",
  "gt",
  "here",
  "lubridate",
  "remotes",
  "reshape2",
  "shinystan",
  "tidybayes",
  "tidyr",
  "TMB"
)

# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
  install.packages(packages[!installed_packages], repos = "http://cran.us.r-project.org")
}

# the Bayesian case study requires these extra packages
if (!"StanEstimators" %in% rownames(installed.packages())) {
  install.packages("StanEstimators", repos = c("https://andrjohns.r-universe.dev", "https://cloud.r-project.org"))
}

if (!"adnuts" %in% rownames(installed.packages())) {
  remotes::install_github("Cole-Monnahan-NOAA/adnuts")
}

# SS3 case studies require r4ss
if (!"r4ss" %in% rownames(installed.packages())) {
  remotes::install_github("r4ss/r4ss")
}

# SS3 case studies require r4ss
if (!"stockplotr" %in% rownames(installed.packages())) {
  remotes::install_github("nmfs-ost/stockplotr")
}

# Install FIMS: main branch version if on main, dev version on any other branch
remotes::install_github("NOAA-FIMS/FIMS", ref = "dev-hake")

# Load packages
invisible(lapply(packages, library, character.only = TRUE))

library(FIMS)
library(adnuts)
R_version <- version$version.string
TMB_version <- packageDescription("TMB")$Version
FIMS_commit <- substr(packageDescription("FIMS")$GithubSHA1, 1, 7)
ggplot2::theme_set(ggplot2::theme_bw())

data_directory <- "data_files"
function_directory <- "R"
purrr::walk(
    dir(function_directory, full.names = TRUE),
  source
)
Code
common_name <- "Southern New England-Mid Atlantic Yellowtail Flounder"
  • R version: R version 4.5.2 (2025-10-31)
  • TMB version: 1.9.19
  • FIMS commit: 14916c8
  • Stock name: Southern New England-Mid Atlantic Yellowtail Flounder
  • Region: NEFSC
  • Analyst: Chris Legault

Simplifications to the original assessment

The model presented in this case study was changed substantially from the operational version and should not be considered reflective of the Southern New England-Mid Atlantic Yellowtail Flounder stock. These results are intended to be a demonstration and nothing more.

To get the operational model to more closely match a FIMS model the following changes were made:

  • End year 2021 to 2019 due to missing 2020 survey
  • 5 indices to 2
  • Filled a survey missing year for one index (see below)
  • Varying weight at age to constant over time
  • Different weight at age matrices for catch, SSB, Jan-1 to same for all 3
  • SSB calculated in April to Jan-1
  • Aggregate index in numbers to weight
  • Fishery 2 selectivity blocks to 1
  • Catch and Index selectivity at age to logistic
  • Index timing from April to Jan-1
  • Varying Index CV to constant over time
  • Varying Catch ESS to constant over time
  • SR unexploited scaler SSB to recruitment

How I simplified my assessment:

To fill the missing year of survey data, I first ran the model in ASAP with the missing data treated as missing. I then filled the missing data with the expected value of the survey, both in aggregate and for catch at age in place of the missing data.

The varying weights at age, index CVs, and ESS values were replaced by the time series mean in all years.

The catch weight at age matrix was used as the weight at age for all sources.

Setting up the data

Code
# clear memory
FIMS::clear()

# read the ASAP rdat files
rdat <- dget(file.path(data_directory, "NEFSC_YT_SIMPLIFIED.RDAT")) # to be used in FIMS, lots of modifications from original
orig <- dget(file.path(data_directory, "NEFSC_YT_ORIGINAL.RDAT"))   # where started before modifications for use in FIMS

# function to create equivalent of data_mile1, basic catch and survey data
# need to think about how to deal with multiple fleets and indices - only use 1 of each for now
get_asap_data <- function(rdat) {
  res <- data.frame(type = character(),
                name = character(),
                age = integer(),
                timing = double(),
                value = double(),
                unit = character(),
                uncertainty = double())
  
  landings <- data.frame(type = "landings",
                     name = "fleet1",
                     age = NA,
                     timing = seq(rdat$parms$styr, rdat$parms$endyr),
                     value = as.numeric(rdat$catch.obs[1,]),
                     unit = "mt",
                     # This was
                     # rdat$control.parms$catch.tot.cv[,1]
                     # but it was just reset to the following later in the parameters so using internal functionality to set it now
                     # uncertainty = rep(log(sqrt(log(as.numeric(mean(rdat$control.parms$catch.tot.cv[,1], na.rm=TRUE)^2) + 1))), rdat[["parms"]][["nyears"]])
                     uncertainty = rep(sqrt(log(as.numeric(mean(rdat$control.parms$catch.tot.cv[,1], na.rm=TRUE)^2) + 1)), rdat[["parms"]][["nyears"]])
  )
  
  # loop over all indices
  for (i in 1:rdat$parms$nindices){
    index <- data.frame(type = "index",
                        name = paste0("survey", i),
                        age = NA,
                        timing = seq(rdat$parms$styr, rdat$parms$endyr),
                        value = as.numeric(rdat$index.obs[[i]]),
                        unit = "",
                        # uncertainty = rep(log(sqrt(log(as.numeric(mean(rdat$index.cv[[i]], na.rm=TRUE)^2 + 1)))), rdat[["parms"]][["nyears"]]))
                        uncertainty = rep(sqrt(log(as.numeric(mean(rdat$index.cv[[i]], na.rm=TRUE)^2 + 1))), rdat[["parms"]][["nyears"]]))
    if (i == 1){
      allinds <- index
    }else{
      allinds <- rbind(allinds, index)
    }
  }
  
  catchage <- data.frame(type = "age_comp",
                     name = "fleet1",
                     age = rep(seq(1,rdat$parms$nages), rdat$parms$nyears),
                     timing = rep(seq(rdat$parms$styr, rdat$parms$endyr), each=rdat$parms$nages),
                     value = as.numeric(t(rdat$catch.comp.mats$catch.fleet1.ob)),
                     unit = "",
                     uncertainty = rep(rdat$fleet.catch.Neff.init[1,], each=rdat$parms$nages))
  
  # loop over all indices
  for (i in 1:rdat$parms$nindices){
    indexage <- data.frame(type = "age_comp",
                           name = paste0("survey", i),
                           age = rep(seq(1,rdat$parms$nages), rdat$parms$nyears),
                           timing = rep(seq(rdat$parms$styr, rdat$parms$endyr), each=rdat$parms$nages),
                           value = as.numeric(t(rdat$index.comp.mats[[i*2-1]])),
                           unit = "",
                           uncertainty = rep(rdat$index.Neff.init[i,], each=rdat$parms$nages))
    if (i == 1){
      allindsage <- indexage
    }else{
      allindsage <- rbind(allindsage, indexage)
    }
  }
  
  # Weight at age is in rdat$WAA.mats$WAA.catch.all but for yellowtail the data
  # is time-invariant. Could just use the first row but the code below shows how
  # to use values for every year/age combination.
  weight_at_age <- data.frame(
    type = "weight-at-age",
    name = "fleet1",
    age = seq(rdat[["parms"]][["nages"]]),
    timing = rep(
      seq(rdat[["parms"]][["styr"]], rdat[["parms"]][["endyr"]]),
      each = rdat[["parms"]][["nages"]]
    ),
    value = c(t(rdat[["WAA.mats"]][["WAA.catch.all"]])),
    unit = "mt",
    uncertainty = NA
  )

  res <- rbind(res, landings, allinds, catchage, allindsage, weight_at_age)
  return(res)
}

# set up FIMS data objects
data_4_model <- FIMS::FIMSFrame(get_asap_data(rdat))
# define the dimensions
ages <- get_ages(data_4_model)
n_ages <- get_n_ages(data_4_model)
years <- seq(get_start_year(data_4_model), get_end_year(data_4_model))
n_years <- get_n_years(data_4_model)

Run FIMS model

Code
test <- create_default_configurations(data = data_4_model) |>
  create_default_parameters(data = data_4_model) |>
  tidyr::unnest(data) |>
  # ASAP assumes Fmult devs = 0
  dplyr::rows_update(
    tibble::tibble(
      fleet_name = "fleet1",
      label = "log_Fmort",
      time = years,
      value = log(rep(rdat$initial.guesses$Fmult.year1.init[1], n_years))
    ),
    by = c("fleet_name", "label", "time")
  ) |>
  dplyr::rows_update(
    tibble::tibble(
      fleet_name = c("survey1", "survey2"),
      label = "log_q",
      value = log(rdat$initial.guesses$q.year1.init)
    ),
    by = c("fleet_name", "label")
  ) |>
  dplyr::rows_update(
    tibble::tibble(
      label = "log_devs",
      time = years[-1],
      # Unrealistically setting these from the ASAP run
      value = rdat$SR.resids$logR.dev[-1],
      estimation_type = "fixed_effects"
    ),
    by = c("label", "time")
  ) |>
  dplyr::rows_update(
    tibble::tibble(
      # ASAP can enter either R0 or SB0, use R0 in input file
      label = c("log_rzero", "logit_steep"),
      value = c(
        log(rdat$initial.guesses$SR.inits$SR.scaler.init),
        FIMS::logit(0.2, 1.0, rdat$initial.guesses$SR.inits$SR_steepness.init)
      ),
      estimation_type = c("fixed_effects", "constant")
    ),
    by = c("label")
  ) |>
  # Maturity
  # NOTE, to match FIMS for a protogynous stock,
  # maturity values were obtained by fitting a logistic fcn to the age vector,
  # mat.female*prop.female + mat.male*prop.male
  # assuming an all female population 
  dplyr::rows_update(
    tibble::tibble(
      module_name = "Maturity",
      label = c("inflection_point", "slope"),
      value = c(1.8, 4)
    ),
    by = c("module_name", "label")
  ) |>
  dplyr::rows_update(
    tibble::tibble(
      module_name = "Population",
      label = "log_init_naa",
      age = ages,
      value = log(rdat$N.age[1, ]),
      # TODO: Turn these on later
      estimation_type = "constant"
    ),
    by = c("module_name", "label", "age")
  ) |>
  dplyr::rows_update(
    tibble::tibble(
      module_name = "Population",
      label = rep("log_M", n_years * n_ages),
      time = rep(years, each = n_ages),
      age = rep(ages, n_years),
      # M is vector of age1 M X nyrs then age2 M X nyrs
      value = log(as.numeric(t(rdat$M.age)))
    ),
    by = c("module_name", "label", "age", "time"),
    # TODO: fix this
    unmatched = "ignore"
  ) |>
    # Update log_sd for log_devs in the Recruitment module
  dplyr::rows_update(
    tibble::tibble(
      module_name = "Recruitment",
      label = "log_sd",
      # ASAP takes one value per year but that value is often just repeated
      value = mean(rdat$control.parms$recruit.cv)
    ),
    by = c("module_name", "label")
  )

# Put it all together, creating the FIMS model and making the TMB fcn
# Run the model without optimization to help ensure a viable model
test_fit <- test |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = FALSE)
# Run the  model with optimization
fit <- test |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = TRUE)
✔ Starting optimization ...
ℹ Restarting optimizer 3 times to improve gradient.
ℹ Maximum gradient went from 0.04181 to 0.00057 after 3 steps.
✔ Finished optimization
✔ Finished sdreport
ℹ FIMS model version: 0.8.1
ℹ Total run time was 2.2977 seconds
ℹ Number of parameters: fixed_effects=102, random_effects=0, and total=102
ℹ Maximum gradient= 0.00057
ℹ Negative log likelihood (NLL):
• Marginal NLL= 2540.13308
• Total NLL= 2540.13308
ℹ Terminal SB= 0
Code
output <- get_estimates(fit)

# eventually change to allow multiple fishing fleets similar to multiple indices - only using 1 fishing fleet for now
# NOTE: FIMS assumes SSB calculated at the start of the year, so need to adjust ASAP to do so as well for now, need to make timing of SSB calculation part of FIMS later
# NOTE: for now tricking FIMS into thinking age 0 is age 1, so need to adjust A50 for maturity because FIMS calculations use ages 0-5+ instead of 1-6

sdr <- get_sdreport(fit)
sdr_fixed <- summary(sdr, "fixed")
report <- get_report(fit)

Plotting

Code
mycols <- c("FIMS" = "blue", "ASAP" = "red", "ASAP_orig" = "darkgreen")

for (i in 1:rdat$parms$nindices) {
  index_results <- data.frame(
    survey = i,
    year = years,
    observed = rdat$index.obs[[i]],
    FIMS = report$index_expected[[rdat$parms$nfleet+i]],
    ASAP = rdat$index.pred[[i]]
  )
  if (i==1){
    allinds_results <- index_results
  }else{
    allinds_results <- rbind(allinds_results, index_results)
  }
}
#print(allinds_results)

comp_index <- ggplot2::ggplot(allinds_results, ggplot2::aes(x = year, y = observed)) +
  ggplot2::geom_point() +
  ggplot2::geom_line(ggplot2::aes(x = year, y = FIMS), color = "blue") +
  ggplot2::geom_line(ggplot2::aes(x = year, y = ASAP), color = "red") +
  ggplot2::facet_wrap(~survey, scales = "free_y", nrow = 2) +
  ggplot2::xlab("Year") +
  ggplot2::ylab("Index") +
  ggplot2::ggtitle("Blue=FIMS, Red=ASAP") +
  ggplot2::theme_bw()
#print(comp_index)

catch_results <- data.frame(
  observed = get_data(data_4_model) |>
    dplyr::filter(type == "index", name == "survey1") |>
    dplyr::pull(value),
  FIMS = report$landings_expected[[1]],
  ASAP = as.numeric(rdat$catch.pred[1,])
)
#print(catch_results)

comp_catch <- ggplot2::ggplot(catch_results, ggplot2::aes(x = years, y = observed)) +
  ggplot2::geom_point() +
  ggplot2::xlab("Year") +
  ggplot2::ylab("Catch (mt)") +
  ggplot2::geom_line(ggplot2::aes(x = years, y = FIMS), color = "blue") +
  ggplot2::geom_line(ggplot2::aes(x = years, y = ASAP), color = "red") +
  ggplot2::ggtitle("Blue=FIMS, Red=ASAP") +
  ggplot2::theme_bw()
#print(comp_catch)

pop_results <- data.frame(
  Year = c(years, max(years)+1, years, years, years, years, max(years)+1, years),
  Metric = c(rep("SSB", 2*n_years+1), rep("F_mort", 2*n_years), rep("Recruitment", 2*n_years+1)),
  Model = c(rep("FIMS", n_years+1), rep("ASAP", n_years), rep(c("FIMS", "ASAP"), each=n_years), 
             rep("FIMS", n_years+1), rep("ASAP", n_years)),
  Value = c(
    report$spawning_biomass[[1]],
    rdat$SSB,
    output |>
      dplyr::filter(label == "log_Fmort", module_id == 1) |>
      dplyr::pull(estimated) |>
      exp(),
    rdat$F.report,
    report$expected_recruitment[[1]],
    as.numeric(rdat$N.age[,1])
  )
)
#print(pop_results)

# ggplot(filter(pop_results, Year <=2019), aes(x=Year, y=Value, color=Model)) +
#   geom_line() +
#   facet_wrap(~Metric, ncol=1, scales = "free_y") +
#   theme_bw() +
#   scale_color_manual(values = mycols)

orig_years <- seq(orig$parms$styr, orig$parms$endyr)
orig_pop_results <- data.frame(
  Year = rep(orig_years, 3),
  Metric = rep(c("SSB", "F_mort", "Recruitment"), each = length(orig_years)),
  Model = "ASAP_orig",
  Value = c(orig$SSB, orig$F.report, as.numeric(orig$N.age[,1]))
)

pop_results_3 <- rbind(pop_results, orig_pop_results)
#print(pop_results_3)

# ggplot(filter(pop_results_3, Year <=2019), aes(x=Year, y=Value, color=Model)) +
#   geom_line() +
#   facet_wrap(~Metric, ncol=1, scales = "free_y") +
#   theme_bw() +
#   scale_color_manual(values = mycols)

comp_FRSSB3 <- ggplot2::ggplot(pop_results_3, ggplot2::aes(x=Year, y=Value, color=Model)) +
  ggplot2::geom_line() +
  ggplot2::facet_wrap(~Metric, ncol=1, scales = "free_y") +
  ggplot2::theme_bw() +
  ggplot2::scale_color_manual(values = mycols)
#print(comp_FRSSB3)

FIMS_naa_results <- data.frame(
  Year = rep(c(years, max(years)+1), each = n_ages),
  Age = rep(ages, n_years+1),
  Metric = "NAA",
  Model = "FIMS",
  Value = report$numbers_at_age[[1]]
)

ASAP_naa_results <- data.frame(
  Year = rep(years, each = n_ages),
  Age = rep(ages, n_years),
  Metric = "NAA",
  Model = "ASAP",
  Value = as.numeric(t(rdat$N.age))
)

orig_naa_results <- data.frame(
  Year = rep(orig_years, each = n_ages),
  Age = rep(ages, length(orig_years)),
  Metric = "NAA",
  Model = "ASAP_orig",
  Value = as.numeric(t(orig$N.age))
)
naa_results <- rbind(FIMS_naa_results, ASAP_naa_results, orig_naa_results)
#print(naa_results)

# ggplot(filter(naa_results, Year <= 2019), aes(x=Year, y=Value, color=Model)) +
#   geom_line() +
#   facet_wrap(~Age, ncol=1, scales = "free_y") +
#   ylab("NAA") +
#   theme_bw() +
#   scale_color_manual(values = mycols)

comp_naa2 <- ggplot2::ggplot(
  dplyr::filter(naa_results, Year <= 2019, Model %in% c("ASAP", "FIMS")), ggplot2::aes(x=Year, y=Value, color=Model)) +
  ggplot2::geom_line() +
  ggplot2::facet_wrap(~Age, ncol=1, scales = "free_y") +
  ggplot2::ylab("NAA") +
  ggplot2::theme_bw() +
  ggplot2::scale_color_manual(values = mycols)
#print(comp_naa2)

# ggplot(filter(naa_results, Year == 1973, Model %in% c("ASAP", "FIMS")), aes(x=Age, y=Value, color=Model)) +
#   geom_line() +
#   ylab("NAA in Year 1") +
#   theme_bw() +
#   scale_color_manual(values = mycols)


saveplots <- TRUE
if(saveplots){
  ggplot2::ggsave(filename = "figures/NEFSC_YT_compare_index.png", plot = comp_index, width = 4, height = 4, units = "in")
  ggplot2::ggsave(filename = "figures/NEFSC_YT_compare_catch.png", plot = comp_catch, width = 4, height = 4, units = "in")
  ggplot2::ggsave(filename = "figures/NEFSC_YT_compare_FRSSB3.png", plot = comp_FRSSB3, width = 5, height = 6.5, units = "in")
  ggplot2::ggsave(filename = "figures/NEFSC_YT_compare_NAA2.png", plot = comp_naa2, width = 5, height = 6.5, units = "in")
}

Comparison figures

Catch Indices FRSSB NAA

Comparison table

The likelihood components from FIMS and ASAP for the same data are shown in the table below. Note that the ASAP file had to turn on the use likelihood constants option to enable this comparison (this option should not be used when recruitment deviations are estimated).

Code
nll_components <- report[["nll_components"]]
jnlltab <- data.frame(
  Component=c("Total","Index","Age Comp", "Rec"),
  FIMS = c(
    report[["jnll"]],
    sum(nll_components[seq(2, length(nll_components), by = 2)]),
    sum(nll_components[seq(3, length(nll_components), by = 2)]),
    nll_components[1]
  ),
  ASAP = c(
    rdat$like$lk.total,
    (rdat$like$lk.catch.total + rdat$like$lk.index.fit.total),
    (rdat$like$lk.catch.age.comp + rdat$like$lk.index.age.comp),
    rdat$like$lk.Recruit.devs
  )
)
print(jnlltab)
  Component      FIMS     ASAP
1     Total 2540.1331 2410.844
2     Index 1034.7835 1020.548
3  Age Comp 1360.0192 1390.297
4       Rec  145.3304    0.000

What was your experience using FIMS? What could we do to improve usability?

Relatively easy to use by following the vignette. Creating wrappers for data input would help so that each element did not need to be assigned directly.

List any issues that you ran into or found

Please open an issue if you found something new.

  • SSB calculations in FIMS assume 0.5 multiplier, which differs from ASAP Issue #521.
  • Output all derived values (this is mostly done)
  • Fix recruitment estimation Issue #364
  • Handle missing data, especially surveys Issue #502
  • Weights at age that change over time
  • Separate weights at age for catch, SSB, Jan-1 population, indices, etc.
  • Fishery selectivity blocks or random effects
  • Allow time-varying CVs and ESS (or alternative functions)
  • Option for Index in numbers
  • Timing of Index and SSB calculations within the year
  • One-step-ahead residuals
  • Reference points, projections, pushbutton retro

What features are most important to add based on this case study?

  • Missing values, would allow inclusion of the other 3 indices (too many missing years to fill for this example)
Code
# Clear C++ objects from memory
FIMS::clear()