Skip to contents

FIMS

The NOAA Fisheries Integrated Modeling System (FIMS) is a new modeling framework for fisheries modeling. The framework is designed to support next-generation fisheries stock assessment, ecosystem, and socioeconomic modeling. It is important to note that FIMS itself is not a model but rather a framework for creating models. The framework is made up of many modules that come together to create a model that best suits the needs of the end-user. The remainder of this vignette walks through what is absolutely necessary to run a FIMS catch-at-age model using the default settings.

Memory

Calling library(FIMS) loads the R package and Rcpp functions and modules into the R environment. The C++ code is compiled upon installation rather than loading so the call to library() should be pretty fast. Users should always run clear() prior to modeling to ensure that the C++ memory from any previous FIMS model run is cleared out.

library(FIMS)
library(ggplot2)

# clear memory
clear()

Data

Data for a FIMS model must be stored in a single data frame with a long format, e.g., data("data1", package = "FIMS"). The design is similar to running a linear model where you pass a single data frame to lm(). The long format does lead to some information being duplicated. For example, the units are listed for every row rather than stored in a single location for each data type. But, the long format facilitates using tidy functions to manipulate the data. And, a single function, i.e., FIMSFrame(), is all that is needed to prepare the data to be used in a FIMS model.

data1

A sample data frame for a catch-at-age model with both ages and lengths is stored in the package as data1. This data set is based on data that was used in Li et al. for the Model Comparison Project (github site). The length data have since been added data-raw/data1.R based on an age-length conversion matrix. See R/data1.R or ?data1 for details about the package data.

FIMSFrame()

The easiest way to prepare the data for a FIMS model is to use FIMSFrame(). This function performs several validation checks and returns an object of the S4 class called FIMSFrame. There are helper functions for working with a FIMSFrame object, e.g., get_data(), get_n_years(), get_*(). Additionally, there are helper functions for pulling data out of the S4 class in the format needed for a module, i.e., a vector, but these m_*() functions are largely meant to be used internally within the package and are only exported to allow for their use by power users wishing to manually set up.

# Bring the package data into your environment
data("data1")
# Prepare the package data for being used in a FIMS model
data_4_model <- FIMSFrame(data1)

The S4 object that we named data_4_model contains many slots (i.e., named components of the object that can be accessed) but perhaps the most interesting one is the long data frame stored in the “data” slot. The tibble stored in this slot can be accessed using get_data().

# Use show() to see what is stored in the FIMSFrame S4 class
methods::show(data_4_model)
## tbl_df of class 'FIMSFrame'
## with the following 'types': age, age-to-length-conversion, landings, length, weight-at-age, index
## # A tibble: 6 × 10
##   type  name     age length datestart dateend value unit       uncertainty  year
##   <chr> <chr>  <int>  <dbl> <date>    <date>  <dbl> <chr>            <dbl> <dbl>
## 1 age   fleet1     1     NA 1-01-01   1-12-31 0.07  proportion         200     1
## 2 age   fleet1     2     NA 1-01-01   1-12-31 0.1   proportion         200     1
## 3 age   fleet1     3     NA 1-01-01   1-12-31 0.115 proportion         200     1
## 4 age   fleet1     4     NA 1-01-01   1-12-31 0.15  proportion         200     1
## 5 age   fleet1     5     NA 1-01-01   1-12-31 0.1   proportion         200     1
## 6 age   fleet1     6     NA 1-01-01   1-12-31 0.05  proportion         200     1
## additional slots include the following:fleets:
## [1] 1
## n_years:
## [1] 30
## ages:
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12
## n_ages:
## [1] 12
## lengths:
##  [1]    0   50  100  150  200  250  300  350  400  450  500  550  600  650  700
## [16]  750  800  850  900  950 1000 1050 1100
## n_lengths:
## [1] 23
## start_year:
## [1] 1
## end_year:
## [1] 30
# Or, look at the structure using str()
# Increase max.level to see more of the structure
str(data_4_model, max.level = 1)
## Formal class 'FIMSFrame' [package "FIMS"] with 9 slots
# Use dplyr to subset the data for just the landings
get_data(data_4_model) |>
  dplyr::filter(type == "landings")
## # A tibble: 30 × 10
##    type   name    age length datestart dateend  value unit  uncertainty  year
##    <chr>  <chr> <int>  <dbl> <date>    <date>   <dbl> <chr>       <dbl> <dbl>
##  1 landi… flee…    NA     NA 1-01-01   1-12-31   162. mt         0.0100     1
##  2 landi… flee…    NA     NA 2-01-01   2-12-31   461. mt         0.0100     2
##  3 landi… flee…    NA     NA 3-01-01   3-12-31   747. mt         0.0100     3
##  4 landi… flee…    NA     NA 4-01-01   4-12-31   997. mt         0.0100     4
##  5 landi… flee…    NA     NA 5-01-01   5-12-31   768. mt         0.0100     5
##  6 landi… flee…    NA     NA 6-01-01   6-12-31  1344. mt         0.0100     6
##  7 landi… flee…    NA     NA 7-01-01   7-12-31  1319. mt         0.0100     7
##  8 landi… flee…    NA     NA 8-01-01   8-12-31  2598. mt         0.0100     8
##  9 landi… flee…    NA     NA 9-01-01   9-12-31  1426. mt         0.0100     9
## 10 landi… flee…    NA     NA 10-01-01  10-12-31 1644. mt         0.0100    10
## # ℹ 20 more rows

The data contains the following fleets:

  • A single fishery fleet with age- and length-composition, weight-at-age, and landings data
  • A single survey with age- and length-composition and index data

Parameters

The parameters that are in the model will depend on which modules are used from the FIMS framework. This combination of modules rather than the use of a control file negates the need for complicated if{} else{} statements in the code.

create_default_parameters()

By passing the data to create_default_parameters() the function can tailor the defaults based on how many fleets there are and what data types exist. For example, if you have three fleets, then create_default_parameters() will set up three logistic selectivity modules.

Modules that are available in FIMS are known as reference classes in the C++ code. Each reference class acts as an interface between R and the underlining C++ code that defines FIMS. Several reference classes exist and several more will be created in the future. The beauty of having modules rather than a control file really comes out when more reference classes are created because each reference class can be accessed through R by itself to build up a model rather than needing to modify a control file for future features.

By just passing lists of fleet specifications and the data to create_default_parameters(), the default values for parameters that relate to fleet(s), recruitment, growth, and maturity modules can be created. For example,

  • “BevertonHoltRecruitment” for the recruitment module
  • “DnormDistribution” for recruitment deviations (log_devs)
  • “EWAAgrowth” for the growth module, and
  • “LogisticMaturity” for maturity module.
# Define the same fleet specifications for fleet1 and survey1
fleet1 <- survey1 <- list(
  selectivity = list(form = "LogisticSelectivity"),
  data_distribution = c(
    Index = "DlnormDistribution",
    AgeComp = "DmultinomDistribution",
    LengthComp = "DmultinomDistribution"
  )
)

# Create default recruitment, growth, and maturity parameters
default_parameters <- data_4_model |>
  create_default_parameters(
    fleets = list(fleet1 = fleet1, survey1 = survey1)
  )

The argument names and their corresponding default values of the create_default_parameters() function can be displayed using args().

args(create_default_parameters)
## function (data, fleets, recruitment = list(form = "BevertonHoltRecruitment", 
##     process_distribution = c(log_devs = "DnormDistribution")), 
##     growth = list(form = "EWAAgrowth"), maturity = list(form = "LogisticMaturity")) 
## NULL

update_parameters()

In the future, the developers of FIMS may update the default parameters to experiment with different values. Regardless, you can still use create_default_parameters() as a starting point because it will provide information on the appropriate dimensions and necessary elements that the final list must contain. And, although it is a good idea to modify the returned defaults with update_parameters(), you can update the list manually.

In the code below, update_parameters() is used to adjust the fishing mortality, selectivity, maturity, and population parameters from their default values. If the parameters are estimated, these updates will change their starting values, and if they are fixed, these updates will change their value used in the model.

# Each call to update_parameters() returns the full list so the pipe can be
# used to daisy chain all of these updates together to a new object called
# parameters that will be used to fit the model
parameters <- default_parameters |>
  update_parameters(
    modified_parameters = list(
      fleet1 = list(
        Fleet.log_Fmort.value = log(c(
          0.009459165, 0.027288858, 0.045063639,
          0.061017825, 0.048600752, 0.087420554,
          0.088447204, 0.186607929, 0.109008958,
          0.132704335, 0.150615473, 0.161242955,
          0.116640187, 0.169346119, 0.180191913,
          0.161240483, 0.314573212, 0.257247574,
          0.254887252, 0.251462108, 0.349101406,
          0.254107720, 0.418478117, 0.345721184,
          0.343685540, 0.314171227, 0.308026829,
          0.431745298, 0.328030899, 0.499675368
        ))
      )
    )
  ) |>
  update_parameters(
    modified_parameters = list(
      survey1 = list(
        LogisticSelectivity.inflection_point.value = 1.5,
        LogisticSelectivity.slope.value = 2,
        Fleet.log_q.value = log(3.315143e-07)
      )
    )
  ) |>
  update_parameters(
    modified_parameters = list(
      recruitment = list(
        BevertonHoltRecruitment.log_rzero.value = log(1e+06),
        BevertonHoltRecruitment.log_devs.value = c(
          0.43787763, -0.13299042, -0.43251973, 0.64861200, 0.50640852,
          -0.06958319, 0.30246260, -0.08257384, 0.20740372, 0.15289604,
          -0.21709207, -0.13320626, 0.11225374, -0.10650836, 0.26877132,
          0.24094126, -0.54480751, -0.23680557, -0.58483386, 0.30122785,
          0.21930545, -0.22281699, -0.51358369, 0.15740234, -0.53988240,
          -0.19556523, 0.20094360, 0.37248740, -0.07163145
        ),
        BevertonHoltRecruitment.log_devs.estimated = FALSE
      )
    )
  ) |>
  update_parameters(
    modified_parameters = list(
      maturity = list(
        LogisticMaturity.inflection_point.value = 2.25,
        LogisticMaturity.inflection_point.estimated = FALSE,
        LogisticMaturity.slope.value = 3,
        LogisticMaturity.slope.estimated = FALSE
      )
    )
  ) |>
  update_parameters(
    modified_parameters = list(
      population = list(
        Population.log_init_naa.value = c(
          13.80944, 13.60690, 13.40217, 13.19525, 12.98692, 12.77791,
          12.56862, 12.35922, 12.14979, 11.94034, 11.73088, 13.18755
        )
      )
    )
  )

Fit

With data and parameters in place, we can now initialize modules using initialize_fims() and fit the model using fit_fims().

initialize_fims()

The list returned by create_default_parameters() has two elements, parameters and modules. But, these are just lists of lists containing specifications. Nothing has been created in memory as of yet. To actually initialize the modules specified in parameters[["modules"]], initialize_fims() needs to be called. This function takes all of the specifications and matches them with the appropriate data to initialize a module and create the pointers to the memory.

fit_fims()

The list returned from initialize_fims() can be passed to the parameter of fit_fims() called input to run a FIMS model. If optimize = FALSE, the model will not actually be optimized but instead just checked to ensure it is a viable model. When optimize = TRUE, the model will be fit using stats::nlminb() and an object of the class FIMSFit will be returned.

Example

# Run the model without optimization to help ensure a viable model
test_fit <- parameters |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = FALSE)

# Run the  model with optimization
fit <- parameters |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = TRUE)
##  Starting optimization ...
##  Restarting optimizer 3 times to improve gradient.
##  Maximum gradient went from 0.01324 to 0.00163 after 3 steps.
##  Finished optimization
##  Finished sdreport
##  FIMS model version: 0.3.0.1
##  Total run time was 1.77504 seconds
##  Number of parameters: total=48, fixed_effects=48, and random_effects=0
##  Maximum gradient= 0.00163
##  Negative log likelihood (NLL):
##  Marginal NLL= 3299.32402
##  Total NLL= 3299.32402
##  Terminal SB=
# Get information about the model and print a few characters to the screen
recruitment_log <- get_log_module("information")
substr(recruitment_log, 1, 100)
## [1] "[\n{\n\"timestamp\" : \"Fri Jan 17 15:10:51 2025\",\n\"level\" : \"info\",\n\"message\" : \"Initializing fleet 1\",\n"
# Clear memory post-run
clear()

The results can be plotted with either base R or using {ggplot2}.

index_results <- data.frame(
  observed = m_index(data_4_model, "survey1"),
  expected = get_report(fit)[["exp_index"]][[2]],
  years = get_start_year(data_4_model):get_end_year(data_4_model)
)
print(index_results)
##       observed    expected years
## 1  0.006117418 0.006456369     1
## 2  0.007156588 0.006519662     2
## 3  0.006553376 0.006541949     3
## 4  0.006501745 0.006434017     4
## 5  0.006562572 0.006190002     5
## 6  0.006467796 0.006186283     6
## 7  0.005248220 0.006105242     7
## 8  0.006568311 0.005990934     8
## 9  0.005516030 0.005385850     9
## 10 0.005614254 0.005184481    10
## 11 0.004269091 0.004911050    11
## 12 0.004842784 0.004601776    12
## 13 0.004018395 0.004254148    13
## 14 0.004741539 0.004082059    14
## 15 0.003444776 0.003778219    15
## 16 0.004062555 0.003521058    16
## 17 0.003790370 0.003415195    17
## 18 0.002892876 0.002979243    18
## 19 0.002765623 0.002695108    19
## 20 0.002349581 0.002406346    20
## 21 0.002007219 0.002167876    21
## 22 0.001883646 0.001913676    22
## 23 0.001908972 0.001932545    23
## 24 0.001596426 0.001690079    24
## 25 0.001248675 0.001537510    25
## 26 0.001459593 0.001434889    26
## 27 0.001255182 0.001371307    27
## 28 0.001314476 0.001320515    28
## 29 0.001364266 0.001232807    29
## 30 0.001343069 0.001323075    30
ggplot2::ggplot(index_results, ggplot2::aes(x = years, y = observed)) +
  ggplot2::geom_point() +
  ggplot2::xlab("Year") +
  ggplot2::ylab("Index (mt)") +
  ggplot2::geom_line(ggplot2::aes(x = years, y = expected), color = "blue") +
  ggplot2::theme_bw()

catch_results <- data.frame(
  observed = m_landings(data_4_model, fleet = "fleet1"),
  expected = get_report(fit)[["exp_index"]][[1]],
  years = get_start_year(data_4_model):get_end_year(data_4_model)
)
print(catch_results)
##     observed  expected years
## 1   161.6455  161.6443     1
## 2   461.0895  461.0773     2
## 3   747.2900  747.2898     3
## 4   996.9711  997.0286     4
## 5   767.5477  767.5720     5
## 6  1343.8609 1344.2958     6
## 7  1319.0663 1319.7363     7
## 8  2597.5051 2600.7624     8
## 9  1425.8608 1426.5970     9
## 10 1643.6742 1644.3563    10
## 11 1770.8245 1771.6837    11
## 12 1751.2628 1752.6393    12
## 13 1194.0655 1194.6511    13
## 14 1638.4036 1639.2516    14
## 15 1584.1855 1584.5228    15
## 16 1333.1820 1333.1671    16
## 17 2325.4425 2326.6764    17
## 18 1693.7841 1694.2546    18
## 19 1531.8238 1532.1958    19
## 20 1358.8162 1359.6414    20
## 21 1638.8798 1638.8533    21
## 22 1077.3713 1076.1837    22
## 23 1647.9104 1643.9847    23
## 24 1224.4634 1222.6682    24
## 25 1126.9182 1125.9607    25
## 26  956.4529  956.2479    26
## 27  902.1621  901.9440    27
## 28 1168.8217 1168.6534    28
## 29  852.3819  851.7908    29
## 30 1274.3777 1274.2575    30
ggplot2::ggplot(catch_results, ggplot2::aes(x = years, y = observed)) +
  ggplot2::geom_point() +
  ggplot2::xlab("Year") +
  ggplot2::ylab("Index (mt)") +
  ggplot2::geom_line(ggplot2::aes(x = years, y = expected), color = "blue") +
  ggplot2::theme_bw()

Sensitivities

Multiple fits, i.e., sensitivity runs, can be set up by modifying the parameter list using update_parameters() or changing the data that is used to fit the model.

Initial values

For example, one could change the initial value used for the slope of the logistic curve for the survey to see if the terminal estimate changes due to changes to the initial value.

parameters_high_slope <- parameters |>
  update_parameters(
    modified_parameters = list(
      survey1 = list(
        LogisticSelectivity.slope.value = 2.5
      )
    )
  )

parameters_low_slope <- parameters |>
  update_parameters(
    modified_parameters = list(
      survey1 = list(
        LogisticSelectivity.slope.value = 1
      )
    )
  )

high_slope_fit <- parameters_high_slope |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = TRUE)
##  Starting optimization ...
##  Restarting optimizer 3 times to improve gradient.
##  Maximum gradient went from 0.03473 to 0.00193 after 3 steps.
##  Finished optimization
##  Finished sdreport
##  FIMS model version: 0.3.0.1
##  Total run time was 1.72079 seconds
##  Number of parameters: total=48, fixed_effects=48, and random_effects=0
##  Maximum gradient= 0.00193
##  Negative log likelihood (NLL):
##  Marginal NLL= 3299.32402
##  Total NLL= 3299.32402
##  Terminal SB=
clear()

low_slope_fit <- parameters_low_slope |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = TRUE)
##  Starting optimization ...
##  Restarting optimizer 3 times to improve gradient.
##  Maximum gradient went from 0.02996 to 0.00154 after 3 steps.
##  Finished optimization
##  Finished sdreport
##  FIMS model version: 0.3.0.1
##  Total run time was 1.49896 seconds
##  Number of parameters: total=48, fixed_effects=48, and random_effects=0
##  Maximum gradient= 0.00154
##  Negative log likelihood (NLL):
##  Marginal NLL= 3299.32402
##  Total NLL= 3299.32402
##  Terminal SB=
clear()

Age only

The same model can be fit to just the age data, removing the length data.

# Define fleet and survey with age-specific data distribution
fleet1 <- survey1 <- list(
  selectivity = list(form = "LogisticSelectivity"),
  data_distribution = c(
    Index = "DlnormDistribution",
    AgeComp = "DmultinomDistribution"
  )
)

# Create default parameters, update with modified values, initialize FIMS,
# and fit the model
age_only_fit <- data_4_model |>
  create_default_parameters(
    fleets = list(fleet1 = fleet1, survey1 = survey1)
  ) |>
  # update_parameters(modified_parameters = parameters$parameters) |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = TRUE)
##  Starting optimization ...
##  Restarting optimizer 3 times to improve gradient.
##  Maximum gradient went from 0.00298 to 0.00117 after 3 steps.
##  Finished optimization
##  Finished sdreport
##  FIMS model version: 0.3.0.1
##  Total run time was 0.77293 seconds
##  Number of parameters: total=79, fixed_effects=79, and random_effects=0
##  Maximum gradient= 0.00117
##  Negative log likelihood (NLL):
##  Marginal NLL= 2146.10345
##  Total NLL= 2146.10345
##  Terminal SB=
clear()

Length

The same model can be fit to just the length data, removing the age data.

# Define fleet and survey with length-specific data distribution
fleet1 <- survey1 <- list(
  selectivity = list(form = "LogisticSelectivity"),
  data_distribution = c(
    Index = "DlnormDistribution",
    LengthComp = "DmultinomDistribution"
  )
)

# Create default parameters, update with modified values, initialize FIMS,
# and fit the model
length_only_fit <- data_4_model |>
  create_default_parameters(
    fleets = list(fleet1 = fleet1, survey1 = survey1)
  ) |>
  update_parameters(modified_parameters = parameters$parameters) |>
  initialize_fims(data = data_4_model) |>
  fit_fims(optimize = TRUE)
##  Starting optimization ...
##  Restarting optimizer 3 times to improve gradient.
##  Maximum gradient went from 0.0164 to 0.00091 after 3 steps.
##  Finished optimization
##  Finished sdreport
##  FIMS model version: 0.3.0.1
##  Total run time was 1.58338 seconds
##  Number of parameters: total=48, fixed_effects=48, and random_effects=0
##  Maximum gradient= 0.00091
##  Negative log likelihood (NLL):
##  Marginal NLL= 1637.53322
##  Total NLL= 1637.53322
##  Terminal SB=
clear()