Introducing the Fisheries Integrated Modeling System (FIMS)
Source:vignettes/fims-demo.Rmd
fims-demo.Rmd
FIMS
The NOAA Fisheries Integrated Modeling System (FIMS) is a new modeling framework for fisheries modeling. The framework is designed to support next-generation fisheries stock assessment, ecosystem, and socioeconomic modeling. It is important to note that FIMS itself is not a model but rather a framework for creating models. The framework is made up of many modules that come together to create a model that best suits the needs of the end-user. The remainder of this vignette walks through what is absolutely necessary to run a FIMS catch-at-age model using the default settings.
Memory
Calling library(FIMS)
loads the R package and Rcpp
functions and modules into the R environment. The C++ code is compiled
upon installation rather than loading so the call to
library()
should be pretty fast. Users should always run
clear()
prior to modeling to ensure that the C++ memory
from any previous FIMS model run is cleared out.
Data
Data for a FIMS model must be stored in a single data frame with a
long format, e.g., data("data1", package = "FIMS")
. The
design is similar to running a linear model where you pass a single data
frame to lm()
. The long format does lead to some
information being duplicated. For example, the units are listed for
every row rather than stored in a single location for each data type.
But, the long format facilitates using tidy functions to manipulate the
data. And, a single function, i.e., FIMSFrame()
, is all
that is needed to prepare the data to be used in a FIMS model.
data1
A sample data frame for a catch-at-age model with both ages and
lengths is stored in the package as data1
. This data set is
based on data that was used in Li et al. for the
Model Comparison Project (github
site). The length data have since been added data-raw/data1.R
based on an age-length conversion matrix. See R/data1.R
or ?data1
for details about the package data.
FIMSFrame()
The easiest way to prepare the data for a FIMS model is to use
FIMSFrame()
. This function performs several validation
checks and returns an object of the S4 class called
FIMSFrame
. There are helper functions for working with a
FIMSFrame
object, e.g., get_data()
,
get_n_years()
, get_*()
. Additionally, there
are helper functions for pulling data out of the S4 class in the format
needed for a module, i.e., a vector, but these m_*()
functions are largely meant to be used internally within the package and
are only exported to allow for their use by power users wishing to
manually set up.
# Bring the package data into your environment
data("data1")
# Prepare the package data for being used in a FIMS model
data_4_model <- FIMSFrame(data1)
The S4 object that we named data_4_model
contains many
slots (i.e., named components of the object that can be accessed) but
perhaps the most interesting one is the long data frame stored in the
“data” slot. The tibble stored in this slot can be accessed using
get_data()
.
# Use show() to see what is stored in the FIMSFrame S4 class
methods::show(data_4_model)
## tbl_df of class 'FIMSFrame'
## with the following 'types': age, age-to-length-conversion, landings, length, weight-at-age, index
## # A tibble: 6 × 10
## type name age length datestart dateend value unit uncertainty year
## <chr> <chr> <int> <dbl> <date> <date> <dbl> <chr> <dbl> <dbl>
## 1 age fleet1 1 NA 1-01-01 1-12-31 0.07 proportion 200 1
## 2 age fleet1 2 NA 1-01-01 1-12-31 0.1 proportion 200 1
## 3 age fleet1 3 NA 1-01-01 1-12-31 0.115 proportion 200 1
## 4 age fleet1 4 NA 1-01-01 1-12-31 0.15 proportion 200 1
## 5 age fleet1 5 NA 1-01-01 1-12-31 0.1 proportion 200 1
## 6 age fleet1 6 NA 1-01-01 1-12-31 0.05 proportion 200 1
## additional slots include the following:fleets:
## [1] 1
## n_years:
## [1] 30
## ages:
## [1] 1 2 3 4 5 6 7 8 9 10 11 12
## n_ages:
## [1] 12
## lengths:
## [1] 0 50 100 150 200 250 300 350 400 450 500 550 600 650 700
## [16] 750 800 850 900 950 1000 1050 1100
## n_lengths:
## [1] 23
## start_year:
## [1] 1
## end_year:
## [1] 30
# Or, look at the structure using str()
# Increase max.level to see more of the structure
str(data_4_model, max.level = 1)
## Formal class 'FIMSFrame' [package "FIMS"] with 9 slots
# Use dplyr to subset the data for just the landings
get_data(data_4_model) |>
dplyr::filter(type == "landings")
## # A tibble: 30 × 10
## type name age length datestart dateend value unit uncertainty year
## <chr> <chr> <int> <dbl> <date> <date> <dbl> <chr> <dbl> <dbl>
## 1 landi… flee… NA NA 1-01-01 1-12-31 162. mt 0.0100 1
## 2 landi… flee… NA NA 2-01-01 2-12-31 461. mt 0.0100 2
## 3 landi… flee… NA NA 3-01-01 3-12-31 747. mt 0.0100 3
## 4 landi… flee… NA NA 4-01-01 4-12-31 997. mt 0.0100 4
## 5 landi… flee… NA NA 5-01-01 5-12-31 768. mt 0.0100 5
## 6 landi… flee… NA NA 6-01-01 6-12-31 1344. mt 0.0100 6
## 7 landi… flee… NA NA 7-01-01 7-12-31 1319. mt 0.0100 7
## 8 landi… flee… NA NA 8-01-01 8-12-31 2598. mt 0.0100 8
## 9 landi… flee… NA NA 9-01-01 9-12-31 1426. mt 0.0100 9
## 10 landi… flee… NA NA 10-01-01 10-12-31 1644. mt 0.0100 10
## # ℹ 20 more rows
The data contains the following fleets:
- A single fishery fleet with age- and length-composition, weight-at-age, and landings data
- A single survey with age- and length-composition and index data
Parameters
The parameters that are in the model will depend on which modules are used from the FIMS framework. This combination of modules rather than the use of a control file negates the need for complicated if{} else{} statements in the code.
create_default_parameters()
By passing the data to create_default_parameters()
the
function can tailor the defaults based on how many fleets there are and
what data types exist. For example, if you have three fleets, then
create_default_parameters()
will set up three logistic
selectivity modules.
Modules that are available in FIMS are known as reference classes in the C++ code. Each reference class acts as an interface between R and the underlining C++ code that defines FIMS. Several reference classes exist and several more will be created in the future. The beauty of having modules rather than a control file really comes out when more reference classes are created because each reference class can be accessed through R by itself to build up a model rather than needing to modify a control file for future features.
By just passing lists of fleet specifications and the data to
create_default_parameters()
, the default values for
parameters that relate to fleet(s), recruitment, growth, and maturity
modules can be created. For example,
- “BevertonHoltRecruitment” for the recruitment module
- “DnormDistribution” for recruitment deviations (log_devs)
- “EWAAgrowth” for the growth module, and
- “LogisticMaturity” for maturity module.
# Define the same fleet specifications for fleet1 and survey1
fleet1 <- survey1 <- list(
selectivity = list(form = "LogisticSelectivity"),
data_distribution = c(
Index = "DlnormDistribution",
AgeComp = "DmultinomDistribution",
LengthComp = "DmultinomDistribution"
)
)
# Create default recruitment, growth, and maturity parameters
default_parameters <- data_4_model |>
create_default_parameters(
fleets = list(fleet1 = fleet1, survey1 = survey1)
)
The argument names and their corresponding default values of the
create_default_parameters() function can be displayed using
args()
.
args(create_default_parameters)
## function (data, fleets, recruitment = list(form = "BevertonHoltRecruitment",
## process_distribution = c(log_devs = "DnormDistribution")),
## growth = list(form = "EWAAgrowth"), maturity = list(form = "LogisticMaturity"))
## NULL
update_parameters()
In the future, the developers of FIMS may update the default
parameters to experiment with different values. Regardless, you can
still use create_default_parameters()
as a starting point
because it will provide information on the appropriate dimensions and
necessary elements that the final list must contain. And, although it is
a good idea to modify the returned defaults with
update_parameters()
, you can update the list manually.
In the code below, update_parameters()
is used to adjust
the fishing mortality, selectivity, maturity, and population parameters
from their default values. If the parameters are estimated, these
updates will change their starting values, and if they are fixed, these
updates will change their value used in the model.
# Each call to update_parameters() returns the full list so the pipe can be
# used to daisy chain all of these updates together to a new object called
# parameters that will be used to fit the model
parameters <- default_parameters |>
update_parameters(
modified_parameters = list(
fleet1 = list(
Fleet.log_Fmort.value = log(c(
0.009459165, 0.027288858, 0.045063639,
0.061017825, 0.048600752, 0.087420554,
0.088447204, 0.186607929, 0.109008958,
0.132704335, 0.150615473, 0.161242955,
0.116640187, 0.169346119, 0.180191913,
0.161240483, 0.314573212, 0.257247574,
0.254887252, 0.251462108, 0.349101406,
0.254107720, 0.418478117, 0.345721184,
0.343685540, 0.314171227, 0.308026829,
0.431745298, 0.328030899, 0.499675368
))
)
)
) |>
update_parameters(
modified_parameters = list(
survey1 = list(
LogisticSelectivity.inflection_point.value = 1.5,
LogisticSelectivity.slope.value = 2,
Fleet.log_q.value = log(3.315143e-07)
)
)
) |>
update_parameters(
modified_parameters = list(
recruitment = list(
BevertonHoltRecruitment.log_rzero.value = log(1e+06),
BevertonHoltRecruitment.log_devs.value = c(
0.43787763, -0.13299042, -0.43251973, 0.64861200, 0.50640852,
-0.06958319, 0.30246260, -0.08257384, 0.20740372, 0.15289604,
-0.21709207, -0.13320626, 0.11225374, -0.10650836, 0.26877132,
0.24094126, -0.54480751, -0.23680557, -0.58483386, 0.30122785,
0.21930545, -0.22281699, -0.51358369, 0.15740234, -0.53988240,
-0.19556523, 0.20094360, 0.37248740, -0.07163145
),
BevertonHoltRecruitment.log_devs.estimated = FALSE
)
)
) |>
update_parameters(
modified_parameters = list(
maturity = list(
LogisticMaturity.inflection_point.value = 2.25,
LogisticMaturity.inflection_point.estimated = FALSE,
LogisticMaturity.slope.value = 3,
LogisticMaturity.slope.estimated = FALSE
)
)
) |>
update_parameters(
modified_parameters = list(
population = list(
Population.log_init_naa.value = c(
13.80944, 13.60690, 13.40217, 13.19525, 12.98692, 12.77791,
12.56862, 12.35922, 12.14979, 11.94034, 11.73088, 13.18755
)
)
)
)
Fit
With data and parameters in place, we can now initialize modules
using initialize_fims()
and fit the model using
fit_fims()
.
initialize_fims()
The list returned by create_default_parameters()
has two
elements, parameters and modules. But, these are just lists of lists
containing specifications. Nothing has been created in memory as of yet.
To actually initialize the modules specified in
parameters[["modules"]]
, initialize_fims()
needs to be called. This function takes all of the specifications and
matches them with the appropriate data to initialize a module and create
the pointers to the memory.
fit_fims()
The list returned from initialize_fims()
can be passed
to the parameter of fit_fims()
called input
to
run a FIMS model. If optimize = FALSE
, the model will not
actually be optimized but instead just checked to ensure it is a viable
model. When optimize = TRUE
, the model will be fit using
stats::nlminb()
and an object of the class
FIMSFit
will be returned.
Example
# Run the model without optimization to help ensure a viable model
test_fit <- parameters |>
initialize_fims(data = data_4_model) |>
fit_fims(optimize = FALSE)
# Run the model with optimization
fit <- parameters |>
initialize_fims(data = data_4_model) |>
fit_fims(optimize = TRUE)
## ✔ Starting optimization ...
## ℹ Restarting optimizer 3 times to improve gradient.
## ℹ Maximum gradient went from 0.01324 to 0.00163 after 3 steps.
## ✔ Finished optimization
## ✔ Finished sdreport
## ℹ FIMS model version: 0.3.0.1
## ℹ Total run time was 1.77504 seconds
## ℹ Number of parameters: total=48, fixed_effects=48, and random_effects=0
## ℹ Maximum gradient= 0.00163
## ℹ Negative log likelihood (NLL):
## • Marginal NLL= 3299.32402
## • Total NLL= 3299.32402
## ℹ Terminal SB=
# Get information about the model and print a few characters to the screen
recruitment_log <- get_log_module("information")
substr(recruitment_log, 1, 100)
## [1] "[\n{\n\"timestamp\" : \"Fri Jan 17 15:10:51 2025\",\n\"level\" : \"info\",\n\"message\" : \"Initializing fleet 1\",\n"
# Clear memory post-run
clear()
The results can be plotted with either base R or using {ggplot2}.
index_results <- data.frame(
observed = m_index(data_4_model, "survey1"),
expected = get_report(fit)[["exp_index"]][[2]],
years = get_start_year(data_4_model):get_end_year(data_4_model)
)
print(index_results)
## observed expected years
## 1 0.006117418 0.006456369 1
## 2 0.007156588 0.006519662 2
## 3 0.006553376 0.006541949 3
## 4 0.006501745 0.006434017 4
## 5 0.006562572 0.006190002 5
## 6 0.006467796 0.006186283 6
## 7 0.005248220 0.006105242 7
## 8 0.006568311 0.005990934 8
## 9 0.005516030 0.005385850 9
## 10 0.005614254 0.005184481 10
## 11 0.004269091 0.004911050 11
## 12 0.004842784 0.004601776 12
## 13 0.004018395 0.004254148 13
## 14 0.004741539 0.004082059 14
## 15 0.003444776 0.003778219 15
## 16 0.004062555 0.003521058 16
## 17 0.003790370 0.003415195 17
## 18 0.002892876 0.002979243 18
## 19 0.002765623 0.002695108 19
## 20 0.002349581 0.002406346 20
## 21 0.002007219 0.002167876 21
## 22 0.001883646 0.001913676 22
## 23 0.001908972 0.001932545 23
## 24 0.001596426 0.001690079 24
## 25 0.001248675 0.001537510 25
## 26 0.001459593 0.001434889 26
## 27 0.001255182 0.001371307 27
## 28 0.001314476 0.001320515 28
## 29 0.001364266 0.001232807 29
## 30 0.001343069 0.001323075 30
ggplot2::ggplot(index_results, ggplot2::aes(x = years, y = observed)) +
ggplot2::geom_point() +
ggplot2::xlab("Year") +
ggplot2::ylab("Index (mt)") +
ggplot2::geom_line(ggplot2::aes(x = years, y = expected), color = "blue") +
ggplot2::theme_bw()
catch_results <- data.frame(
observed = m_landings(data_4_model, fleet = "fleet1"),
expected = get_report(fit)[["exp_index"]][[1]],
years = get_start_year(data_4_model):get_end_year(data_4_model)
)
print(catch_results)
## observed expected years
## 1 161.6455 161.6443 1
## 2 461.0895 461.0773 2
## 3 747.2900 747.2898 3
## 4 996.9711 997.0286 4
## 5 767.5477 767.5720 5
## 6 1343.8609 1344.2958 6
## 7 1319.0663 1319.7363 7
## 8 2597.5051 2600.7624 8
## 9 1425.8608 1426.5970 9
## 10 1643.6742 1644.3563 10
## 11 1770.8245 1771.6837 11
## 12 1751.2628 1752.6393 12
## 13 1194.0655 1194.6511 13
## 14 1638.4036 1639.2516 14
## 15 1584.1855 1584.5228 15
## 16 1333.1820 1333.1671 16
## 17 2325.4425 2326.6764 17
## 18 1693.7841 1694.2546 18
## 19 1531.8238 1532.1958 19
## 20 1358.8162 1359.6414 20
## 21 1638.8798 1638.8533 21
## 22 1077.3713 1076.1837 22
## 23 1647.9104 1643.9847 23
## 24 1224.4634 1222.6682 24
## 25 1126.9182 1125.9607 25
## 26 956.4529 956.2479 26
## 27 902.1621 901.9440 27
## 28 1168.8217 1168.6534 28
## 29 852.3819 851.7908 29
## 30 1274.3777 1274.2575 30
Sensitivities
Multiple fits, i.e., sensitivity runs, can be set up by modifying the
parameter list using update_parameters()
or changing the
data that is used to fit the model.
Initial values
For example, one could change the initial value used for the slope of the logistic curve for the survey to see if the terminal estimate changes due to changes to the initial value.
parameters_high_slope <- parameters |>
update_parameters(
modified_parameters = list(
survey1 = list(
LogisticSelectivity.slope.value = 2.5
)
)
)
parameters_low_slope <- parameters |>
update_parameters(
modified_parameters = list(
survey1 = list(
LogisticSelectivity.slope.value = 1
)
)
)
high_slope_fit <- parameters_high_slope |>
initialize_fims(data = data_4_model) |>
fit_fims(optimize = TRUE)
## ✔ Starting optimization ...
## ℹ Restarting optimizer 3 times to improve gradient.
## ℹ Maximum gradient went from 0.03473 to 0.00193 after 3 steps.
## ✔ Finished optimization
## ✔ Finished sdreport
## ℹ FIMS model version: 0.3.0.1
## ℹ Total run time was 1.72079 seconds
## ℹ Number of parameters: total=48, fixed_effects=48, and random_effects=0
## ℹ Maximum gradient= 0.00193
## ℹ Negative log likelihood (NLL):
## • Marginal NLL= 3299.32402
## • Total NLL= 3299.32402
## ℹ Terminal SB=
clear()
low_slope_fit <- parameters_low_slope |>
initialize_fims(data = data_4_model) |>
fit_fims(optimize = TRUE)
## ✔ Starting optimization ...
## ℹ Restarting optimizer 3 times to improve gradient.
## ℹ Maximum gradient went from 0.02996 to 0.00154 after 3 steps.
## ✔ Finished optimization
## ✔ Finished sdreport
## ℹ FIMS model version: 0.3.0.1
## ℹ Total run time was 1.49896 seconds
## ℹ Number of parameters: total=48, fixed_effects=48, and random_effects=0
## ℹ Maximum gradient= 0.00154
## ℹ Negative log likelihood (NLL):
## • Marginal NLL= 3299.32402
## • Total NLL= 3299.32402
## ℹ Terminal SB=
clear()
Age only
The same model can be fit to just the age data, removing the length data.
# Define fleet and survey with age-specific data distribution
fleet1 <- survey1 <- list(
selectivity = list(form = "LogisticSelectivity"),
data_distribution = c(
Index = "DlnormDistribution",
AgeComp = "DmultinomDistribution"
)
)
# Create default parameters, update with modified values, initialize FIMS,
# and fit the model
age_only_fit <- data_4_model |>
create_default_parameters(
fleets = list(fleet1 = fleet1, survey1 = survey1)
) |>
# update_parameters(modified_parameters = parameters$parameters) |>
initialize_fims(data = data_4_model) |>
fit_fims(optimize = TRUE)
## ✔ Starting optimization ...
## ℹ Restarting optimizer 3 times to improve gradient.
## ℹ Maximum gradient went from 0.00298 to 0.00117 after 3 steps.
## ✔ Finished optimization
## ✔ Finished sdreport
## ℹ FIMS model version: 0.3.0.1
## ℹ Total run time was 0.77293 seconds
## ℹ Number of parameters: total=79, fixed_effects=79, and random_effects=0
## ℹ Maximum gradient= 0.00117
## ℹ Negative log likelihood (NLL):
## • Marginal NLL= 2146.10345
## • Total NLL= 2146.10345
## ℹ Terminal SB=
clear()
Length
The same model can be fit to just the length data, removing the age data.
# Define fleet and survey with length-specific data distribution
fleet1 <- survey1 <- list(
selectivity = list(form = "LogisticSelectivity"),
data_distribution = c(
Index = "DlnormDistribution",
LengthComp = "DmultinomDistribution"
)
)
# Create default parameters, update with modified values, initialize FIMS,
# and fit the model
length_only_fit <- data_4_model |>
create_default_parameters(
fleets = list(fleet1 = fleet1, survey1 = survey1)
) |>
update_parameters(modified_parameters = parameters$parameters) |>
initialize_fims(data = data_4_model) |>
fit_fims(optimize = TRUE)
## ✔ Starting optimization ...
## ℹ Restarting optimizer 3 times to improve gradient.
## ℹ Maximum gradient went from 0.0164 to 0.00091 after 3 steps.
## ✔ Finished optimization
## ✔ Finished sdreport
## ℹ FIMS model version: 0.3.0.1
## ℹ Total run time was 1.58338 seconds
## ℹ Number of parameters: total=48, fixed_effects=48, and random_effects=0
## ℹ Maximum gradient= 0.00091
## ℹ Negative log likelihood (NLL):
## • Marginal NLL= 1637.53322
## • Total NLL= 1637.53322
## ℹ Terminal SB=
clear()