Chapter 10 Test case template

In this chapter, we describe how to write test cases for your FIMS code.

10.1 Introduction

The FIMS testing framework will include different types of testing to make sure that changes to the FIMS code work as expected. Unit and functional tests will be developed during the initial development stage, when individual functions or modules are written. After development of multiple modules is complete, integration tests will be developed to verify that the different modules work well together. Run-time checks will be added to the software to catch user input errors. Regression testing and platform compatibility testing will be executed before pre-releasing FIMS. Beta testing will be used to gather feedback from users (i.e., members of the FIMS implementation team and other users) during the pre-release stage. After releasing the first version of FIMS, the development team will go back to the beginning of the testing cycle and write unit tests whenever a new feature needs to be implemented. One-off tests will be used for testing new features and fixing user-reported bugs when maintaining FIMS. More details on each type of test can be found in the Glossary section.

FIMS will use GoogleTest to build a C++ unit testing framework and R testthat to build an R testing framework. FIMS will use Google Benchmark to measure the real time and CPU time used for running the produced binaries.

10.2 C++ unit testing and benchmarking

10.2.1 Requirements

To use GoogleTest, you will need:

  • A compatible operating system (e.g., Windows, macOS, or Linux).

  • A C++ compiler that supports at least the C++11 standard (e.g., gcc 5.0+, clang 5.0+, or MSVC 2015+). For macOS users, Xcode 9.3+ provides clang 5.0. For R users, Rtools4 includes gcc.

  • A build system for building the testing project. CMake and a compatible build tool such as Ninja are approved by NMFS HQ.

10.2.2 Setup for Windows users

  1. Download CMake 3.22.1 (cmake-3.22.1-windows-x86_64.zip), extract it, and put the extracted folder in Documents\Apps or another preferred folder.

  2. Download Ninja v1.10.2 (ninja-win.zip), extract it, and put the executable in Documents\Apps or another preferred folder.

  3. Open your Command Prompt and type cmake. If you see details of usage, cmake is already in your PATH. If not, follow the instructions below to add cmake to your PATH.

  4. In the same Command Prompt, type ninja. If you see a message that starts with ninja:, even if it is an error about not finding build.ninja, ninja is already in your PATH. If ninja is not found, follow the instructions below to add ninja to your PATH.

10.2.3 Adding cmake and ninja to your PATH on Windows

  1. In the Windows search bar next to the start menu, search for Edit environment variables for your account and open the Environment Variables window.

  2. Click Edit... under the User variables for firstname.lastname section.

  3. Click New and add the path to cmake, if needed (e.g., cmake-3.22.1-windows-x86_64\bin or C:\Program Files\CMake\bin are common paths), then click OK. Click New and add the path to the folder containing the Ninja executable, if needed (e.g., Documents\Apps), then click OK.

  4. You may need to restart your computer for the updated environment variables to take effect.

  5. Note that at certain Fisheries Science Centers, NOAA employees do not have the administrative privileges needed to edit the local PATH environment variable. In this situation it is necessary to create a ticket with IT to add cmake and ninja to your PATH on Windows.

10.2.4 Setup for Linux and Mac users

  1. See the CMake installation instructions for installing CMake on other platforms. Add cmake to your PATH.

  2. Download Ninja v1.10.2 (ninja-linux.zip or ninja-mac.zip, depending on your platform) and put the binary in your preferred location. Add Ninja to your PATH.

  3. Open a command window and type cmake. If you see details of usage, cmake is found. If not, cmake may still need to be added to your PATH.

  4. Open a command window and type ninja. If you see a message starting with ninja:, ninja is found. Otherwise, try changing the file permissions or adding ninja to your PATH, as described below.

10.2.5 How to edit your PATH and change file permissions for Linux and Mac

To check whether the binary is in your PATH (assuming the binary is named ninja), open a Terminal window, type which ninja, and hit enter. If nothing is returned, ninja is not in your PATH. The easiest way to fix this is to move the ninja binary to a folder that is already in your PATH. To find the existing PATH folders, type echo $PATH in the terminal and hit enter. Then move the ninja binary to one of these folders. For example, in a Terminal window type:

sudo cp ~/Downloads/ninja /usr/bin/

This copies ninja from the Downloads folder to /usr/bin. You will need to use sudo and enter your password to have permission to copy a file into a folder like /usr/bin/.

Also note that you may need to add executable permissions to the ninja binary after downloading it. You can do that by switching to the folder where you placed the binary (cd /usr/bin/ if you followed the instructions above), and running the command:

sudo chmod +x ninja

Check that ninja is now executable and in your path:

which ninja

If you followed the instructions above, you will see the following line returned:

/usr/bin/ninja

10.2.6 Set up FIMS testing project

Create an empty folder named “FIMS_project.” Within the folder, create two subfolders: “tests,” which will contain the testing code, and “src,” which will contain the source code. Within the tests subfolder, create a file named CMakeLists.txt. Declare a dependency on GoogleTest by adding the following contents to the CMakeLists.txt file:

cmake_minimum_required(VERSION 3.14)
project(FIMS_project)

# GoogleTest requires at least C++11
set(CMAKE_CXX_STANDARD 11)

include(FetchContent)
FetchContent_Declare(
  googletest
  URL https://github.com/google/googletest/archive/refs/tags/release-1.11.0.zip
)

# For Windows: Prevent overriding the parent project's compiler/linker settings
set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
FetchContent_MakeAvailable(googletest)

10.2.7 Unit test example

Create a file dlognorm.hpp within the src subfolder that contains a simple function:

#include <cmath>

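// Returns the log of the lognormal density of x with parameters meanlog and sdlog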
template<class Type>
Type dlognorm(Type x, Type meanlog, Type sdlog){
  Type resid = (log(x)-meanlog)/sdlog;
  Type logres = -log(sqrt(2*M_PI)) - log(sdlog) - Type(0.5)*resid*resid - log(x);
  return logres; 
}

Then, create a test file dlognorm-unit.cpp in the tests subfolder that has a test suite for the dlognorm function:

#include "gtest/gtest.h"
#include "../src/dlognorm.hpp"

// # R code that generates true values for the test
// dlnorm(1.0, 0.0, 1.0, TRUE) = -0.9189385
// dlnorm(5.0, 10.0, 2.5, TRUE) = -9.07679

namespace {

  // TestSuiteName: dlognormTest; TestName: DoubleInput and IntInput
  // Test dlognorm with double input values
  
  TEST(dlognormTest, DoubleInput) {
    
    EXPECT_NEAR( dlognorm(1.0, 0.0, 1.0) , -0.9189385 , 0.0001 ); 
    EXPECT_NEAR( dlognorm(5.0, 10.0, 2.5) , -9.07679 , 0.0001 ); 
    
  }
  
  // Test dlognorm with integer input values
  
  TEST(dlognormTest, IntInput) {
    
    EXPECT_NEAR( dlognorm(1, 0, 1) , -0.9189385 , 0.0001 );
    
  }
  
}

EXPECT_NEAR(val1, val2, absolute_error) verifies that the difference between val1 and val2 does not exceed the absolute error bound absolute_error. EXPECT_NE(val1, val2) verifies that val1 is not equal to val2. Please see GoogleTest assertions reference for more EXPECT_ macros.
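For example (a minimal sketch, not part of the FIMS test suite), EXPECT_NE could be added to the test suite in dlognorm-unit.cpp to check that the log-density changes when the input value changes:

  // Test that two different input values do not give the same log-density
  TEST(dlognormTest, DifferentInputsGiveDifferentValues) {

    EXPECT_NE( dlognorm(1.0, 0.0, 1.0) , dlognorm(5.0, 0.0, 1.0) );

  }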

10.2.8 Add tests to CMakeLists.txt and run a binary

To build the code, add the following contents to the end of the CMakeLists.txt file:

enable_testing()

add_executable(dlognorm_test
  dlognorm-unit.cpp
)

target_include_directories(dlognorm_test 
  PUBLIC
    ${CMAKE_SOURCE_DIR}/../
)

target_link_libraries(dlognorm_test
  gtest_main
)

include(GoogleTest)
gtest_discover_tests(dlognorm_test)

The above configuration enables testing in CMake, declares the C++ test binary you want to build (dlognorm_test), and links it to GoogleTest (gtest_main). Now you can build and run your test. Open a command window in FIMS_project/tests and type:

cmake -S . -B build -G Ninja

This generates the build system in the build subfolder (-B build), using the current folder (-S .) as the source directory and Ninja as the generator (-G Ninja).

Next, in the same command window, use cmake to build:

cd build 
cmake --build .

Finally, run the tests in the same command window:

ctest

The output when running ctest might look like this:

Start 1: dlognormTest.DoubleInput
1/2 Test #1: dlognormTest.DoubleInput .........   Passed    0.17 sec
Start 2: dlognormTest.IntInput
2/2 Test #2: dlognormTest.IntInput ............***Failed    0.11 sec

50% tests passed, 1 tests failed out of 2

Total Test time (real) =   0.32 sec

The following tests FAILED:
          2 - dlognormTest.IntInput (Failed)
Errors while running CTest
Output from these tests are in: C:/Users/FIMS/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

In this example the IntInput test fails because calling dlognorm(1, 0, 1) instantiates the template with Type = int, so the calculations and the returned value are truncated to integers rather than matching the expected log-density.

10.2.9 Benchmark example

Let’s use Google Benchmark to measure the real time and CPU time used for running the produced binary. We will continue using the dlognorm.hpp example. Create a benchmark file dlognorm_benchmark.cpp and put it in the tests subfolder within the FIMS_project folder:

#include "benchmark/benchmark.h"
#include "../src/dlognorm.hpp"

void BM_dlgnorm(benchmark::State& state)
{
  for (auto _ : state)
    dlognorm(5.0, 10.0, 2.5);
}
BENCHMARK(BM_dlgnorm);

Please see the examples in the Google Benchmark GitHub repository for a more comprehensive feature overview.
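As a further sketch (not part of the FIMS examples; BM_dlgnorm_x is a hypothetical name), a benchmark can be run over a range of input values by reading the argument from state.range() and registering the values with Range(); benchmark::DoNotOptimize prevents the compiler from optimizing the call away:

#include "benchmark/benchmark.h"
#include "../src/dlognorm.hpp"

void BM_dlgnorm_x(benchmark::State& state)
{
  // Use the benchmark argument as the input value x
  const double x = static_cast<double>(state.range(0));
  for (auto _ : state)
    benchmark::DoNotOptimize(dlognorm(x, 10.0, 2.5));
}
// Run the benchmark for x = 1, 8, 64, and 512
BENCHMARK(BM_dlgnorm_x)->Range(1, 512);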

10.2.10 Add benchmarks to CMakeLists.txt and run the benchmark

To build the code, add the following contents to the end of your CMakeLists.txt file:


FetchContent_Declare(
  googlebenchmark
  URL https://github.com/google/benchmark/archive/refs/tags/v1.6.0.zip
)
FetchContent_MakeAvailable(googlebenchmark)

add_executable(dlognorm_benchmark
  dlognorm_benchmark.cpp
)

target_include_directories(dlognorm_benchmark 
  PUBLIC
    ${CMAKE_SOURCE_DIR}/../
)

target_link_libraries(dlognorm_benchmark
  benchmark_main
)

To run the benchmark, run the following commands in the command window open in FIMS_project/tests/build (on Linux or macOS, the binary is ./dlognorm_benchmark rather than ./dlognorm_benchmark.exe):

cmake --build .
./dlognorm_benchmark.exe

The output from ./dlognorm_benchmark.exe might look like this:


Run on (8 X 2112 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 8192 KiB (x1)
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------
Benchmark           Time             CPU   Iterations
-----------------------------------------------------
BM_dlgnorm        153 ns          153 ns      4480000

10.2.11 Clean up after running C++ tests

10.2.11.1 Clean up CMake-generated files and re-run tests

After running the examples above, the build generates files (i.e., the downloaded source code, libraries, and executables) and saves them in the build folder. The examples above demonstrate an “out-of-source” build, which puts the generated files in a completely separate build directory so that the source tree is unchanged after running tests. Using separate source and build trees reduces the need to clean away files that differ between builds. If you would still like to clean up the CMake-generated files, simply delete the build folder and then build and run the tests again by repeating the commands below. The files in the build folder should not be pushed to the FIMS repository.

cmake -S . -B build -G Ninja
cd build 
cmake --build . --parallel 16
ctest

Here, --parallel 16 sets the maximum number of concurrent processes to use when building to 16.

To build and run tests without navigating to the build folder, you can use the commands below:

cmake -S . -B build -G Ninja 
cmake --build build --parallel 16
ctest --test-dir build

10.2.11.2 Clean up individual tests

For simple C++ functions like the examples above, there is no need to clean up after the tests. If you allocated memory for an object during testing, you need to delete the object (e.g., delete object). If you used a test fixture from GoogleTest to share the same data configuration across multiple tests, TearDown() can be used to clean up after each test before the fixture is destroyed. Please see the GoogleTest user’s guide for more details.
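For example (a minimal sketch with hypothetical names, not FIMS code), a test fixture allocates shared data in SetUp() and releases it in TearDown(), which runs after each test:

#include "gtest/gtest.h"

class SharedDataTest : public ::testing::Test {
 protected:
  void SetUp() override {
    // Allocate data shared by all tests in this fixture
    data = new double[3]{1.0, 5.0, 10.0};
  }
  void TearDown() override {
    // Clean up after each test
    delete[] data;
  }
  double* data = nullptr;
};

TEST_F(SharedDataTest, UsesSharedData) {
  EXPECT_GT(data[1], data[0]);
}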

10.3 Templates for GoogleTest testing

This section includes templates for creating unit tests and benchmarks. This is the code that would go into the .cpp files.

10.3.1 Unit test template

#include "gtest/gtest.h"
#include "../src/code.hpp"

// # R code that generates true values for the test

namespace {

  // Description of Test 1 
  TEST(TestSuiteName, Test1Name) {
    
    ... test body ... 
    
  }
  
  // Description of Test 2
  TEST(TestSuiteName, Test2Name) {
    
    ... test body ...
    
  }
  
}

10.3.2 Benchmark template

#include "benchmark/benchmark.h"
#include "../src/code.hpp"

void BM_FunctionName(benchmark::State& state)
{
  for (auto _ : state)
    // This code gets timed
    Function();
}

// Register the function as a benchmark
BENCHMARK(BM_FunctionName);

10.3.3 CMakeLists.txt template

# Add test suite 1
add_executable(TestSuiteName1
  test1.cpp
)

target_link_libraries(TestSuiteName1
  gtest_main
)

gtest_discover_tests(TestSuiteName1)

# Add test suite 2
add_executable(TestSuiteName2
  test2.cpp
)

target_link_libraries(TestSuiteName2
  gtest_main
)

gtest_discover_tests(TestSuiteName2)

# Add benchmark 1
add_executable(benchmark1
  benchmark1.cpp
)

target_link_libraries(benchmark1
  benchmark_main
)

# Add benchmark 2
add_executable(benchmark2
  benchmark2.cpp
)

target_link_libraries(benchmark2
  benchmark_main
)

10.4 R testing

FIMS uses the R package testthat for writing R tests. You can install the package by following the instructions on the testthat website. If you are not familiar with testthat, the testing chapter in the R Packages book gives a good overview of the testing workflow, along with an explanation of the test structure and concrete examples.

10.4.1 R testthat template

test_that("TestName", {
  
  ...test body...
  
})
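For example (a minimal sketch mirroring the C++ dlognorm unit test above, using the same expected values and tolerance), a testthat test comparing dlnorm() output against hard-coded expected values might look like this:

test_that("dlnorm returns the expected log-densities", {
  
  expect_equal(dlnorm(1.0, meanlog = 0.0, sdlog = 1.0, log = TRUE),
               -0.9189385, tolerance = 0.0001)
  expect_equal(dlnorm(5.0, meanlog = 10.0, sdlog = 2.5, log = TRUE),
               -9.07679, tolerance = 0.0001)
  
})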

10.5 Test case template and examples

10.5.1 Test case template

Individual functional or integration test cases will be designed following the template below.

  • Test ID. Create a meaningful name for the test case.

  • Features to be tested. Provide a brief statement of test objectives and a description of the features to be tested. (Identify the test items following the FIMS software design specification document, and identify any features that will not be tested along with the rationale for their exclusion.)

  • Approach. Specify the approach that will ensure that the features are adequately tested and specify which type of test is used in this case.

  • Evaluation criteria. Provide a list of expected results and acceptance criteria.

    • Pass/fail criteria. Specify the criteria used to determine whether each feature has passed or failed testing.

    • In addition to setting pass/fail criteria with specific tolerance values, documentation that simply displays the outputs of some tests may be useful when the tests require additional computations, simulations, or comparisons

  • Test deliverables. Identify all information that is to be delivered by the test activity.

    • Test logs and automated status reports

10.5.2 Test case examples

10.5.2.1 General test case

The test case below is a general case that can be applied to many functions/modules. For individual functions/modules, please write detailed test cases for their specific options, avoiding duplication as much as possible.

Test ID: General test case
Features to be tested:
  • The function/module returns correct output values given different input values

  • The function/module returns error messages when users give the wrong types of inputs

  • The function/module returns an error if an input value is outside the bounds of the input parameter

Approach:
  • Prepare expected true values using R

  • Run tests in R using testthat and compare output values with expected values

  • Push tests to the working repository and run tests using GitHub Actions

  • Run tests in different OS environments (windows-latest, macos-latest, and ubuntu-latest) using GitHub Actions

  • Submit pull request for code review

Evaluation criteria:
  • The tests pass if the output values equal the expected true values

  • The tests pass if the function/module returns error messages when users give the wrong types of inputs (see the testthat sketch after this test case)

  • The tests pass if the function/module returns an error message when the user provides an input value that is outside the bounds of the input parameter

Test deliverables:
  • Test logs on GitHub Actions
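For example (a minimal sketch with a hypothetical function check_positive, standing in for a FIMS function/module), testthat's expect_error() can be used to verify that informative error messages are returned for wrong input types and out-of-bound values:

test_that("informative errors are returned for bad inputs", {
  
  # Hypothetical input check standing in for a FIMS function/module
  check_positive <- function(x) {
    if (!is.numeric(x)) stop("`x` must be numeric")
    if (x <= 0) stop("`x` must be positive")
    x
  }
  
  expect_error(check_positive("a"), "must be numeric")
  expect_error(check_positive(-1), "must be positive")
  
})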

10.5.2.2 Functional test example: TMB probability mass function of the multinomial distribution

Test ID: Probability mass function of the multinomial distribution
Features to be tested:
  • Same as the general test case
Approach:

Functional test

  • Prepare expected true values using the R function dmultinom from the ‘stats’ package (see the sketch below)
Evaluation criteria:
  • Same as the general test case
Test deliverables:
  • Same as the general test case
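For example (a sketch with illustrative values only), the expected true value for the multinomial test can be generated in R with dmultinom():

# Expected log-probability mass for illustrative counts and cell probabilities
x <- c(1, 2, 3)             # observed counts
prob <- c(0.2, 0.3, 0.5)    # cell probabilities
dmultinom(x, prob = prob, log = TRUE)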

10.5.2.3 Integration test example: Li et al. 2021 age-structured stock assessment model comparison

Test ID: Age-structured stock assessment comparison (Li et al. 2021)
Features to be tested:
  • Null case (update standard deviation of the log of recruitment from 0.2 to 0.5 based on Siegfried et al. 2016 snapper-grouper complex)

  • Recruitment variability

  • Stochastic Fishing mortality (F)

  • F patterns (e.g., roller coaster: up then down, and down then up; constant F at low, MSY, and high levels)

  • Selectivity patterns

  • Recruitment bias adjustment

  • Initial condition

  • Unit of catch (numbers or weight)

  • Model misspecification (e.g., growth, natural mortality, steepness, catchability, etc.)

Approach:

Integration test

Evaluation criteria:
  • Summarize the median absolute relative error (MARE) between the true values from the operating model and the estimates from the FIMS estimation model (a sketch of the calculation follows this test case)

  • If all MAREs from the null case are less than 10% and all MAREs from the remaining cases are less than 15%, the tests pass. If any MARE is greater than 15%, a closer examination is needed.

Test deliverables:
  • In addition to the test logs on GitHub Actions, a document that includes comparison figures from the various cases (e.g., Figs. 5 and 6 from Li et al. 2021) will be automatically generated

  • A table that shows median absolute relative errors in unfished recruitment, catchability, spawning stock biomass, recruitment, fishing mortality, and reference points (e.g., Table 6 from Li et al. 2021) will be automatically generated
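For reference (a sketch, not FIMS code), the MARE used in the evaluation criteria can be computed in R as:

# Median absolute relative error between estimates and operating-model truth
mare <- function(estimate, truth) {
  median(abs((estimate - truth) / truth))
}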

10.5.2.4 Simulation testing: challenges and solutions

One challenge in comparing simulation results is that changing the order of calls to the simulation changes the simulated values, so tests may fail simply because different random numbers are used or because the order of the simulation changes during model development. Several solutions could be used to address this simulation testing issue. Please see the discussions on the FIMS-planning issue page for details.

  • Once we start developing simulation modules, there are two approaches that help compare simulated data from FIMS within a test.
    • Add a TRUE/FALSE parameter to each FIMS simulation module for setting up a testing seed. When testing the module, set the parameter to TRUE to fix the seed number in R and conduct the tests.
    • If adding a TRUE/FALSE parameter does not work as expected, carefully check the simulated data from each component and make sure any differences are not caused by a coding error in the model.
  • FIMS will use set.seed() from R to set the seed; a minimal sketch is shown below. The “rstream” package will be investigated if one of the requirements of the FIMS simulation module is to generate multiple streams of random numbers, so that distinct streams can be associated with different sources of randomness. rstream was specifically designed to address the need for very long streams of pseudo-random numbers in parallel computations. Please see the rstream paper and RngStreams for more details.
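As a minimal sketch (sim_data is a hypothetical stand-in for a FIMS simulation module), fixing the seed with set.seed() makes a simulation reproducible so that two runs can be compared in a testthat test:

test_that("simulated data are reproducible when the seed is fixed", {
  
  # Hypothetical simulation wrapper standing in for a FIMS simulation module
  sim_data <- function(n) rlnorm(n, meanlog = 0.0, sdlog = 1.0)
  
  set.seed(123)
  run1 <- sim_data(10)
  
  set.seed(123)
  run2 <- sim_data(10)
  
  expect_equal(run1, run2)
  
})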