Chapter 10 Testing
This section describes testing for FIMS. FIMS uses Google Test for C++ unit testing and testthat for R unit testing.
10.1 Introduction
The FIMS testing framework will include different types of tests to make sure that changes to FIMS code work as expected. Unit and functional tests will be developed during the initial development stage, when individual functions or modules are written. After multiple modules have been developed, integration tests will be added to verify that the modules work well together. Run-time checks will be added to the software to catch user input errors. Regression testing and platform compatibility testing will be executed before a pre-release of FIMS. Beta testing will be used to gather feedback from users (i.e., members of the FIMS implementation team and other users) during the pre-release stage. After the first version of FIMS is released, the development team will return to the beginning of the testing cycle and write unit tests whenever a new feature needs to be implemented. One-off testing will be used for testing new features and fixing user-reported bugs when maintaining FIMS. More details on each type of test can be found in the Glossary section.
FIMS will use GoogleTest to build a C++ unit testing framework and R testthat to build an R testing framework. FIMS will use Google Benchmark to measure the real time and CPU time used for running the produced binaries.
10.2 C++ unit testing and benchmarking
10.2.1 Requirements
To use GoogleTest, you will need:
A compatible operating system (e.g., Windows, macOS, or Linux).
A C++ compiler that supports at least the C++11 standard (e.g., gcc 5.0+, clang 5.0+, or MSVC 2015+). For macOS users, Xcode 9.3+ provides clang 5.0. For R users, rtools4 includes gcc.
A build system for building the testing project. CMake and a compatible build tool such as Ninja are approved by NMFS HQ.
10.2.2 Setup for Windows users
- Download CMake 3.22.1 (cmake-3.22.1-windows-x86_64.zip) and put the extracted folder in `Documents\Apps` or another preferred folder.
- Download ninja v1.10.2 (ninja-win.zip) and put the application in `Documents\Apps` or another preferred folder.
- Open your Command Prompt and type `cmake`. If you see details of usage, cmake is already in your PATH. If not, follow the instructions below to add cmake to your PATH.
- In the same Command Prompt, type `ninja`. If you see a message that starts with `ninja:`, even if it is an error about not finding build.ninja, ninja is already in your PATH. If ninja is not found, follow the instructions below to add ninja to your PATH.
10.2.3 Adding cmake and ninja to your PATH on Windows
- In the Windows search bar next to the start menu, search for `Edit environment variables for your account` and open the `Environment Variables` window.
- Under the `User variables for firstname.lastname` section, select `Path` and click `Edit...`.
- Click `New`, add the path to cmake, if needed (e.g., `cmake-3.22.1-windows-x86_64\bin` or `C:\Program Files\CMake\bin` are common paths), and click `OK`.
- Click `New`, add the path to the location of the Ninja executable, if needed (e.g., `Documents\Apps\ninja-win` or `C:\Program Files\ninja-win`), and click `OK`.
- You may need to restart your computer to update the environment variables. You can check that the path is working by running `where cmake` or `where ninja` in a command terminal.
Note that in certain Fisheries centers, NOAA employees do not have the administrative privileges needed to edit the local environment path. In this situation it is necessary to create a ticket with IT to add cmake and ninja to your PATH on Windows.
10.2.4 Setup for Linux and Mac users
- See the CMake installation instructions for installing CMake on other platforms. Add cmake to your PATH. You can check that the path is working by running `which cmake` in a command window.
- Download ninja v1.10.2 (ninja-linux.zip or ninja-mac.zip, depending on your platform) and put the binary in your preferred location. Add Ninja to your PATH. You can check that the path is working by running `which ninja` in a command window.
- Open a command window and type `cmake`. If you see usage details, cmake is found. If not, cmake may still need to be added to your PATH.
- Open a command window and type `ninja`. If you see a message starting with `ninja:`, ninja is found. Otherwise, try changing the file permissions or adding ninja to your PATH.
10.2.5 How to edit your PATH and change file permissions for Linux and Mac
To check if the binary is in your path, assuming the binary is named ninja, open a Terminal window, type `which ninja`, and hit enter. If nothing is returned, then ninja is not in your path. The easiest way to fix this is to put the ninja binary in a folder that is already in your path. To find existing path folders, type `echo $PATH` in the terminal and hit enter. Now copy the ninja binary to one of these folders. For example, in a Terminal window type:
sudo cp ~/Downloads/ninja /usr/bin/
This copies ninja from the downloads folder to /usr/bin. You will need to use `sudo` and enter your password afterwards to have permission to write to a folder like `/usr/bin/`.
Also note that you may need to add executable permissions to the ninja binary after downloading it. You can do that by switching to the folder where you placed the binary (`cd /usr/bin/` if you followed the instructions above) and running the command:
sudo chmod +x ninja
Check that ninja is now executable and in your path:
which ninja
If you followed the instructions above, you will see the following line returned:
/usr/bin/ninja
10.2.6 Set up FIMS testing project
Clone the FIMS repository on the command line using:
git clone https://github.com/NOAA-FIMS/FIMS.git
cd FIMS
There is a file called CMakeLists.txt in the top level of the directory. This file instructs CMake on how to create the build files, including setting up Google Test.
The Google Test testing code is in the tests/gtest subdirectory. Within this subdirectory is a file called CMakeLists.txt. This file contains additional specifications for CMake, in particular instructions on how to register the individual tests.
10.2.7 Build and run the tests
Three commands on the command line are needed to build and run the tests. First, generate the build system:
cmake -S . -B build -G Ninja
This generates the build system using Ninja as the generator. Note that there is now a subfolder called build.
Next, in the same command window, use cmake to build in the build subfolder:
cmake --build build
Finally, run the C++ tests:
ctest --test-dir build
The output from running the tests should look something like:
Internal ctest changing into directory: C:/github_repos/NOAA-FIMS_org/FIMS/build
Test project C:/github_repos/NOAA-FIMS_org/FIMS/build
Start 1: dlognorm.use_double_inputs
1/5 Test #1: dlognorm.use_double_inputs ....... Passed 0.04 sec
Start 2: dlognorm.use_int_inputs
2/5 Test #2: dlognorm.use_int_inputs .......... Passed 0.04 sec
Start 3: modelTest.eta
3/5 Test #3: modelTest.eta .................... Passed 0.04 sec
Start 4: modelTest.nll
4/5 Test #4: modelTest.nll .................... Passed 0.04 sec
Start 5: modelTest.evaluate
5/5 Test #5: modelTest.evaluate ............... Passed 0.04 sec
100% tests passed, 0 tests failed out of 5
10.2.8 Adding a C++ test
Create a file dlognorm.hpp within the src subfolder that contains a simple function:
#include <cmath>
// Log of the lognormal probability density at x, given meanlog and sdlog
template<class Type>
Type dlognorm(Type x, Type meanlog, Type sdlog){
  Type resid = (log(x)-meanlog)/sdlog;
  Type logres = -log(sqrt(2*M_PI)) - log(sdlog) - Type(0.5)*resid*resid - log(x);
  return logres;
}
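For reference, the function above evaluates the log of the lognormal probability density. Writing $\mu$ for `meanlog` and $\sigma$ for `sdlog` (notation introduced here for illustration, not taken from the code), the expression it computes is:

```latex
\log f(x \mid \mu, \sigma)
  = -\log\!\left(\sqrt{2\pi}\right) - \log(\sigma)
    - \frac{1}{2}\left(\frac{\log(x) - \mu}{\sigma}\right)^{2} - \log(x),
  \qquad x > 0,\ \sigma > 0 .
```

With $x = 1$, $\mu = 0$, and $\sigma = 1$, every term except the first vanishes, giving $-\tfrac{1}{2}\log(2\pi) \approx -0.9189385$, which matches the value R returns for `dlnorm(1.0, 0.0, 1.0, TRUE)`.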
Then, create a test file dlognorm-unit.cpp in the tests/gtest subfolder that has a test suite for the dlognorm function:
#include "gtest/gtest.h"
#include "../../src/dlognorm.hpp"
// # R code that generates true values for the test
// dlnorm(1.0, 0.0, 1.0, TRUE) = -0.9189385
// dlnorm(5.0, 10.0, 2.5, TRUE) = -9.07679
namespace {
  // TestSuiteName: dlognormTest; TestName: DoubleInput and IntInput
  // Test dlognorm with double input values
  TEST(dlognormTest, DoubleInput) {
    EXPECT_NEAR( dlognorm(1.0, 0.0, 1.0) , -0.9189385 , 0.0001 );
    EXPECT_NEAR( dlognorm(5.0, 10.0, 2.5) , -9.07679 , 0.0001 );
  }
  // Test dlognorm with integer input values
  TEST(dlognormTest, IntInput) {
    EXPECT_NEAR( dlognorm(1, 0, 1) , -0.9189385 , 0.0001 );
  }
}
`EXPECT_NEAR(val1, val2, absolute_error)` verifies that the difference between `val1` and `val2` does not exceed the absolute error bound `absolute_error`. `EXPECT_NE(val1, val2)` verifies that `val1` is not equal to `val2`. Please see the GoogleTest assertions reference for more `EXPECT_` macros.
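To make the tolerance check concrete, the sketch below mirrors in plain C++ the condition that `EXPECT_NEAR` is documented to verify: the absolute difference between two values must not exceed the bound. The helper `within_abs_error` is a hypothetical name used only for illustration. It is not part of GoogleTest, and real tests should keep using the `EXPECT_` macros.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not a GoogleTest API): true when |val1 - val2| is no
// larger than abs_error, the condition that EXPECT_NEAR verifies.
bool within_abs_error(double val1, double val2, double abs_error) {
  assert(abs_error >= 0.0);  // a negative tolerance can never be satisfied
  return std::fabs(val1 - val2) <= abs_error;
}
```

For example, `within_abs_error(-0.91894, -0.9189385, 0.0001)` is true, while the same comparison with a tolerance of `1e-7` is false because the two values differ by about `1.5e-6`.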
10.2.9 Add tests to tests/gtest/CMakeLists.txt and run a binary
To build the code, add the following contents to the end of the tests/gtest/CMakeLists.txt file:
add_executable(dlognorm_test
  dlognorm-unit.cpp
)
target_include_directories(dlognorm_test
  PUBLIC
  ${CMAKE_SOURCE_DIR}/../
)
target_link_libraries(dlognorm_test
  gtest_main
)
include(GoogleTest)
gtest_discover_tests(dlognorm_test)
The above configuration enables testing in CMake, declares the C++ test binary you want to build (dlognorm_test), and links it to GoogleTest (gtest_main). Now you can build and run your test. Open a command window in the FIMS repo (if not already opened) and type:
cmake -S . -B build -G Ninja
This generates the build system using Ninja as the generator.
Next, in the same command window, use cmake to build:
cmake --build build
Finally, run the tests in the same command window:
ctest --test-dir build
The output when running `ctest` might look like this. Note that there is a failing test:
Internal ctest changing into directory: C:/Users/Kathryn.Doering/Documents/testing/FIMS/build
Test project C:/Users/Kathryn.Doering/Documents/testing/FIMS/build
Start 1: dlognorm.use_double_inputs
1/7 Test #1: dlognorm.use_double_inputs ....... Passed 0.04 sec
Start 2: dlognorm.use_int_inputs
2/7 Test #2: dlognorm.use_int_inputs .......... Passed 0.04 sec
Start 3: modelTest.eta
3/7 Test #3: modelTest.eta .................... Passed 0.04 sec
Start 4: modelTest.nll
4/7 Test #4: modelTest.nll .................... Passed 0.04 sec
Start 5: modelTest.evaluate
5/7 Test #5: modelTest.evaluate ............... Passed 0.04 sec
Start 6: dlognormTest.DoubleInput
6/7 Test #6: dlognormTest.DoubleInput ......... Passed 0.04 sec
Start 7: dlognormTest.IntInput
7/7 Test #7: dlognormTest.IntInput ............***Failed 0.04 sec
86% tests passed, 1 tests failed out of 7
Total Test time (real) = 0.28 sec
The following tests FAILED:
7 - dlognormTest.IntInput (Failed)
Errors while running CTest
Output from these tests are in: C:/Users/Kathryn.Doering/Documents/testing/FIMS/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
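The `dlognormTest.IntInput` failure above comes from integer truncation rather than from the formula itself: when all three arguments are `int`, the template deduces `Type = int`, so `Type(0.5)` becomes `0` and the final result is truncated toward zero. The sketch below reproduces this with a copy of the function that uses `std::` qualified calls and a local constant `kPi` in place of `M_PI` (both are assumptions made so the snippet compiles on any platform). `dlognorm_promoted` is a hypothetical wrapper showing one possible fix, not the approach FIMS has chosen.

```cpp
#include <cassert>
#include <cmath>

// Local stand-in for M_PI, which is not guaranteed by the C++ standard.
const double kPi = 3.14159265358979323846;

// Copy of the dlognorm example, with std:: qualified math calls.
template <class Type>
Type dlognorm(Type x, Type meanlog, Type sdlog) {
  Type resid = (std::log(x) - meanlog) / sdlog;
  Type logres = -std::log(std::sqrt(2 * kPi)) - std::log(sdlog) -
                Type(0.5) * resid * resid - std::log(x);
  return logres;  // with Type = int this is truncated toward zero
}

// Hypothetical wrapper: converts every argument to double before calling
// dlognorm, so integer inputs no longer truncate the result.
template <class Type>
double dlognorm_promoted(Type x, Type meanlog, Type sdlog) {
  assert(x > 0 && sdlog > 0);  // the lognormal density needs x > 0, sdlog > 0
  return dlognorm<double>(static_cast<double>(x),
                          static_cast<double>(meanlog),
                          static_cast<double>(sdlog));
}
```

With these definitions, `dlognorm(1, 0, 1)` returns `0`, which is outside the `0.0001` tolerance around `-0.9189385`, while `dlognorm_promoted(1, 0, 1)` returns approximately `-0.9189385`.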
10.2.10 Debugging a C++ test
There are two ways to debug a C++ test: interactively using `gdb` or via print statements. To use `gdb`, make sure it is installed and on your path.
Debug C++ code (e.g., segmentation error/memory corruption) using gdb:
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Debug
cmake --build build --parallel 16
ctest --test-dir build --parallel 16
gdb ./build/tests/gtest/population_dynamics_population.exe
c // to continue without paging
run // to see which line of code is broken
print this->log_naa // for example, print this->log_naa to see the value of log_naa;
print i // for example, print i from the broken for loop
bt // backtrace
q // to quit
Debug C++ code without using gdb:
- Update the code in a .hpp file by calling `std::ofstream out("file_name.txt");` (this requires `#include <fstream>`).
- Then use `out << variable;` to print out the value of the variable.
- More complex examples with text identifying the quantities:
out <<" fleet_index: "<<fleet_index<<" index_yaf: "<<index_yaf<<" index_yf: "<<index_yf<<"\n";
out <<" population.Fmort[index_yf]: "<<population.Fmort[index_yf]<<"\n";
The output of the print statements will be in the text file you named, e.g., FIMS/build/tests/gtest/debug.txt for a file named debug.txt.
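Putting the print-statement steps together, a minimal self-contained sketch might look like the following. The function name `write_debug_values`, its parameters, and the use of a `std::vector<double>` for `Fmort` are assumptions made for illustration. In FIMS the statements would instead be placed inside the existing .hpp code being debugged.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes labelled values to a text file so they can be
// inspected after the test has run.
void write_debug_values(const std::string& file_name, std::size_t index_yf,
                        const std::vector<double>& Fmort) {
  assert(index_yf < Fmort.size());  // avoid reading past the end
  std::ofstream out(file_name);
  out << " index_yf: " << index_yf << "\n";
  out << " Fmort[index_yf]: " << Fmort[index_yf] << "\n";
}  // the file is flushed and closed when `out` goes out of scope
```

Labelling each value, as in the examples above, makes the file much easier to read once several quantities are written from inside a loop.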
10.2.11 Benchmark example
Google Benchmark measures the real time and CPU time used for running the produced binary. We will continue using the dlognorm.hpp example. Create a benchmark file dlognorm_benchmark.cpp and put it in the tests/gtest subfolder:
#include "benchmark/benchmark.h"
#include "../../src/dlognorm.hpp"
void BM_dlgnorm(benchmark::State& state)
{
  for (auto _ : state)
    dlognorm(5.0, 10.0, 2.5);
}
BENCHMARK(BM_dlgnorm);
This file runs the dlognorm function and uses BENCHMARK to see how long it takes.
A more comprehensive feature overview of benchmarking is available in the Google Benchmark GitHub repository.
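To illustrate the difference between the two quantities Google Benchmark reports, the sketch below times a repeated call by hand using only the standard library: `std::chrono::steady_clock` for real (wall-clock) time and `std::clock` for CPU time. This is not how Google Benchmark is implemented, and the names `Timing`, `time_repeated`, and `work` are hypothetical. It is meant only to show what is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <ctime>

// Average cost of one call, in nanoseconds.
struct Timing {
  double real_ns;  // wall-clock time per iteration
  double cpu_ns;   // processor time per iteration
};

// Stand-in workload; `volatile` keeps the compiler from removing the call.
double work(double x) {
  volatile double result = std::log(x) * std::sqrt(x);
  return result;
}

// Calls work() `iterations` times and reports the average time per call.
Timing time_repeated(long iterations) {
  assert(iterations > 0);
  const std::clock_t cpu_start = std::clock();
  const auto real_start = std::chrono::steady_clock::now();
  for (long i = 0; i < iterations; ++i) {
    work(5.0 + static_cast<double>(i % 10));
  }
  const auto real_end = std::chrono::steady_clock::now();
  const std::clock_t cpu_end = std::clock();

  Timing t;
  t.real_ns =
      std::chrono::duration<double, std::nano>(real_end - real_start).count() /
      iterations;
  t.cpu_ns = 1e9 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC /
             iterations;
  return t;
}
```

On a lightly loaded machine the two numbers are usually close for single-threaded code. Note that on some platforms (notably Windows with MSVC) `std::clock` reports elapsed wall-clock time rather than CPU time, which is one reason to prefer a dedicated library such as Google Benchmark for real measurements.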
10.2.12 Add benchmarks to tests/gtest/CMakeLists.txt and run the benchmark
To build the code, add the following contents to the end of your tests/gtest/CMakeLists.txt file:
FetchContent_Declare(
  googlebenchmark
  URL https://github.com/google/benchmark/archive/refs/tags/v1.6.0.zip
)
FetchContent_MakeAvailable(googlebenchmark)
add_executable(dlognorm_benchmark
  dlognorm_benchmark.cpp
)
target_include_directories(dlognorm_benchmark
  PUBLIC
  ${CMAKE_SOURCE_DIR}/../
)
target_link_libraries(dlognorm_benchmark
  benchmark_main
)
To run the benchmark, open the command line in the FIMS repo (if not already open) and run cmake, sending output to the build subfolder:
cmake -S . -B build -G Ninja
cmake --build build
Then run the dlognorm_benchmark executable created:
./build/tests/gtest/dlognorm_benchmark.exe
The output from `dlognorm_benchmark.exe` might look like this:
Run on (8 X 2112 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x4)
L1 Instruction 32 KiB (x4)
L2 Unified 256 KiB (x4)
L3 Unified 8192 KiB (x1)
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------
BM_dlgnorm 153 ns 153 ns 4480000
10.2.12.1 Remove files produced by this example
If you don’t want to keep any of the files produced by this example and want to completely clear any uncommitted changes and files from the git repo, use
git restore .
to get rid of uncommitted changes in git tracked files. To get rid of all untracked files in the repo, use:
git clean -fd
10.2.13 Clean up after running C++ tests
10.2.13.1 Clean up CMake-generated files and re-run tests
After running the examples above, the build generates files (i.e., the source code, libraries, and executables) and saves them in the `build` subfolder. The example above demonstrates an “out-of-source” build, which puts generated files in a completely separate directory so that the source tree is unchanged after running tests. Using separate source and build trees reduces the need to delete files that differ between builds. If you would still like to delete the CMake-generated files, just delete the `build` folder, and then build and run the tests again by repeating the cmake and ctest commands used above. The `build` folder is listed in the FIMS repository’s .gitignore file, so its files should not be pushed to the FIMS repository.
10.2.13.2 Clean up individual tests
For simple C++ functions like the examples above, we do not need to clean up the tests. Clean up is only necessary in a few situations.
- If memory for an object was allocated during testing and not deallocated, the object needs to be deleted (e.g., `delete object`).
- If you used a test fixture from GoogleTest to use the same data configuration for multiple tests, `TearDown()` can be used to clean up the test, and then the test fixture will be deleted. Please see the GoogleTest user’s guide for more details.
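As an illustration of the first situation, the sketch below contrasts manual `delete` with `std::unique_ptr`, which releases the object automatically when it goes out of scope. The `Tracked` type and the two function names are hypothetical and exist only to count live objects. Using smart pointers is offered as one option, not as an established FIMS convention.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
  static int live;
  Tracked() { ++live; }
  ~Tracked() { --live; }
};
int Tracked::live = 0;

// Manual clean up: the test body must remember to call delete.
int live_after_manual_cleanup() {
  Tracked* object = new Tracked();
  assert(Tracked::live >= 1);
  delete object;  // without this line the object would leak
  return Tracked::live;
}

// Automatic clean up: unique_ptr deletes the object at the end of the scope.
int live_after_scoped_cleanup() {
  {
    std::unique_ptr<Tracked> object(new Tracked());
    assert(Tracked::live >= 1);
  }  // object is destroyed here
  return Tracked::live;
}
```

Both functions return `0` when called on their own, showing that no `Tracked` object outlives the test body in either style. The scoped version stays correct even if an early return is added later.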
10.3 Templates for GoogleTest testing
This section includes templates for creating unit tests and benchmarks. This is the code that would go into the .cpp files in tests/gtest.
10.3.3 tests/gtest/CMakeLists.txt template
These lines are added each time a new test suite (all tests in a file) is added:
# Add test suite 1
add_executable(TestSuiteName1
  test1.cpp
)
target_link_libraries(TestSuiteName1
  gtest_main
)
gtest_discover_tests(TestSuiteName1)
These lines are added each time a new benchmark file is added:
# Add benchmark 1
add_executable(benchmark1
  benchmark1.cpp
)
target_link_libraries(benchmark1
  benchmark_main
)
10.4 R testing
FIMS uses {testthat} for writing R tests. You can install the package by following the instructions on the testthat website. If you are not familiar with testthat, the testing chapter in R Packages gives a good overview of the testing workflow, along with an explanation of the structure and concrete examples.
10.4.1 Testing FIMS locally
To test FIMS R functions interactively and locally, use `devtools::install()` rather than `devtools::load_all()`. This is because using `load_all()` will turn on the debugger, bloating the .o file, and may lead to a compilation error (e.g., `Fatal error: can't write 326 bytes to section .text of FIMS.o: 'file too big' as: FIMS.o: too many sections (35851)`). Note that useful interactive tests should be converted into {testthat} or GoogleTest tests.
10.4.2 Testing using gdbsource
You can interactively debug C++ code using `TMB::gdbsource()` in RStudio. Just add these two lines to the top of the test-fims-estimation.R file
10.4.3 R testthat naming conventions and file organization
- We try to group functions and their helpers together (the “main function plus helpers” approach).
- Always name the test file the same as the R file, but with test- prepended (e.g., `test-myfunction.R` contains testthat tests for the R code in `R/myfunction.R`). This is the convention in the tidyverse style guide.
- testthat tests that are a test of Rcpp should be called `test-rcpp-[description].R`.
- Integration tests which do not have a corresponding .R file should use the convention `test-integration-[description].R`.
10.5 Test case documentation template and examples
A testing plan must be developed while designing (i.e., before coding) new FIMS features or Rcpp modules. Please update the test cases in the FIMS/tests/milestoneX_test_cases.md file (e.g., FIMS/tests/milestone1_test_cases.md). This testing plan is documented using the test case documentation template below.
10.5.1 Test case documentation template
Individual functional or integration test cases will be designed following the template below.
Test ID. Create a meaningful name for the test case.
Features to be tested. Provide a brief statement of test objectives and description of the features to be tested. (Identify the test items following the FIMS software design specification document and identify all features that will not be tested and the rationale for exclusion)
Approach. Specify the approach that will ensure that the features are adequately tested and specify which type of test is used in this case.
Evaluation criteria. Provide a list of expected results and acceptance criteria.
Pass/fail criteria. Specify the criteria used to determine whether each feature has passed or failed testing.
In addition to setting pass/fail criteria with specific tolerance values, documentation that simply presents the outputs of some tests for review may be useful if the tests require additional computations, simulations, and comparisons.
Test deliverables. Identify all information that is to be delivered by the test activity.
- Test logs and automated status reports
10.5.2 Test case documentation examples
10.5.2.1 General test case documentation
The test case documentation below is a general case to apply to many functions/modules. For individual functions/modules, please make detailed test cases for specific options, noting “same as the general test case” where appropriate.
10.5.2.2 Functional test example: TMB probability mass function of the multinomial distribution
Test ID | Probability mass function of the multinomial distribution |
---|---|
Features to be tested |
|
Approach | Functional test
|
Evaluation Criteria |
|
Test deliverables |
|
10.5.2.3 Integration test example: Li et al. 2021 age-structured stock assessment model comparison
Test ID | Age-structured stock assessment comparison (Li et al. 2021) |
---|---|
Features to be tested |
|
Approach | Integration test
|
Evaluation Criteria |
|
Test deliverables |
|
10.5.2.4 Simulation testing: challenges and solutions
One challenge in comparing simulation results is that changes to the order of calls to simulate will change the simulated values. Tests may fail simply because different random numbers are used or because the order of the simulation changes during model development, not because of a coding error. Several solutions could be used to address this simulation testing issue. Please see discussions on the FIMS-planning issue page for details.
- Once we start developing simulation modules, we can use these two ways to compare simulated data from FIMS and a test:
  - Add a TRUE/FALSE parameter in each FIMS simulation module for setting up a testing seed. When testing the module, set the parameter to TRUE to fix the seed number in R and conduct tests.
  - If adding a TRUE/FALSE parameter does not work as expected, then carefully check simulated data from each component and make sure it is not a model coding error.
- FIMS will use `set.seed()` from R to set the seed. The {rstream} package will be investigated if one of the requirements of the FIMS simulation module is to generate multiple streams of random numbers, so that distinct streams can be associated with different sources of randomness. {rstream} was specifically designed to address the need for very long streams of pseudo-random numbers in parallel computations. Please see the rstream paper and RngStreams for more details.
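The ordering problem described above can be demonstrated with any seeded generator. The sketch below uses C++'s `std::mt19937` purely as an illustration (FIMS itself plans to use `set.seed()` from R), and the function names are hypothetical. It shows that a fixed seed reproduces the same draws, but that swapping the order in which two components draw from a shared generator changes the value each component receives.

```cpp
#include <cstdint>
#include <random>
#include <utility>

// Draws one value for component A and one for component B from a generator
// seeded with `seed`. If a_first is false, B draws before A.
std::pair<std::uint32_t, std::uint32_t> simulate(std::uint32_t seed,
                                                 bool a_first) {
  std::mt19937 rng(seed);
  std::uint32_t a = 0, b = 0;
  if (a_first) {
    a = rng();
    b = rng();
  } else {
    b = rng();
    a = rng();
  }
  return std::make_pair(a, b);
}

// True when two runs with the same seed and the same call order agree.
bool same_seed_reproduces(std::uint32_t seed) {
  return simulate(seed, true) == simulate(seed, true);
}

// True when changing only the call order changes component A's value.
bool order_changes_values(std::uint32_t seed) {
  return simulate(seed, true).first != simulate(seed, false).first;
}
```

A test that compares component A's simulated value against a stored reference would therefore fail after such a reordering even though neither component contains a coding error, which is why the seed handling and the comparison strategy need to be designed together.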