A review of statistical computing with TMB

# A review of statistical computing with TMB
## TMB Training Session I
### Andrea Havron, NOAA Fisheries, OST

---

.footnote[U.S. Department of Commerce | National Oceanic and Atmospheric Administration | National Marine Fisheries Service]


---
# Where to get everything

.column-60[
<img src="data:image/png;base64,#static/tmb_training_screenshot.png" width="100%" style="display: block; margin: auto auto auto 0;" />
]

.column-40[
All slides and code are at: [NOAA-FIMS/TMB_training](https://github.com/NOAA-FIMS/TMB_training)
<img src="data:image/png;base64,#static/SessionI_screenshot.png" width="90%" style="display: block; margin: auto auto auto 0;" />
]
---
# Session I Agenda

**Day 1**: 
- Review of statistical computing with TMB
- TMB Basics with Linear Regression live coding example
- Debugging

**Day 2**: 
- Dealing with parameters
- Modular TMB with Multinomial live coding example
- Debugging

---
# Statistical Computing Review

---
# What is AD?

.three-column-left[
**Automatic Differentiation** 
Derivatives calculated automatically using the chain rule 
.p[
- Efficient: forward mode, O(n); reverse mode, O(m), for a function with n inputs and m outputs
- Accurate 
- Higher order derivatives: easy
]]
.three-column-left[
**Symbolic Differentiation** 
Computer program converts function into exact derivative function 
.p[
- Inefficient: O(2^n) for n input variables
- Exact 
- Higher order derivatives: difficult due to complexity
]]
.three-column-left[
**Numerical Differentiation** 
Approximation that relies on finite differences
.p[
- Efficient: O(n) 
- Trade-off between truncation error versus round-off error
- Higher order derivatives: difficult due to error accumulation
]]
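
The truncation versus round-off trade-off for numerical differentiation can be sketched with a forward finite difference of sin(x), whose exact derivative is cos(x). This is a minimal illustration only; the helper names `forward_diff` and `fd_error` are hypothetical and not part of TMB.

```cpp
#include <cmath>

// Forward finite-difference approximation of d/dx sin(x) with step h.
double forward_diff(double x, double h) {
  return (std::sin(x + h) - std::sin(x)) / h;
}

// Absolute error of the approximation against the exact derivative cos(x).
double fd_error(double x, double h) {
  return std::fabs(forward_diff(x, h) - std::cos(x));
}
```

At x = 1, a large step such as h = 1e-2 is dominated by truncation error and a tiny step such as h = 1e-15 is dominated by round-off, so an intermediate step such as h = 1e-8 gives a smaller error than either.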

---
# Computational Graph (Tape)
 
.three-column-left[

```cpp
//Program
v1: x = ?
v2: y = ?
v3: a = x * y
v4: b = sin(y)
v5: z = a + b
```
]
.three-column[
<img src="data:image/png;base64,#static/comp-graph.png" width="100%" style="display: block; margin: auto auto auto 0;" />
]
.three-column-left[

```cpp
//Reverse Mode
dz = ?
da = dz
db = dz
dx = y * da
dy = x * da + cos(y) * db
```
]
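
The reverse sweep for z = x * y + sin(y) can be written out by hand as a sketch. It assumes the output adjoint is seeded with dz = 1, which yields the gradient of z itself; the struct `Gradient` and function `reverse_sweep` are hypothetical names used only here.

```cpp
#include <cmath>

// Value and gradient of z = x * y + sin(y).
struct Gradient {
  double z;   // function value
  double dx;  // dz/dx
  double dy;  // dz/dy
};

Gradient reverse_sweep(double x, double y) {
  // Forward pass: evaluate every node of the graph.
  double a = x * y;
  double b = std::sin(y);
  double z = a + b;

  // Reverse pass: seed the output, then push adjoints back to the inputs.
  double dz = 1.0;
  double da = dz;                         // from z = a + b
  double db = dz;                         // from z = a + b
  double dx = y * da;                     // from a = x * y
  double dy = x * da + std::cos(y) * db;  // from a = x * y and b = sin(y)
  return {z, dx, dy};
}
```

One forward pass plus one reverse pass returns both partial derivatives, which is why reverse mode suits functions with many inputs and a single output such as a likelihood.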

---
# Reverse Mode
.pull-left[
**Static (TMB: CppAD, TMBad)** 
The graph is constructed once before execution
.p[
- Less flexibility with conditional statements that depend on parameters
- Atomic functions can be used when conditional statements depend on parameters
- High portability 
- Graph optimization possible
]]

.pull-right[
**Dynamic (Stan: Stan Math Library, ADMB: AUTODIF)** 
The graph is rebuilt at every iteration as the forward computation executes
.p[
- Flexibility with conditional statements
- Optimization routine is built into the executable
- Less time for graph optimization
]
]
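
Why a static graph is less flexible with parameter-dependent conditionals can be shown with a toy example. This is not how CppAD or TMBad store a tape internally; `piecewise` and `record_at` are hypothetical names, and the "tape" is just a stored function standing in for the recorded operations.

```cpp
#include <functional>

// The model as written: the branch depends on the parameter x.
double piecewise(double x) {
  if (x > 0) return x * x;
  return -x;
}

// Toy "static tape": the branch is decided once, at recording time,
// and only the operations on the chosen path are stored.
std::function<double(double)> record_at(double x0) {
  if (x0 > 0) {
    return [](double x) { return x * x; };  // tape holds the x > 0 path
  }
  return [](double x) { return -x; };       // tape holds the x <= 0 path
}
```

Recording at x0 = 2 and replaying at x = -3 returns 9, whereas `piecewise(-3)` returns 3: the recorded path no longer matches the branch the model would take at the new parameter value.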

---

# Type Systems in R and C++
---
# Dynamic vs. Static Typing
 
.pull-left[
**R: Dynamic**
.p[
- Type checking occurs at run time
- The values and types associated with names can change
- Change in type tends to be implicit
]]

.pull-right[
**C++: Static**
.p[
- Type checking occurs at compile time
- The values associated with a given name can be limited to just a few types and may be unchangeable
- Change in type tends to be explicit
]]

---
# Changing Type Declaration
 
.pull-left[
**R: Dynamic**

```r
a <- 1
a <- "hello"
a <- function(x) x^2
a <- environment()
```

```r
`+` <- `-`
1 + 1
```

```
## [1] 0
```

]

.pull-right[
**C++: Static**

```cpp
#include <iostream>
#include <string>

int main() {
    double a = 1.1;
    std::string a = "hello";
    return 0;
}
```

```bash
g++ ../src/lec1a.cpp -o a.exe
```
error in engine(options) : lec1a.cpp: In function 'int main()': lec1a.cpp:7:17: error: **conflicting declaration**
]

---
# Explicit Type Conversion
 
.pull-left[
**R: Dynamic**

```r
a <- 1
b <- "hello"
a + b
```
Error in a + b : non-numeric argument to binary operator

```r
a <- 1
b <- "hello"
c <- as.numeric(as.factor(b))
a + c
```

```
## [1] 2
```
]

.pull-right[
**C++: Static**

```cpp
#include <iostream>
#include <string>

int main() {
 int x = 1;
 std::string a = "a";
 std::string b = std::to_string(x);
 std::cout << a + b;
 return 0;
}
```

```bash
g++ ../src/lec1b.cpp -o b.exe
b.exe
```
a1
]
---
# Implicit Type Conversion
.pull-left[
**R: Dynamic**

```r
a <- 1
b <- "hello"
c(a,b)
```

```
## [1] "1"     "hello"
```
]

.pull-right[
**C++: Static**

```cpp
#include <iostream>
#include <string>

int main() {
 double x = 1.1;
 int y = x;
 std::cout << "x = " << x << "; y = " << y << std::endl;
 return 0;
}
```

```bash
g++ ../src/lec1c.cpp -o c.exe
c.exe
```
x = 1.1; y = 1 
]

---
# What is Templated C++?
 
* Generic programming
* Allows developer to write functions and classes that are independent of Type
* Templates are expanded at compile time

.pull-left[

```cpp
template <class T>
T add(T x, T y){
  return x + y;
}

int main(){
  int a = 1;
  int b = 2;
  double c = 1.1;
  double d = 2.1;
  int e = add(a, b);     // instantiates add<int>
  double f = add(c, d);  // instantiates add<double>
  return 0;
}
```
]
.pull-right[

```cpp
int add(int x, int y){
  return x + y;
}
double add(double x, double y){
  return x + y;
}
```
]

---
# Setting up Templated C++

```cpp
template <class T>
T add(T x, T y){
 return x + y;
}
```

```cpp
template <typename Type>
Type add(Type x, Type y){
 return x + y;
}
```
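
Templating matters for AD because a function written once against a generic `Type` can be evaluated with plain doubles or with a type that also carries derivatives. The sketch below uses a hypothetical forward-mode `Dual` struct as a stand-in for an AD type; it is an assumption for illustration and not one of the types TMB provides.

```cpp
// Hypothetical forward-mode AD type: a value and its derivative.
struct Dual {
  double val;
  double der;
};

Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }

// Product rule: (ab)' = a'b + ab'.
Dual operator*(Dual a, Dual b) {
  return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Written once against a generic Type: f(x) = x^3 + x.
template <typename Type>
Type poly(Type x) {
  return x * x * x + x;
}
```

Calling `poly(2.0)` returns the value 10, while `poly(Dual{2.0, 1.0})` returns the same value together with the derivative 3 * 2^2 + 1 = 13, without changing the body of `poly`.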

---
# TMB AD Systems
 
.pull-left[
**CppAD**
- [CppAD package](https://coin-or.github.io/CppAD/doc/cppad.htm)
]
.pull-right[
**TMBad** 
- TMBad is available with TMB 1.8.0 and higher
]

---
# TMB AD Types

**CppAD** 
From [FIMS/inst/include/common/def.hpp](https://github.com/NOAA-FIMS/FIMS/blob/24458ce4cae439dbb4917a013a35cac5cc11b592/inst/include/common/def.hpp#L17)

```cpp
#ifdef TMB_MODEL
// simplify access to singletons
#define TMB_FIMS_REAL_TYPE double
#define TMB_FIMS_FIRST_ORDER AD<TMB_FIMS_REAL_TYPE>
#define TMB_FIMS_SECOND_ORDER AD<TMB_FIMS_FIRST_ORDER>
#define TMB_FIMS_THIRD_ORDER AD<TMB_FIMS_SECOND_ORDER>
#endif
```

def | CppAD Type | Value | Used to evaluate
----|------------|-------|-----------------
`TMB_FIMS_REAL_TYPE double` | `<double>` | likelihood |
`TMB_FIMS_FIRST_ORDER AD<TMB_FIMS_REAL_TYPE>` | `AD<double>` | 1st derivative | MLE
`TMB_FIMS_SECOND_ORDER AD<TMB_FIMS_FIRST_ORDER>` | `AD<AD<double>>` | 2nd derivative | Variance
`TMB_FIMS_THIRD_ORDER AD<TMB_FIMS_SECOND_ORDER>` | `AD<AD<AD<double>>>` | 3rd derivative | Laplace approximation
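
The idea that nesting an AD type inside itself yields higher-order derivatives can be sketched with a hypothetical templated `Dual<T>` type. This is a forward-mode stand-in chosen for brevity, not CppAD's `AD<>` class, and all names below are assumptions made for the example.

```cpp
// Hypothetical forward-mode AD type, templated so it can be nested.
template <typename T>
struct Dual {
  T val;  // value
  T der;  // derivative
};

template <typename T>
Dual<T> operator+(Dual<T> a, Dual<T> b) {
  return {a.val + b.val, a.der + b.der};
}

// Product rule: (ab)' = a'b + ab'.
template <typename T>
Dual<T> operator*(Dual<T> a, Dual<T> b) {
  return {a.val * b.val, a.der * b.val + a.val * b.der};
}

template <typename Type>
Type cube(Type x) { return x * x * x; }

// Second derivative of x^3 at x0 using Dual<Dual<double>>.
double second_derivative_of_cube(double x0) {
  using First = Dual<double>;
  using Second = Dual<First>;
  Second x{First{x0, 1.0}, First{1.0, 0.0}};  // seed both levels
  Second y = cube(x);
  return y.der.der;  // derivative of the derivative: 6 * x0
}
```

Each extra level of nesting carries one more order of derivative through the same templated function, which mirrors the `AD<AD<double>>` pattern in the table.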

---
# TMB AD Types

```cpp
#ifdef TMB_MODEL
#ifdef TMBAD_FRAMEWORK
#define TMBAD_TYPE TMBad::ad_aug 
#endif
#endif
```

**TMBad is available with TMB 1.8.0 and higher**
---
class: middle

# Likelihood Review
---
# ML Inference