Uncertainty and Bias Correction

class: center, middle, inverse, title-slide

# Uncertainty and Bias Correction
## TMB Training Session II
### Andrea Havron NOAA Fisheries, OST

---

layout: true

.footnote[U.S. Department of Commerce | National Oceanic and Atmospheric Administration | National Marine Fisheries Service]

code.cpp{
  font-size: 14px;
}
code.r{
  font-size: 14px;
}

</style>

---
# Spatial Model via GLMM

.three-column[
For a set of species weights
`\begin{align}
  y_{i} &\sim f(g^{-1}(\eta_{i})) \\
  \eta_{i} &= \beta_{0} + \omega_{i} \\
  \omega &\sim MVN(0,\Sigma) \\
  &\\
  L(\omega) &= \frac{|\Sigma|^{-1/2}}{\sqrt{(2\pi)^{n}}}exp\Big(-\frac{1}{2}\omega^{T}\Sigma^{-1}\omega\Big)\\
  &\\
  \Sigma &= \sigma^{2}_{\omega}matern(\phi, \nu=1)
\end{align}`
]

.three-column[
<img src="data:image/png;base64,#static/MatCovfun1.png" width="80%" style="display: block; margin: auto;" />

]

.three-column[
<img src="data:image/png;base64,#static/spatial-map.png" width="85%" style="display: block; margin: auto auto auto 0;" />

]
---
# Tweedie distributed data

.three-column[
For a set of species weights
`\begin{align}
  y_{i} &\sim Tweedie(\mu_{i}, \phi, power) \\
  \mu_{i} &= exp(\eta_{i})\\
  \eta_{i} &= \beta_{0} + \omega_{i} \\
  \omega &\sim MVN(0,\Sigma) \\
  &\\
  L(\omega) &= \frac{|\Sigma|^{-1/2}}{\sqrt{(2\pi)^{n}}}exp\Big(-\frac{1}{2}\omega^{T}\Sigma^{-1}\omega\Big)\\
  &\\
  \Sigma &= \sigma^{2}_{\omega}matern(\phi, \nu=1)
\end{align}`
]

.three-column[
<img src="data:image/png;base64,#static/tweedie-hist.png" width="80%" style="display: block; margin: auto 0 auto auto;" />
]
.three-column[
<img src="data:image/png;base64,#static/tweedie-map.png" width="80%" style="display: block; margin: auto auto auto 0;" />

]
---
#Estimating Uncertainty

Uncertainty estimates can be calculated in several ways with TMB
 * Asymptotic approximation + delta method ([TMB::sdreport](https://rdrr.io/cran/TMB/man/sdreport.html))
 - Asymptotics: 
 `$Var[\theta] \approx -\frac{1}{l''(\theta)}$`
 
 - Delta Method: 
 `$g(\theta)$`: some derived function of $\theta 
 `$Var[g(\theta)] \approx g'(\theta)^2var(\theta)$`

* Likelihood profiles ([TMB::tmbprofile](https://rdrr.io/cran/TMB/man/tmbprofile.html))

---
#TMB::sdreport
#### [TMB::as.list.sdreport](https://rdrr.io/cran/TMB/man/as.list.sdreport.html)

.pull-left[

```r
sdr <- sdreport(obj)
df <- data.frame(
 omega.est = as.list(sdr, "Estimate")$omega,
 omega.se = as.list(sdr, "Std. Error")$omega,
 conf.lower = omega.est - qnorm(.975)*omega.se,
 conf.upper = omega.est + qnorm(.975)*omega.se)
ggplot(df, aes(x = 1:400, y = omega.est)) +
 geom_point() +
 geom_ribbon(aes(x = 1:400, ymin = conf.lower, 
 ymax = conf.upper, fill = "band"), 
 alpha = 0.3) +
 scale_fill_manual("", values = "grey20") +
 ylab("omega") + xlab("") +
 theme_classic() + 
 theme(legend.position = "none")
```
]

.pull-right[
<img src="data:image/png;base64,#static/omega-confint.png" width="75%" style="display: block; margin: auto auto auto 0;" />

]
---
# Lkelihood Profiles
 
.pull-left[
* Does not make the asymptotic normal assumption
* Refit model at series of fixed values for parameter of interest 
* Calculate likelihood ratio tests between optimum and re-evaluated likelihoods
* The 95% LRCI is the point at which the log-likelihoods decreases by 1.92
* A good (but slower) approach that works better when likelihood profile is asymmetric
`\begin{align}
LRT &\sim \chi^2_{df=1, 0.95}\\ 
\chi^{2}_{1,0.95} &\approx 3.84\\
3.84/2 &= 1.92
\end{align}`
]
.pull-right[
<img src="data:image/png;base64,#static/millar-profile.png" width="100%" style="display: block; margin: auto auto auto 0;" />
.small[source: Millar, R.B. 2011. Maximum likelihood estimation and inference, pg. 59]
]

---
# Likelihood Profiles
 
.pull-left[
* Easy to do in TMB:

```r
#profile beta0
prof <- TMB::tmbprofile(obj, "ln_sigma2")
plot(prof)
confint(prof)
```
* plot(prof) and confint(prof) plot and give 95% range
* This interval may not be symmmetic nor finite
]
.pull-right[
<img src="data:image/png;base64,#static/ln_sigma2_profile.png" width="100%" style="display: block; margin: auto;" />
]

---

#Checking the Laplace Approximation
* For mixed effects models, TMB allows one to test whether the the Laplace approximation of the marginal log-likelihood and joint log-likelihood is ok.

`\begin{align}
E [\nabla_{θ}f(y,u;\theta)] = 0\\
E [\nabla_{θ}  \int f(y,u))du ] = 0
\end{align}`
* Simulate data and calculate gradients `$(\nabla_{\theta})$` for each simulated data set
* Average gradient should be `$0$` when the Laplace approximation is ok

```r
check <- TMB::checkConsistency(obj)
summary(check)
```
* performs `$\chi^2$` test for gradient bias for marginal and joint
* provides estimates of bias for fixed effects parameters
* increasing sample sizes will increase power of `$\chi^2$` test for (small) bias