class: center, middle, inverse, title-slide # Uncertainty and Bias Correction ## TMB Training Session II ### Andrea Havron
NOAA Fisheries, OST --- layout: true .footnote[U.S. Department of Commerce | National Oceanic and Atmospheric Administration | National Marine Fisheries Service] <style type="text/css"> code.cpp{ font-size: 14px; } code.r{ font-size: 14px; } </style> --- # Spatial Model via GLMM <br> .three-column[ For a set of species weights `\begin{align} y_{i} &\sim f(g^{-1}(\eta_{i})) \\ \eta_{i} &= \beta_{0} + \omega_{i} \\ \omega &\sim MVN(0,\Sigma) \\ &\\ L(\omega) &= \frac{|\Sigma|^{-1/2}}{\sqrt{(2\pi)^{n}}}exp\Big(-\frac{1}{2}\omega^{T}\Sigma^{-1}\omega\Big)\\ &\\ \Sigma &= \sigma^{2}_{\omega}matern(\phi, \nu=1) \end{align}` ] .three-column[ <img src="data:image/png;base64,#static/MatCovfun1.png" width="80%" style="display: block; margin: auto;" /> ] .three-column[ <img src="data:image/png;base64,#static/spatial-map.png" width="85%" style="display: block; margin: auto auto auto 0;" /> ] --- # Tweedie distributed data <br> .three-column[ For a set of species weights `\begin{align} y_{i} &\sim Tweedie(\mu_{i}, \phi, power) \\ \mu_{i} &= exp(\eta_{i})\\ \eta_{i} &= \beta_{0} + \omega_{i} \\ \omega &\sim MVN(0,\Sigma) \\ &\\ L(\omega) &= \frac{|\Sigma|^{-1/2}}{\sqrt{(2\pi)^{n}}}exp\Big(-\frac{1}{2}\omega^{T}\Sigma^{-1}\omega\Big)\\ &\\ \Sigma &= \sigma^{2}_{\omega}matern(\phi, \nu=1) \end{align}` ] .three-column[ <img src="data:image/png;base64,#static/tweedie-hist.png" width="80%" style="display: block; margin: auto 0 auto auto;" /> ] .three-column[ <img src="data:image/png;base64,#static/tweedie-map.png" width="80%" style="display: block; margin: auto auto auto 0;" /> ] --- #Estimating Uncertainty<br><br> Uncertainty estimates can be calculated in several ways with TMB * Asymptotic approximation + delta method ([TMB::sdreport](https://rdrr.io/cran/TMB/man/sdreport.html)) - Asymptotics: <br> `\(Var[\theta] \approx -\frac{1}{l''(\theta)}\)` <br> - Delta Method: <br> `\(g(\theta)\)`: some derived function of $\theta <br> `\(Var[g(\theta)] \approx g'(\theta)^2var(\theta)\)` * Likelihood profiles ([TMB::tmbprofile](https://rdrr.io/cran/TMB/man/tmbprofile.html)) --- #TMB::sdreport #### [TMB::as.list.sdreport](https://rdrr.io/cran/TMB/man/as.list.sdreport.html) .pull-left[ ```r sdr <- sdreport(obj) df <- data.frame( omega.est = as.list(sdr, "Estimate")$omega, omega.se = as.list(sdr, "Std. Error")$omega, conf.lower = omega.est - qnorm(.975)*omega.se, conf.upper = omega.est + qnorm(.975)*omega.se) ggplot(df, aes(x = 1:400, y = omega.est)) + geom_point() + geom_ribbon(aes(x = 1:400, ymin = conf.lower, ymax = conf.upper, fill = "band"), alpha = 0.3) + scale_fill_manual("", values = "grey20") + ylab("omega") + xlab("") + theme_classic() + theme(legend.position = "none") ``` ] .pull-right[ <img src="data:image/png;base64,#static/omega-confint.png" width="75%" style="display: block; margin: auto auto auto 0;" /> ] --- # Lkelihood Profiles <br> .pull-left[ * Does not make the asymptotic normal assumption * Refit model at series of fixed values for parameter of interest * Calculate likelihood ratio tests between optimum and re-evaluated likelihoods * The 95% LRCI is the point at which the log-likelihoods decreases by 1.92 * A good (but slower) approach that works better when likelihood profile is asymmetric `\begin{align} LRT &\sim \chi^2_{df=1, 0.95}\\ \chi^{2}_{1,0.95} &\approx 3.84\\ 3.84/2 &= 1.92 \end{align}` ] .pull-right[ <img src="data:image/png;base64,#static/millar-profile.png" width="100%" style="display: block; margin: auto auto auto 0;" /> .small[source: Millar, R.B. 2011. Maximum likelihood estimation and inference, pg. 59] ] --- # Likelihood Profiles <br> .pull-left[ * Easy to do in TMB: ```r #profile beta0 prof <- TMB::tmbprofile(obj, "ln_sigma2") plot(prof) confint(prof) ``` * plot(prof) and confint(prof) plot and give 95% range * This interval may not be symmmetic nor finite ] .pull-right[ <img src="data:image/png;base64,#static/ln_sigma2_profile.png" width="100%" style="display: block; margin: auto;" /> ] --- #Checking the Laplace Approximation * For mixed effects models, TMB allows one to test whether the the Laplace approximation of the marginal log-likelihood and joint log-likelihood is ok. `\begin{align} E [\nabla_{θ}f(y,u;\theta)] = 0\\ E [\nabla_{θ} \int f(y,u))du ] = 0 \end{align}` * Simulate data and calculate gradients `\((\nabla_{\theta})\)` for each simulated data set * Average gradient should be `\(0\)` when the Laplace approximation is ok ```r check <- TMB::checkConsistency(obj) summary(check) ``` * performs `\(\chi^2\)` test for gradient bias for marginal and joint * provides estimates of bias for fixed effects parameters * increasing sample sizes will increase power of `\(\chi^2\)` test for (small) bias