class: center, middle, inverse, title-slide # TMB Model Validation ## TMB Training Session II ### Andrea Havron
NOAA Fisheries, OST --- layout: true .footnote[U.S. Department of Commerce | National Oceanic and Atmospheric Administration | National Marine Fisheries Service] <style type="text/css"> code.cpp{ font-size: 14px; } code.r{ font-size: 14px; } .remark-slide-content > h1 { font-size: 40px; } .remark-slide-content > h3 { font-size: 25px; text-align: left !important; margin-left: 235px } </style> --- # Pearson residuals <br><br> .pull-left-narrow[ `$$r_{i} = \frac{Y_{i} - E[Y_{i}]}{\sqrt{Var[Y_{i}]}}$$` ] .pull-right-wide[ **Latent Variable Model Issues**: - Assumes the mean-variance relationship is equal to 1 - Correctly specified models invalidated more than expectecd - Difficult to assess misspecified variance structure (ie. overdispersion, heteroscadacity) ] --- # Quantile Residuals .pull-left[ <br> `\(r_i = \phi^{-1}\big(F(Y_i,\Theta)\big)\)` <br><br> - `\(F(Y,\Theta)\)`: cdf of `\(f(y,\Theta)\)` - `\(\phi^{-1}\)`: cdf inverse of the standard normal distribution <br><br> ] .pull-right[ <img src="data:image/png;base64,#static/dgamma2dnorm.png" width="80%" style="display: block; margin: auto;" /> ] <br> Dunn & Smyth, 1996 --- # Quantile Residuals and GLMMs <br> `$$L(\theta) = \int_{\mathbb{R}}f(y;u,\theta)f(u;\theta)du$$` .large[ .p[ Considerations: 1. Closed form solution to cdf not always available 2. Need to account for correlation structure in residuals ] ] --- # Approximations to the quantile resdiual <br> .p[ .pull-left[ **TMB residual methods** - Analytical Calculations - Laplace approximations to the cdf - Bayesian simulations - Methods: Full gaussian, one-step gaussian, one-step generic, cdf, MCMC via tmbstan <br><br> Kristensen et al., 2016 <br> Thygesen, U. et al., 2017 <br> Monnahan & Kristensen, 2018 ] .pull-right[ **DHARMa** - Simulation-based approach - Emiprical cdf - Methods: Conditional, Unconditional <br><br><br><br> Hartig, F., 2022 ] ] --- # Two levels of residual calculation <br> **Unconditional residuals ** - Validate the data and random effect model simultaneously **Conditional residuals** - Validate the data model conditional on random effect MLEs --- # TMB Methods <br><br> |Method |Definition | |:---------------|:------------------------------------------------------------------------------------| |FullGaussian |Best approach when data and random effects are both normally distributed | |oneStepGaussian |Most efficient one-step method when data and random effects are approximately normal | |cdf |One-step method that does not require normality but does require a closed form cdf | |oneStepGeneric |One-step method useful when no closed form cdf but slow | * The cdf and oneStepGeneric methods are the only methods available for discrete data: randomized quantile residuals * oneStepGeneric is dependent on Laplace approximations of likelihoods, so probably important to use TMB::checkConsistency. * OSA residuals are challenging for some distributions (eg. multinomial, delta-lognormal, tweedie). --- # Full Gaussian Method<br> Analytical calculation that assumes joint distribution of data and random effects is Gaussian. Applies rotation to transform to univariate space. .pull-left-wide[ **Correlated MNV** <img src="data:image/png;base64,#static/demo_pairs1.png" width="75%" style="display: block; margin: auto;" /> ] .pull-right-narrow[ <br><br> `\begin{align} y &\sim MVN(0, \Sigma) \\ \end{align}` ] --- # Full Gaussian Method<br> Analytical calculation that assumes joint distribution of data and random effects is Gaussian. Applies rotation to transform to univariate space. .pull-left-wide[ **Scale to unit variance** <img src="data:image/png;base64,#static/demo_pairs2.png" width="75%" style="display: block; margin: auto;" /> ] .pull-right-narrow[ <br><br> `\begin{align} y &\sim MVN(0, \Sigma) \\ r &= y / sqrt(diag(\Sigma)) \end{align}` ] --- # Full Gaussian Method<br> Analytical calculation that assumes joint distribution of data and random effects is Gaussian. Applies rotation to transform to univariate space. .pull-left-wide[ **Cholesky Transformation** <img src="data:image/png;base64,#static/demo_pairs3.png" width="75%" style="display: block; margin: auto;" /> ] .pull-right-narrow[ <br><br> `\begin{align} y &\sim MVN(0, \Sigma) \\ L &= cholesky(\Sigma)^{T} \\ r &= L^{-1}y \end{align}` ] --- # TMB One-Step Methods <br> Steps through the data calculating quantile residuals for observation `\(i\)` given all previous observations: `\(r_{i} = \phi^{-1}(F(Y_{i}|Y_{1},\ldots, Y_{i-1},\Theta))\)`<br> `\(\phi^{-1}\)`: inverse cdf of the standard normal<br> `\(F(Y,\Theta)\)`: cdf of `\(f(Y,\Theta)\)` * These residuals are independent standard normal when the model is appropriate for the data * For non-random effects models these are equivalent to more familiar quantile residuals used with GLMs because `\(F(Y_{i}|Y_{1},\ldots, Y_{i-1},\Theta) = F(Y_{i}|\Theta)\)` |Method |Definition | |:---------------|:------------------------------------------------------------------------------------| |oneStepGaussian |Most efficient one-step method when data and random effects are approximately normal | |cdf |One-step method that does not require normality but does require a closed form cdf | |oneStepGeneric |One-step method useful when no closed form cdf but slow | --- # TMB One-Step Methods <br> C++ code needs to be modified: ```cpp DATA_VECTOR(y); DATA_VECTOR_INDICATOR(keep, y); ... for(int i=0; i < N.size(); i++){ nll -= keep[i] * dpois(y[i], exp(eta[i]), true); /* For cdf OSA residuals only: */ Type cdf = squeeze( ppois(y[i], exp(eta[i])) ); nll -= keep.cdf_lower[i] * log( cdf ); // NaN protected nll -= keep.cdf_upper[i] * log( 1.0 - cdf ); // NaN protected } ``` --- # TMB MCMC Method<br> Bayesian simulation approach .pull-left[ **Simplified algorithm** .p[ 1. Map fixed parameters to their MLEs 2. Create a new object 3. Draw a single posterior from an MCMC chain 4. Use the posterior random vector to recalculate the expected value and plug into cdf calculations 5. Relies on conditional independence rather than rotation ] ] .pull-right[ <br> ```r obj2 <- MakeADFun(data, MLEs, map) fit <- tmbstan(obj2, iter = warmup+1) sample <- extract(fit)$u mu <- beta0 + u r <- qnorm(y, mu, sd) ``` ] --- # DHARMa method .pull-left[ <br><br> |Method |Definition | |:------------------|:------------------------------------------------------------------| |Conditional ecdf |Simulate new observations conditional on the fitted random effects | |Unconditional ecdf |Simulate new observations given new simulated random effects | ] .pull-right[ <img src="data:image/png;base64,#static/dharma-ecdf.png" width="100%" style="display: block; margin: auto;" /> .small[source: Florian Hartig, DHARMa package] ] --- #Simulation Study<br> <br><br> .p[ For each iteration(i=500): 1. Data (n=100) were simulated. 2. Data were fit to the correctly specified (h0) and mis-specified (h1) models 3. TMB and DHARMa residuals were calculated for h0 and h1. 4. Step 3 was repeated using true parameter values to calculate theoretical residuals. 5. GOF p-values were calculated using the Kolmogorov-Smirnov test. ] Distribution of GOF p-values: - Uniform under simulation of the correctly specified model - Skewed towards 1 under simulation of mis-specified model --- # Linear Regression example<br> When models do not contain random effects, quantile residual approximations from a normal model should collapse to Pearson .pull-left[ `\begin{align} \mu &= X\beta\\ y &\sim N(\mu, \sigma) \end{align}` ] .pull-right[ <img src="data:image/png;base64,#static/lr-dharma.png" width="100%" style="display: block; margin: auto;" /><img src="data:image/png;base64,#static/lr-osa.png" width="100%" style="display: block; margin: auto;" /> ] --- # Linear Regression example<br> When models do not contain random effects, quantile residual approximations from a normal model should collapse to Pearson .pull-left[ **Mis-specified model**:<br> `\begin{align} OM:&\\ y &= X\beta + \epsilon\\ \epsilon &\sim LN(0,1)\\ EM:&\\ y &= X\beta + \epsilon\\ \epsilon &\sim N(0,1) \end{align}` ] .pull-right[ <img src="data:image/png;base64,#static/lr-dharma-h1.png" width="100%" style="display: block; margin: auto;" /><img src="data:image/png;base64,#static/lr-osa-h1.png" width="100%" style="display: block; margin: auto;" /> ] --- #Random walk example <br> .p[ .pull-left-extra-narrow[ **Model**: `\begin{align} \mu_{i} &= u_{i-1} + a \\ u_{i} &\sim N(\mu_{i},\tau) \\ y_{i} &\sim N(u_{i}, \sigma) \end{align}` ] .pull-right-extra-wide[ .pull-left-narrow[ **Parameters**: `\begin{align} a &= 0.75 \\ \tau &= 1 \\ \sigma &= 1 \end{align}` ] .pull-right-wide[ **Mispecification**: 1. Data simulated with drift term, a 2. Data fit to model without drift term ] ] ] --- # Unconditional Level .large[**Validate the data and random effect model simultaneously**] <br> .pull-left-narrow[ **TMB** - **Full Gaussian** ] .pull-right-wide[ <br><br> Analytical calculation of the quantile residual of a hierarchical model `\(y,u \sim MVN(\mu, \Sigma)\)` ```r L <- t(chol(Sigma)) mode <- y - mu r <- solve(L, mode) #TMB model obj <- MakeADFun(data, parameters, random = "u") opt <- nlminb(obj$par, obj$fn, obj$gr) #TMB residual calculation oneStepPredict(obj, observation.name = "y", method = "fullGaussian") ``` ] --- # Unconditional Level .large[**Validate the data and random effect model simultaneously**] <br> .pull-left-narrow[ **TMB** - Full Gaussian method - **MCMC** ] .pull-right-wide[ <br><br> Draw random effects from a posterior MCMC chain ```r #TMB model obj <- MakeADFun(data, parameters, random = "u") opt <- nlminb(obj$par, obj$fn, obj$gr) #Fix MLES and draw one posterior obj2 <- MakeADFun(..., map = map) sample <- extract( tmbstan(obj2, iter = warmup+1) )$u #residuals using posterior u Fy <- pnorm(y, u, sigma) r <- qnorm(Fy) ``` ] --- # Unconditional Level .large[**Validate the data and random effect model simultaneously**] <br> .pull-left-narrow[ TMB - Full Gaussian method - MCMC **DHARMa** - **Unconditional** ] .pull-right-wide[ <br><br> Empircal cdf approximation using simulated random effects and rotated data (optional) ```r #Simulate new random effects and data use MLEs u.sim <- rnorm(mu, tau) y.sim <- rnorm(u.sim, sigma) #cholesky decomposition based on estimated Sigma L <- t(chol(Sigma)) #rotate simulated data and observations y.sim <- apply(y.sim, 2, function(x) solve(L,x)) y <- solve(L, y) #empirical cdf on rotated data and simulation r <- ecdf(y.sim)(y) ``` ] --- # Unconditional Level .large[**Validate the data and random effect model simultaneously**] <br> .pull-left-narrow[ TMB - Full Gaussian method - MCMC **DHARMa** - **Unconditional** ] .pull-right-wide[ <br><br> Empircal cdf approximation using simulated random effects and rotated data (optional) ```r #TMB model obj <- MakeADFun(data, parameters, random = "u") opt <- nlminb(obj$par, obj$fn, obj$gr) obj$env$data$simRE <- 1 y.sim <- obj$simulate()$y #DHARMa residual function: createDHARMa(y.sim, y, rotation = "estimated") ``` ] --- # Results: Unconditional Level .pull-left[ Theoretical residual p-values ![](data:image/png;base64,#TMBvalidation_files/figure-html/unnamed-chunk-18-1.png)<!-- --> Type I error: <table> <tbody> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.042 </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.054 </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.054 </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.99 </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> </tr> </tbody> </table> .small[ **Significance level**: <br>0.05 <br><br> **Type I error**: <br> Probability of falsely rejecting the correctly specified model <br><br> **Power**: <br> Probabilty of correctly rejecting the mis-specified model ] ] .pull-right[ Estimated residual p-values ![](data:image/png;base64,#TMBvalidation_files/figure-html/unnamed-chunk-20-1.png)<!-- --> <table> <tbody> <tr> <td style="text-align:right;background-color: #FFFFFF !important;"> Type I </td> <td style="text-align:right;background-color: #FFFFFF !important;"> Error </td> <td style="text-align:right;background-color: #FFFFFF !important;"> </td> <td style="text-align:right;background-color: #FFFFFF !important;"> </td> </tr> <tr> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.000 </td> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.050 </td> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.000 </td> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.996 </td> </tr> <tr> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> Power </td> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> </td> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> </td> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> </td> </tr> <tr> <td style="text-align:right;background-color: #FFFFFF !important;"> 1.000 </td> <td style="text-align:right;background-color: #FFFFFF !important;"> 0.052 </td> <td style="text-align:right;background-color: #FFFFFF !important;"> 1.000 </td> <td style="text-align:right;background-color: #FFFFFF !important;"> 1.000 </td> </tr> </tbody> </table> ] --- # Conditional Level .large[**Validate the data model conditional on random effect MLEs**] <br> .pull-left-narrow[ **TMB** - **One-step ahead** - gaussian - cdf - generic ] .pull-right-wide[ ```r #TMB model obj <- MakeADFun(data, parameters, random = "u") opt <- nlminb(obj$par, obj$fn, obj$gr) #TMB residual calculation for continuous data oneStepPredict(obj, observation.name = "y", data.term.indicator = "keep" method = "oneStepGaussian") #TMB residual calculation for discrete data oneStepPredict(obj, observation.name = "y", data.term.indicator = "keep" method = "cdf", discrete = TRUE) ``` ] --- # Conditional Level .large[**Validate the data model conditional on random effect MLEs**] <br> .pull-left-narrow[ TMB - One-step ahead - gaussian - cdf - generic **DHARMa** - **Conditional** ] .pull-right-wide[ <br><br> Empircal cdf approximation using estimated random effects ```r #Simulate data using MLEs y.sim <- rnorm(u, sigma) #TMB model obj <- MakeADFun(data, parameters, random = "u") opt <- nlminb(obj$par, obj$fn, obj$gr) obj$env$data$simRE <- 0 y.sim <- obj$simulate()$y #empirical cdf on conditionally iid data r <- ecdf(y.sim)(y) ``` ] --- # Results: Conditional Level .pull-left[ Theoretical residuals ![](data:image/png;base64,#TMBvalidation_files/figure-html/unnamed-chunk-25-1.png)<!-- --> Type I error: <table> <tbody> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.042 </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.044 </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.042 </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> 0.04 </td> <td style="text-align:left;"> </td> </tr> </tbody> </table> .small[ **Significance level**: <br>0.05 <br><br> **Type I error**: <br> Probability of falsely rejecting the correctly specified model <br><br> **Power**: <br> Probabilty of correctly rejecting the mis-specified model ] ] .pull-right[ Estimated Residuals ![](data:image/png;base64,#TMBvalidation_files/figure-html/unnamed-chunk-27-1.png)<!-- --> <table> <tbody> <tr> <td style="text-align:right;background-color: #FFFFFF !important;"> Type I </td> <td style="text-align:right;background-color: #FFFFFF !important;"> Error </td> <td style="text-align:right;background-color: #FFFFFF !important;"> </td> <td style="text-align:right;background-color: #FFFFFF !important;"> </td> </tr> <tr> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.000 </td> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.000 </td> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.000 </td> <td style="text-align:right;background-color: #FFFFFF !important;border-bottom: 1px solid"> 0.126 </td> </tr> <tr> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> Power </td> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> </td> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> </td> <td style="text-align:right;background-color: #FFFFFF !important;text-align: left;"> </td> </tr> <tr> <td style="text-align:right;background-color: #FFFFFF !important;"> 1 </td> <td style="text-align:right;background-color: #FFFFFF !important;"> 1 </td> <td style="text-align:right;background-color: #FFFFFF !important;"> 1 </td> <td style="text-align:right;background-color: #FFFFFF !important;"> 1 </td> </tr> </tbody> </table> ] --- # Guidelines in Progress <br> - Models with correlated random effect structure need to account for correlation in residuals - TMB's Full Gaussian method is preferred if data and RE are normal - DHARMa rotation needs to be implemented - TMB's one-step methods account for correlation through conditioning - DHARMa's conditional methods account for correlation through conditioning - MCMC approach accounts for correlation but may lack power to detect mis-specification - TMB limitations: - One-step methods cannot be used when data are a mixture of continuous and discrete data - DHARMa limitations: - Estimating the data covariance matrix may be more prone to error when data are not normal