The vignette provides a slightly more complicated example of how to
use the fishprior package to combine parameters into a meta-analysis.
The summarize_fishbase_traits() function produces summary
statistics across studies, weighting estimates equally and ignoring
sample sizes or standard errors of estimates. We provide two case
studies below, first examining maximum length of European mackerel
(weighting studies by sample size), and second examining growth rates of
sablefish (weighting studies by standard errors).
FishBase includes sample sizes (Number) and standard
errors (SE) when available, however these may not be
available for most records, and there may be variability in reporting by
species or trait type. For full details, see the following tables
| FishBase.tables.used |
|---|
| rfishbase::popgrowth |
| rfishbase::popchar |
| rfishbase::poplw |
| rfishbase::maturity |
| rfishbase::fecundity |
To illustrate how point estimates may combined in a more traditional
meta-analysis, we’ll use data from the European mackerel (Scomber
scombrus). Data can be obtained from FishBase using
get_fishbase_traits() – more detailed information is in the
introductory vignette. We will focus this example on maximum length
(LengthMax) because it is one of the variables in
rfishbase::poplw() that has an associated sample size, but
not standard error.
species_list <- c(
"Scomber scombrus"
)
example_base_raw <- get_fishbase_traits(spec_names = species_list)
example_base_raw## # A tibble: 288 × 22
## rfishbase SpecCode Sex PopGrowthRef DataSourceRef Locality YearStart
## <chr> <int> <chr> <int> <int> <chr> <int>
## 1 popgrowth 118 unsexed 207 NA Northwest At… NA
## 2 popgrowth 118 unsexed 207 NA Northwest At… NA
## 3 popgrowth 118 unsexed 207 NA Northwest At… NA
## 4 popgrowth 118 unsexed 207 NA Northwest At… NA
## 5 popgrowth 118 unsexed 207 NA Northwest At… NA
## 6 popgrowth 118 unsexed 312 1208 ICNAF Subare… 1974
## 7 popgrowth 118 unsexed 312 1208 ICNAF Subare… 1974
## 8 popgrowth 118 unsexed 312 1208 ICNAF Subare… 1974
## 9 popgrowth 118 unsexed 312 1208 ICNAF Subare… 1974
## 10 popgrowth 118 unsexed 312 1207 New England 1973
## # ℹ 278 more rows
## # ℹ 15 more variables: YearEnd <int>, Number <dbl>, Type <chr>, C_Code <chr>,
## # E_CODE <dbl>, SourceRef <int>, StockCode <dbl>, AgeMatRef <int>,
## # trait <chr>, value <dbl>, SE <dbl>, SD <dbl>, country <chr>,
## # EcosystemName <chr>, Species <chr>
There are differences in reporting between Number and
SE, with fewer studies having standard errors reported.
There may be some rounding or reporting issues with SE,
because at least one of the values is 0. There’s quite a bit of
variation by trait type, with estimates from studies estimating
length-weight regressions reporting Number more commonly
than other studies,

We will filter out studies that don’t report Number, but
an alternative approach would be to set the Number for
these studies to some small value (the smallest reported across all
studies, or 1, etc)
The first way to combine estimates is by applying variance weighting.
To do this, we use Number as the weights for the weighted
mean and variance,
weighted_summary <- filtered_traits |>
dplyr::summarise(
weighted_mean = weighted.mean(value, w = Number),
simple_mean = mean(value),
weighted_var = sum(Number * (value - weighted_mean)^2) / sum(Number),
weighted_se = sqrt(weighted_var / dplyr::n()),
total_n = sum(Number),
n_studies = dplyr::n()
)
weighted_summary## # A tibble: 1 × 6
## weighted_mean simple_mean weighted_var weighted_se total_n n_studies
## <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 45.0 41.6 5.28 0.368 61367 39
The vertical blue lines here represent the weighted mean (and 95% confidence intervals); the weighted mean is slightly larger than the simple mean (42).
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.

Without the reported study specific standard errors, it is difficult to do a true meta-analysis. However we can estimate a common mean and observation variance, which is another way to use the sample sizes as weights. The reported mean length can be assumed to be an observation of the true study-specific unobserved mean length , where represents the variance across studies,
and then the random effects can be estimated with a common hyperparameters
To fit this model in brms, we can use the following
code. Because maximum size can be skewed, we’ll fit the model in log
space,
library(brms)
filtered_traits$sqrt_n <- 1 / sqrt(filtered_traits$Number)
filtered_traits$study_id <- seq_len(nrow(filtered_traits))
filtered_traits$log_n <- log(filtered_traits$value)
formula <- bf(
log_n ~ 1,
sigma ~ 0 + sqrt_n
)
fit <- brm(
formula,
data = filtered_traits,
family = gaussian(),
chains = 4,
iter = 5000,
control = list(adapt_delta = 0.999)
)For cases where the standard errors exist, a second approach is to do
a true meta-analysis and use the standard errors as weights. For this
example, we’ll focus a meta-analysis on the growth coefficient
(k) of sablefish (Anoplopoma fimbria)
species_list <- c(
"Anoplopoma fimbria"
)
sablefish_traits <- get_fishbase_traits(spec_names = species_list)
filtered_traits <- dplyr::filter(
sablefish_traits,
SE > 0,
trait %in% c("K")
)There’s a small number of studies here from 2 regions in the Pacific Ocean, and if we look at breakdown by region, there seems to be substantially smaller values in Alaska.
| Locality | value | SE |
|---|---|---|
| West coast | 0.499 | 0.047 |
| West coast | 0.481 | 0.083 |
| Gulf of Alaska | 0.117 | 0.024 |
| Gulf of Alaska | 0.106 | 0.008 |
| West coast | 0.472 | 0.055 |
| West coast | 0.556 | 0.108 |
| Gulf of Alaska | 0.120 | 0.140 |
| Gulf of Alaska | 0.126 | 0.044 |
It probably makes more sense for interpretation and estimation to focus on a single region, so we will only use estimates from Alaska
We consider three meta-analytic models of increasing complexity, each building on the previous by relaxing assumptions about the sources of variance in the data.
The simplest meta-analytic model treats the reported standard errors as fully characterizing observation-level uncertainty, with no additional residual variance:
where is the observed trait value for study , is the pooled mean (the single parameter being estimated), and is the known standard error from FishBase, treated as fixed. This is the classical fixed-effects meta-analytic assumption — the SE accounts for all variance and there is no additional unexplained noise.
In practice, the reported SE may not capture all sources of variation. Model 2 relaxes the assumption that SE fully characterizes observation uncertainty by additionally estimating a residual variance :
This model has 2 parameters being estimated,
and
).
This is specified in brms using sigma = TRUE within the
se() function:
Model 3 is the most general formulation, adding a study-level random effect to capture among-study variation beyond what is explained by the known SE and residual variance:
where is the standard deviation of study-level random effects, representing among-study heterogeneity. This is the most flexible model but also has the most parameters estimated — with only a small number of studies, and may be difficult to distinguish and estimates may be sensitive to priors.
fit3 <- brm(
value | se(SE, sigma = TRUE) ~ 1 + (1 | study),
data = ak_traits,
prior = c(
prior(normal(0, 10), class = Intercept),
prior(exponential(1), class = sigma),
prior(exponential(1), class = sd)
),
chains = 4, iter = 2000,
control = list(adapt_delta = 0.999)
)