noaa-afsc
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/03_NoCornerStations/q00_NoCornerStations.qmd b/03_NoCornerStations/q00_NoCornerStations.qmd
Lines changed: 3 additions & 3 deletions
diff --git a/04_SamplingBasedZerosSubstitutions/q00_SamplingBasedZerosSubstitutions.qmd b/04_SamplingBasedZerosSubstitutions/q00_SamplingBasedZerosSubstitutions.qmd
Lines changed: 4 additions & 4 deletions
diff --git a/05_sdmTMB/q00_PIBKC_sdmTMB.pdf b/05_sdmTMB/q00_PIBKC_sdmTMB.pdf
17.4 KB
@@ -11,3 +11,4 @@
 *.zip
 05_sdmTMB/rda_mdls-all_log_depth-st_rw-dg.RData
 05_sdmTMB/rda_prds_log_depth-st_rw-dg.RData
+*.pdf
@@ -152,7 +152,7 @@ params:
                             pdif=2*100*diff/(WCS+NCS));
 ```
 
-### No corner stations {#sec-NCS}
+## No corner stations {#sec-NCS}
 
 As part of the effort to "modernize" the NMFS EBS bottom trawl survey to improve efficiency, the high-density "corner" stations in the NMFS EBS survey grid near the Pribilof Islands and St. Matthew Island ([@fig-NCS-SurveyBasemap]) were dropped for the 2024 survey and will not be sampled in future surveys. Assessment authors were asked to consider the impact of this change on their assessment. This section compares the estimates of mature male biomass (MMB) and other population components derived from previous surveys with corner stations with those obtained by dropping the corner stations in order to obtain a retrospective idea of the importance of the corner stations to estimates of the PIBKC MMB time series.
 
@@ -189,9 +189,9 @@ As part of the effort to "modernize" the NMFS EBS bottom trawl survey to improve
   rm(p,lbl,cap,pth);
 ```
 
-#### Survey biomass estimates
+### Survey biomass estimates
 
-Estimates of biomass from the NMFS EBS bottom trawl survey for different components of the PIBKC stock are compared by year in [@fig-NCS-SrvBio] for calculations made including or excluding the corner stations. Absolute differences and percent differences between the sets of survey stations are shown, respectively, in Figures [-@fig-NCS-SrvBioAbsDif] and [-@fig-NCS-SrvBioPctDif]. The percent differences are also shown in Table [-@tbl-NCS-SrvBioPctDif]. Figures [-@fig-NCS-MMB] and [-@fig-NCS-MMBlog ]provide more detailed looks at the differences between the two sets of survey stations. 
+Estimates of biomass from the NMFS EBS bottom trawl survey for different components of the PIBKC stock are compared by year in [@fig-NCS-SrvBio] for calculations made including or excluding the corner stations. Absolute differences and percent differences between the sets of survey stations are shown, respectively, in Figures [-@fig-NCS-SrvBioAbsDif] and [-@fig-NCS-SrvBioPctDif]. The percent differences are also shown in Table [-@tbl-NCS-SrvBioPctDif]. Figures [-@fig-NCS-MMB] and [-@fig-NCS-MMBlog] provide more detailed looks at the differences between the two sets of survey stations.
 
 Estimates of PIBKC MMB were generally higher when the corner stations were included: 25 years versus 17 years when excluding the corner stations resulted in a higher estimate. Since 2013, including the corner stations resulted in higher (or equal) estimates in all years except 2023 and 2024 when no mature males were caught anywhere.
 
 
@@ -126,7 +126,7 @@ params:
   if (!exists("tblno")) tblno = 0;#--index in LETTERS for tables in text
 ```
 
-## Sampling-based zeros substitutions
+## Sampling-based zeros substitutions {#sec-REMA}
 
 ```{r}
   #--load survey data (input biomass units: 1000's t)----
@@ -771,15 +771,15 @@ The results from the five models are compared in Figures [-@fig-Zeros-CompareAll
   lstFigs = c(lstFigs,wtsQMD::printGGplot(p,lbl=lbl,cap=cap,pth=pth,ori="P"));
 ```
 
-###  Summary and Recommendations
+###  Discussion
 
-This report presented a novel approach to "fill in" zeros in a survey abundance or biomass time series to facilitate time series modeling, in particular with the `rema` R package [@rema], based on substituting abundance/biomass values and CVs determined from a user-selected probability of non-detection in place of the zeros. The utility of this approach was explored using annual survey time series for Pribilof Islands blue king crab mature male abundance and biomass; zeros occurred in these time series in 2023 and 2024. `rema` provides three options for dealing with zeros, two for lognormal error distributions (treat 0's as NAs/ignore as missing; substitute a small value for zeros, and a large value for its CV) and one for a Tweedie error distribution (specify a large CV). In the first case, the predicted value for any zero-value observation is the same as the previous predicted value, while its uncertainty increases--the prediction treats the zero-value observation as if it did not occur. In the second case, a small value is provided, together with a large CV, but the values chosen are rather arbitrary and the resulting predictions are generally very sensitive to the values chosen. Using the Tweedie error distribution, the third case, is intuitively attractive because it allows for zero-value observations whereas the lognormal does not. However, it does require that the user specify a non-zero CV for any zeros and models using the Tweedie distribution have been found to take a very long time to converge, in some cases.
+This section presented a novel approach to "fill in" zeros in a survey abundance or biomass time series to facilitate time series modeling, in particular with the `rema` R package [@rema], based on substituting abundance/biomass values and CVs determined from a user-selected probability of non-detection in place of the zeros. The utility of this approach was explored using annual survey time series for Pribilof Islands blue king crab mature male abundance and biomass; zeros occurred in these time series in 2023 and 2024. `rema` provides three options for dealing with zeros, two for lognormal error distributions (treat 0's as NAs/ignore as missing; substitute a small value for zeros, and a large value for its CV) and one for a Tweedie error distribution (specify a large CV). In the first case, the predicted value for any zero-value observation is the same as the previous predicted value, while its uncertainty increases--the prediction treats the zero-value observation as if it did not occur. In the second case, a small value is provided, together with a large CV, but the values chosen are rather arbitrary and the resulting predictions are generally very sensitive to the values chosen. Using the Tweedie error distribution, the third case, is intuitively attractive because it allows for zero-value observations whereas the lognormal does not. However, it does require that the user specify a non-zero CV for any zeros and models using the Tweedie distribution have been found to take a very long time to converge, in some cases.
 
 The approach taken here is to recognize that, for populations at low densities relative to the sampling effort, zero-value catches may be the rule rather than the exception--and to use sampling theory to guide the choice of substitute values for the observed non-detection and associated CV. In the simplest case, the sampling of individuals randomly-distributed across a seascape follows a Poisson distribution, which yields a particularly simple form for the probability of obtaining a series of zero-value catches as a function of the density of those individuals ([@eq-PoissDist]). I inverted this relationship to determine the density ([@eq-Density]) that would yield a given probability of non-detection for a given number of survey hauls. For the Poisson distribution, this also allowed me to determine a corresponding CV. For the PIBKC time series, this approach yielded similar estimates for the smoothed time series under both lognormal and Tweedie error distributions, but the Tweedie model took ~50 times longer to run and had either not yet converged or converged to the upper bound for the variance scale parameter. Interestingly, the Tweedie model with the standard `rema` option (fitting the 0's with user-selected large CVs) converged successfully but the estimates for 2023 and 2024 were simply extensions of the 2022 estimates (albeit with increasing uncertainty)--similar to the results for the lognormal model that treated zeros as missing (NA's). The lognormal model that substituted a user-defined small value (and large CV) for the 0's yielded the least satisfactory results, as the estimates were rather overstrongly pulled to zero.
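
The Poisson argument in the paragraph above can be made concrete with a short derivation. This is a sketch under stated assumptions, not a reproduction of the equations cross-referenced as `@eq-PoissDist` and `@eq-Density`, which are not visible in this diff. The symbols are illustrative names: $D$ is the density of individuals, $a$ is the area swept per haul (assumed equal across hauls), $n$ is the number of survey hauls, and $p$ is the user-selected probability of non-detection.

If the catch in each haul is Poisson with mean $aD$ and hauls are independent, the probability that all $n$ hauls are empty is

$$
\Pr(\text{all } n \text{ hauls empty}) = \prod_{i=1}^{n} e^{-aD} = e^{-naD}.
$$

Setting this equal to $p$ and solving for the density gives

$$
D_p = -\frac{\ln p}{n\,a}.
$$

The expected total catch at this density is $\lambda = n\,a\,D_p = -\ln p$. Because the Poisson variance equals its mean, one plausible CV for the substituted value (assuming the density estimate is simply the total catch scaled by $na$) is

$$
\mathrm{CV} = \frac{\sqrt{\lambda}}{\lambda} = \frac{1}{\sqrt{-\ln p}}.
$$

For example, $p = 0.5$ gives $\lambda \approx 0.69$ and $\mathrm{CV} \approx 1.20$, while $p = 0.05$ gives $\lambda \approx 3.00$ and $\mathrm{CV} \approx 0.58$. Under these assumptions the CV depends only on $p$, whereas the substituted density also scales with survey effort through $n\,a$.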
 
 Given the results presented here and the relative simplicity and interpretability of the distribution-based approach, I would recommend it as the preferred substitution method for time series 0's when fitting `rema` models over the "0's as NAs" and "small constant" methods currently implemented as options for the lognormal error distribution in `rema`. Further I would not recommend using the Tweedie error distribution given its apparent convergence issues. 
 
-As a final note, sampling distributions other than the Poisson distribution discussed here could also be considered to determine reasonable non-detectable densities and associated CVs from which to derive suitable substitutes for survey index-based zero abundance or biomass estimates.
+Sampling distributions other than the Poisson distribution discussed here could also be considered to determine reasonable non-detectable densities and associated CVs from which to derive suitable substitutes for survey index-based zero abundance or biomass estimates. However, for index time series derived from spatially-explicit sampling such as occurs in the NMFS EBS bottom trawl survey, spatiotemporal model-based indices such as those available from `VAST` or `sdmTMB` (see next section) are probably preferable.
 
 ```{r}
 #| label: refs_Zeros