(C) PLOS One
This story was originally published by PLOS One and is unaltered.



Downscaled seasonal forecasts for the California Current System: Skill assessment and prospects for living marine resource applications [1]

Michael G. Jacox (Environmental Research Division, NOAA Southwest Fisheries Science Center, Monterey, California, United States of America; Physical Sciences Division, NOAA Earth System Research Laboratories, Boulder, Colorado)

Date: 2023-12

Ocean forecasting is now widely recognized as an important approach to improve the resilience of marine ecosystems, coastal communities, and economies to climate variability and change. In particular, regionally tailored forecasts may serve as the foundation for a wide range of applications to facilitate proactive decision making. Here, we describe and assess ~30 years of retrospective seasonal (1–12 month) forecasts for the California Current System, produced by forcing a regional ocean model with output from a global forecast system. Considerable forecast skill is evident for surface and bottom temperatures, sea surface height, and upper ocean stratification. In contrast, mixed layer depth, surface wind stress, and surface currents exhibit little predictability. Ocean conditions tend to be more predictable in the first half of the year, owing to greater persistence for forecasts initialized in winter and dynamical forecast skill consistent with winter/spring influence of the El Niño–Southern Oscillation (ENSO) for forecasts initialized in summer. Forecast skill above persistence appears to come through the ocean more than through the atmosphere. We also test the sensitivity of forecast performance to downscaling method; bias correcting global model output before running the regional model greatly reduces bias in the downscaled forecasts, but only marginally improves prediction of interannual variability. We then tailor the physical forecast evaluation to a suite of potential ecological applications, including species distribution and recruitment, bycatch and ship-strike risk, and indicators of ecosystem change. This evaluation serves as a template for identifying promising ecological forecasts based on the physical parameters that underlie them. Finally, we discuss suggestions for developing operational forecast products, including methodological considerations for downscaling as well as the respective roles of regional and global forecasts.

This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Here, we present a ~30-year retrospective skill assessment for the first downscaled seasonal forecasts spanning the U.S. portion of the California Current System (CCS) and extending ~1000 km offshore. In doing so, our aims are to (i) evaluate forecast skill for coastal and offshore conditions in the CCS, including for metrics of vertical water column structure that have rarely (if ever) been evaluated in ocean predictions, (ii) demonstrate a framework for using physical forecast skill assessment as a key step toward identifying ecological forecasts that are poised to be skillful, and (iii) provide insight for developing operational forecast products, including methodological considerations and the role of regional vs. global forecasts.

One region for which downscaled seasonal forecasts hold interest and potential is the California Current System (CCS), running along the west coast of North America. As one of the world’s major eastern boundary upwelling systems, the CCS supports a highly productive marine biome and human activities including socio-economically important fisheries and tourism [ 11 ]. The physical, chemical, and biological environment of the CCS exhibits considerable variability on subseasonal, seasonal, and interannual timescales [ 12 ], with notable ecological disruptions arising from climate anomalies including delayed upwelling [ 13 ], ENSO events [ 14 and references therein], and other marine heatwaves [ 15 ]. Importantly, largely due to the influence of ENSO, the CCS is one of the more predictable areas of the ocean, especially among extratropical regions [ 16 , 17 ].

A logical approach to creating marine ecosystem forecasts is to start from skillful physical climate forecasts, which are then connected by statistical or mechanistic models to ecological targets of interest. In this construction, a wide variety of ecological models developed for retrospective or near-real-time application could be readily transitioned to forecast mode (e.g., Section 3.4). The physical predictions can be global or regional in scale, but while global climate forecasts are widely available, oceanographic output from these models is often limited to a few surface variables, predominantly sea surface temperature (SST) [ 7 ]. Furthermore, there is a desire for predictions of fine-scale features and processes that are relevant to marine resource management but are not resolved by global forecast systems. Dynamical downscaling allows for regionally tailored models with relatively high resolution, and the downscaling process can also be used to correct for regional biases in global models. As a result, dynamical downscaling has been widely used to produce regional ocean projections under climate change [ 8 ]. However, examples of dynamically downscaled seasonal forecasts and assessments of their forecast skill are less common [though see 9 , 10 ].

The need for proactive ocean decision making is now well established, as the impacts of climate variability–especially extreme events–highlight the limitations of static management strategies. In the U.S., for example, myriad ecological, social, and economic impacts of ocean variability over the past decade have motivated a growing push to incorporate forward-looking environmental information in the management of coastal communities, fisheries, and marine protected areas [ 1 , 2 ]. To that end, seasonal (1–12 month) ocean forecasts have an important role to play when managing marine resources in a changing climate [ 3 ], filling a critical gap between real-time information and long-term outlooks. For example, seasonal forecasts have the potential to inform decisions on fish catch limits, timing of fishery closures, and allocation of monitoring resources [ 4 ], and past work has highlighted both the promise and challenges of marine ecosystem prediction on seasonal timescales [ 4 – 6 ].

For each application, we identified the region, physical variables, and months of the year for which forecasts are needed ( Table 1 ). For this purpose, each application is associated with one or more broad regions identified as north, central, and south, with divisions at Cape Mendocino (~40.5°N) and Point Conception (~34.5°N), each extending 300 km offshore ( Fig 1 ). We then summarized the physical forecast skill given the requirements of each application, with the goal of providing an initial assessment of how likely they are to be successfully transitioned to a forecast framework (Section 3.4). Specifically, we collated ACCs for each variable/region/forecast month associated with a given application and then characterized the distribution of those ACCs. For example, in the case of sardine spawning habitat and recruitment, we consider the distribution of just six ACCs (one region–the Southern CCS, two variables–SST and SSH, and three forecast months–March, April, and May). On the other extreme, for EcoCast we consider the distribution of 180 ACCs (three regions, 10 variables, six forecast months). All forecasts are taken from the most recent initialization (i.e., forecasts of January–June are from the January initialization, forecasts of July–December are from the July initialization). For applications that rely on a large suite of variables, we also consider the skill for a smaller subset of variables that exhibit relatively high skill (SST, SSH, and buoyancy frequency). Since the forecasts do not include biogeochemistry, chlorophyll concentration is excluded from this analysis despite being included as a predictor in several of the applications.
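The collation itself is a simple indexing exercise; the sketch below illustrates it for the sardine example, using a placeholder skill array and hypothetical dimension labels (this is not the authors' code, only an illustration of the procedure described above).

```python
import numpy as np

rng = np.random.default_rng(0)
acc = rng.uniform(-0.2, 0.9, size=(3, 10, 12))   # placeholder ACCs: (region, variable, forecast month)

regions = {"north": 0, "central": 1, "south": 2}
variables = {"sst": 0, "ssh": 1, "bt": 2, "bf": 3, "mld": 4,
             "su": 5, "sv": 6, "sustr": 7, "svstr": 8, "curl": 9}
months = {m: i for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec"])}

def application_accs(region_names, var_names, month_names):
    """Collect the ACCs relevant to one application (6 for sardine, 180 for EcoCast)."""
    idx = np.ix_([regions[r] for r in region_names],
                 [variables[v] for v in var_names],
                 [months[m] for m in month_names])
    return acc[idx].ravel()

# Sardine spawning habitat/recruitment: southern CCS, SST and SSH, March-May
sardine = application_accs(["south"], ["sst", "ssh"], ["mar", "apr", "may"])
print(sardine.size, np.percentile(sardine, [25, 50, 75]))
```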

Forecast skill is shown for conditions averaged over the 75 km coastal band shown in Fig 1 , except bottom temperature, which is averaged to a maximum bottom depth of 1000 m. Four skill metrics are reported here: anomaly correlation coefficient (ACC), forecast accuracy, bias, and root mean square difference (RMSD). Bias and RMSD are standardized (i.e., divided by the standard deviation of anomalies across all years) to facilitate comparison between variables on the same scale. For ACC, gray dots indicate model forecast skill that is significantly greater than zero (95% confidence), while white dots indicate model forecast skill that is significantly greater than zero when persistence forecast skill is not significantly greater than zero. Forecast accuracy is shown for the upper tercile of each variable. For ACC and forecast accuracy, skill below that of a random forecast (0 for ACC, 0.56 for forecast accuracy) is colored white. Variable abbreviations (top to bottom) are: sea surface temperature (sst), sea surface height (ssh), bottom temperature (bt), buoyancy frequency (bf), mixed layer depth (mld), surface eastward and northward current velocities (su, sv), surface eastward and northward wind stress (sustr, svstr), and wind stress curl (curl).

Key parameters for previously identified potential living marine resource forecast applications. Regions are divided into north (40.5–47.5°N), central (34.5–40.5°N), and south (30.5–34.5°N), with each extending 300 km offshore ( Fig 1 ). The “Full suite” of variables includes all of the variables assessed herein (e.g., Fig 2 ), with the exception of bottom temperature. The “Spatially explicit?” column denotes whether an application relies on spatially resolved data (Yes) or on an average over a larger area (No).

As a first step toward assessing the broader applicability of the physical forecasts described here, we have identified a suite of potential ecological forecast applications in the California Current System ( Table 1 ). In brief, they are (1) estimates of Pacific sardine spawning habitat and recruitment off Southern California [ 49 , 50 ], (2) the Temperature Observations to Avoid Loggerheads (TOTAL) tool, designed to limit bycatch of Loggerhead turtles in the Southern California Bight [ 51 ], (3) the Habitat Compression Index (HCI), a measure of nearshore cold water habitat related to ecosystem changes and whale entanglement risk off central California [ 52 , 53 ], (4) distribution of suitable albacore habitat off Oregon and Washington [ 54 ], (5) the Anchovy Ecosystem Indicator (AEI), an indicator of top predator foraging and reproduction off central California [ 55 ], (6) EcoCast, a tool designed to limit bycatch and promote sustainable fishing for swordfish in the California drift gillnet fishery [ 56 ], and (7) WhaleWatch, a tool to predict blue whale distributions and identify ship strike risk in the Southern California Bight [ 39 , 57 ]. Note that this is a non-exhaustive list of potential applications that have not yet been implemented in a regional forecast framework. There are other west coast applications for which spatially-explicit forecast capability has been demonstrated, e.g., for Pacific sardine distribution [ 58 ], Pacific hake distribution [ 59 ], and Dungeness crab catch rates and megalopae occurrence [ 60 , 61 ].

For each forecast, an ensemble mean forecast is calculated as the mean of the three ensemble members ( Fig 1 ). The ensemble mean forecast is the main focus of our results (Section 3), with Section 3.2 discussing the characteristics of individual ensemble members.

As in previous forecast skill assessments (e.g., [ 16 ]), forecast skill is evaluated for significance using ACC. First, sample autocorrelation is accounted for by calculating the effective degrees of freedom (N_eff) as in [ 48 ]:

N_eff = N / (1 + 2 Σ_τ r_τ^F r_τ^O),

where N is the number of samples in the forecast time series, and r_τ^F and r_τ^O are autocorrelation coefficients for the forecast and observed time series, respectively, at lag τ. We then apply Fisher’s z transformation to the Pearson correlation between forecast and observations (r_F,O):

Z_F,O = (1/2) ln[(1 + r_F,O) / (1 − r_F,O)],

where Z_F,O follows a z distribution with N_eff − 3 degrees of freedom. We calculate 95% confidence bounds for Z_F,O and then transform back to r_F,O; if the lower confidence bound is greater than zero, the forecast skill is significant. As a further benchmark against which model forecasts can be evaluated, we also calculate forecast skill for persistence forecasts, which assume that anomalies observed in the month prior to forecast initialization will persist throughout the forecast.
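To make the test concrete, the sketch below implements the significance check as described, with the effective-degrees-of-freedom correction and Fisher z transform written out in Python; function and variable names are our own illustrations, not the authors' code, and the correction is the standard autocorrelation-based form implied by the text.

```python
import numpy as np

def autocorr(x, lag):
    """Lag-tau sample autocorrelation of a 1-D time series."""
    x = x - x.mean()
    return 1.0 if lag == 0 else np.corrcoef(x[:-lag], x[lag:])[0, 1]

def effective_dof(fcst, obs, max_lag=None):
    """N_eff = N / (1 + 2 * sum_tau r_tau^F * r_tau^O), accounting for autocorrelation."""
    n = len(fcst)
    max_lag = max_lag or n - 2
    s = sum(autocorr(fcst, t) * autocorr(obs, t) for t in range(1, max_lag + 1))
    return n / (1.0 + 2.0 * s)

def acc_is_significant(fcst, obs, z_crit=1.96):
    """True if the anomaly correlation is significantly greater than zero (~95% confidence)."""
    r = np.corrcoef(fcst, obs)[0, 1]
    n_eff = effective_dof(fcst, obs)
    z = np.arctanh(r)                      # Fisher z transform
    se = 1.0 / np.sqrt(n_eff - 3.0)        # standard error of z with N_eff - 3 dof
    r_lower = np.tanh(z - z_crit * se)     # back-transform the lower confidence bound
    return r_lower > 0.0
```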

Forecast skill is quantified using a suite of deterministic and probabilistic metrics that have been previously employed for climate forecast evaluation [ 17 , 46 ]. Each metric is calculated across all forecast years (N = 29) for each initialization month and lead time. Bias, the mean difference between forecast and observed fields, is used to determine the ability of forecasts to represent the mean state. Anomaly correlation coefficient (ACC), the correlation between forecast and observed anomalies, is used to assess forecast skill for interannual variability. Forecast anomalies are calculated using a lead-dependent forecast climatology to account for model drift. The combined influence of errors in the forecast mean state and variability is quantified using the root mean squared difference (RMSD) of observed and forecast fields. Finally, forecast accuracy provides a probabilistic skill measure for categorical predictions. Here, the categories are terciles of each variable, such that forecast accuracy for a given initialization and lead time is the fraction of forecasts that correctly predicted whether an event (e.g., SST being in the upper tercile) would occur [ 47 ]. ACC varies between -1 and 1, with 1 being perfect and 0 being equal to chance. Forecast accuracy varies between 0 and 1, with 1 being perfect. For terciles, a random forecast is expected to give forecast accuracy of 0.56 (the random probability of a true positive plus the random probability of a true negative; ⅓*⅓ + ⅔*⅔). For bias and RMSD, smaller numbers are better.
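As an illustration of how these metrics can be computed from the retrospective forecasts, the following minimal sketch evaluates one variable, initialization month, and lead time across all forecast years; array shapes and names are assumptions for illustration rather than the authors' implementation.

```python
import numpy as np

def deterministic_skill(fcst, obs, fcst_clim, obs_clim, obs_anom_std):
    """fcst, obs: values for one variable/initialization/lead across all years.
    fcst_clim, obs_clim: lead-dependent climatologies (accounting for model drift).
    obs_anom_std: standard deviation of observed anomalies, used to standardize bias/RMSD."""
    bias = np.mean(fcst - obs) / obs_anom_std                     # mean-state error
    rmsd = np.sqrt(np.mean((fcst - obs) ** 2)) / obs_anom_std     # combined error
    acc = np.corrcoef(fcst - fcst_clim, obs - obs_clim)[0, 1]     # anomaly correlation
    return {"bias": bias, "rmsd": rmsd, "acc": acc}

def upper_tercile_accuracy(fcst_anom, obs_anom):
    """Fraction of forecasts correctly predicting whether the observation falls in the
    upper tercile; a random forecast scores ~0.56 (1/9 + 4/9)."""
    f_event = fcst_anom > np.quantile(fcst_anom, 2 / 3)
    o_event = obs_anom > np.quantile(obs_anom, 2 / 3)
    return np.mean(f_event == o_event)
```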

Lastly, we leverage a pre-existing set of regional ocean model runs to explore the relative contributions of atmospheric and oceanic forcing to forecast skill (see Section 4). In brief, the same forcing used to drive CCSRA was applied to a historical simulation without data assimilation. This run (referred to herein as a hindcast) differs slightly from CCSRA but captures much of the same variability due to the common forcing. The surface fluxes and ocean boundary conditions for the hindcast were then used to drive two additional runs: one with realistic wind forcing but climatological ocean boundary conditions (the “Wind” run) and one with realistic ocean boundary conditions but climatological wind forcing (the “Ocean” run). These runs, described in detail in [ 28 ], were designed to separate the influence of wind forcing over the regional domain from ocean forcing that comes through the lateral boundaries of the ocean model.

Additional SST measurements for verification were taken from version 2.1 of NOAA’s daily 0.25° resolution Optimum Interpolation SST product (OISST [ 43 , 44 ]) as well as shore-based observations maintained by a network of institutions along the U.S. west coast ( https://shorestations.ucsd.edu ). Focusing on locations with comprehensive coverage for the 1982–2010 forecast verification period, we used data from four locations along the California coast: the Farallon Islands (37.7°N), Pacific Grove (36.6°N), Newport Beach (33.6°N), and La Jolla (32.9°N). Similarly, to evaluate SSH forecasts at coastal sites, tide gauge data were obtained from the University of Hawaii Sea Level Center (Caldwell et al. 2015) [ 45 ] for nine sites from Toke Point, WA (46.7°N) to La Jolla, CA (32.9°N). For model verification, shore station and tide gauge data were compared to the closest coastal grid cell in the model. For all verification datasets, monthly averages were computed from daily data, and monthly anomalies were calculated by removing the 1982–2010 monthly climatology.
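The preprocessing of these point observations is straightforward; a minimal sketch is shown below, with a synthetic daily series standing in for a real shore-station record and pandas assumed for the time-series handling.

```python
import numpy as np
import pandas as pd

# Synthetic daily "shore station" SST record standing in for the real observations.
dates = pd.date_range("1982-01-01", "2010-12-31", freq="D")
sst = pd.Series(12 + 3 * np.sin(2 * np.pi * dates.dayofyear / 365.25), index=dates)

monthly = sst.resample("MS").mean()                     # monthly averages from daily data
clim = monthly.groupby(monthly.index.month).mean()      # 1982-2010 monthly climatology
anom = monthly.groupby(monthly.index.month).transform(lambda x: x - x.mean())  # monthly anomalies
```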

Forecast skill was evaluated using ocean reanalyses, satellite observations, and several in situ data sets. The UC Santa Cruz CCS reanalysis (CCSRA [ 35 ]) is based on the same ROMS model domain described in Section 2.1.3, though with different atmosphere and ocean boundary forcing. This reanalysis spans 1980–2010 and assimilates satellite SST and SSH data as well as available in situ salinity and temperature data. As a verification data set, CCSRA has the advantages of matching the domain and resolution of the downscaled forecasts and resolving the full water column. Past analyses have shown that surface and water column properties of CCSRA compare favorably to observations from ship-based surveys, satellites, shore stations, underwater gliders, and Argo floats [ 35 , 40 , 41 ]. A second ocean reanalysis, the Global Ocean Reanalysis and Simulations (GLORYS) version 1 was also used as a separate verification data set. GLORYS has global coverage at 1/12 degree horizontal resolution with 50 vertical levels. GLORYS uses the Nucleus for European Modelling of the Ocean (NEMO) ocean model with surface forcing from the ECMWF ERA-Interim reanalysis, and is available for 1993–2019. Similar to CCSRA, GLORYS assimilates satellite SST and SSH observations as well as in situ temperature and salinity, and exhibits high fidelity to independent observations in the CCS [ 41 , 42 ].

Forecast skill was evaluated for a suite of atmospheric and oceanic variables. In the atmosphere, we examined surface wind stress in the eastward and northward directions (sustr and svstr, respectively) as well as wind stress curl (curl). At the ocean surface, we examined sea surface temperature (SST), sea surface height (SSH), and surface currents in the eastward and northward directions (su and sv, respectively). The upper ocean water column structure was characterized by the mixed layer depth (MLD), defined by a density change corresponding with a 0.8°C temperature deviation [ 36 ], and the buoyancy frequency (BF; a measure of stratification) averaged over the upper 200m. Lastly, bottom temperature (BT) was included for its importance to benthic species, with a focus on the nearshore environment where bottom depth is less than ~1000 m. These variables were chosen for their potential as predictors of variability in the distributions and populations of marine species ranging from phytoplankton and krill [ 37 , 38 ] to fish, sharks, and top predators [ 30 , 39 ]. Monthly averages of all variables were computed for skill assessment.
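For the water-column metrics, a simplified illustration of the two diagnostics is given below, using a linear equation of state to translate the 0.8°C criterion into a density threshold. The forecasts themselves use the full ROMS density field, so the constants and function names here are assumptions made for illustration.

```python
import numpy as np

G, RHO0, ALPHA = 9.81, 1025.0, 2.0e-4   # gravity, reference density, thermal expansion (assumed)

def mixed_layer_depth(z, rho, dT=0.8):
    """Depth (m, positive down) at which density first exceeds the surface value by the
    density change equivalent to a dT cooling; a simplified version of the 0.8 degC criterion."""
    d_rho = RHO0 * ALPHA * dT
    below = np.where(rho > rho[0] + d_rho)[0]
    return z[below[0]] if below.size else z[-1]

def mean_buoyancy_frequency(z, rho, zmax=200.0):
    """Buoyancy frequency N = sqrt((g/rho0) * drho/dz) with z positive down,
    averaged over the upper zmax meters as a stratification index."""
    n2 = (G / RHO0) * np.diff(rho) / np.diff(z)
    zmid = 0.5 * (z[:-1] + z[1:])
    return np.sqrt(np.mean(n2[zmid <= zmax]))

# Example profile: depth in meters (positive down), density in kg/m3
z = np.arange(0.0, 500.0, 5.0)
rho = 1024.0 + 0.004 * z
print(mixed_layer_depth(z, rho), mean_buoyancy_frequency(z, rho))
```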

Given the different configurations of their initialization and forcing, CanCM4-ROMS and CanCM4-ROMS-BC represent opposite ends of the spectrum for regional tailoring of the downscaled forecasts; CanCM4-ROMS is forced solely by the global forecast output while CanCM4-ROMS-BC relies on high resolution atmosphere and ocean reanalyses to construct both the initial condition and the forcing. While we focus on CanCM4-ROMS-BC to quantify the potential skill of a CCS seasonal forecast system, comparison between the two configurations (Section 3.3) provides insight into the added value of bias-correcting the forcing and using a near-real-time regional ocean analysis for forecast initialization, both of which are non-trivial additions to the workflow.

Initialization was also handled differently between the two forecast configurations. In CanCM4-ROMS, the ocean initial conditions were interpolated directly from CanCM4 output. Since CanCM4 ocean fields were provided as monthly averages, the initial condition was obtained by averaging the zero-lead forecasts for the two months surrounding the initialization date. For example, the initial condition for the Jan. 1, 2000 forecast was obtained by averaging the December conditions from the December 1999 forecast and the January conditions from the January 2000 forecast (note that in a true forecast mode, the global model should supply initial conditions from the beginning of the month). In CanCM4-ROMS-BC, the initial condition was obtained from the UC Santa Cruz CCS ocean reanalysis ([ 35 ]; Section 2.2.2). Using the historical reanalysis to initialize retrospective forecasts simulates an approach in which operational forecasts are initialized from a near-real-time regional analysis available for the CCS ( oceanmodeling.ucsc.edu/ccsnrt , version 2016a).

ROMS forcing fields were derived from the CanCM4 global forecasts in two ways: one in which CanCM4 fields were interpolated directly to the ROMS grid (denoted CanCM4-ROMS), and one in which the CanCM4 fields were bias corrected before being used as forcing for ROMS (denoted CanCM4-ROMS-BC). In the latter case, the bias-corrected forcing was constructed by (i) subtracting the initialization- and lead time-dependent climatology from CanCM4 forecast fields to obtain CanCM4 forecast anomalies [as in 16 ], and then (ii) adding the low-resolution CanCM4 anomalies to high resolution 1982–2010 climatologies of the surface and ocean boundary conditions, obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) version 5 atmospheric reanalysis (ERA5 [ 33 ]) and the Simple Ocean Data Assimilation version 2.1.6 (SODA [ 34 ]), respectively. For this second step, surface CanCM4 anomalies from the ocean point closest to land were extended over land (i.e., the land was flooded) to avoid artifacts associated with the land-sea interface during interpolation, and then bicubic interpolation was used to put CanCM4 anomalies on the higher resolution grid.
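A schematic of this two-step construction for a single surface forcing field is sketched below; the array layouts, the regridding function, and the handling of the land-flooding step are assumptions made for illustration rather than the authors' implementation.

```python
import numpy as np

def bias_corrected_forcing(cancm4, era5_clim, regrid):
    """cancm4: (year, lead, ny_lo, nx_lo) forecasts for one initialization month.
    era5_clim: (lead, ny_hi, nx_hi) 1982-2010 high-resolution climatology for the
    matching calendar months. regrid: low-res -> high-res interpolator (e.g., bicubic),
    applied after flooding coastal values over land (not shown here)."""
    clim_lo = cancm4.mean(axis=0, keepdims=True)       # initialization/lead-dependent climatology
    anom_lo = cancm4 - clim_lo                         # CanCM4 forecast anomalies
    anom_hi = np.array([[regrid(a) for a in yr] for yr in anom_lo])
    return era5_clim[None, ...] + anom_hi              # anomalies added to high-res climatology
```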

Due to computational constraints, the full suite of global climate forecasts (12 forecasts per year, 10 ensemble members) was reduced to a smaller subset for dynamical downscaling. To capture seasonal differences in predictability, we downscaled forecasts initialized in January and July, and three ensemble members were downscaled for each forecast. A three-member ensemble was chosen to balance the computational and storage costs of running additional members with the diminishing returns of skill improvement as the ensemble gets larger [ 6 ]. For each forecast, we used ensemble members 2, 8, and 10 in the CanCM4 output. When treated as independent forecasts these three ensemble members exhibit relatively low, moderate, and high retrospective SST forecast skill in the CCS, giving us forecasts that are representative of what should be expected from a typical 3-member ensemble ( S1 Fig ).

Forecasts were dynamically downscaled using a CCS configuration of the Regional Ocean Modeling System (ROMS), covering the U.S. west coast and extending ~1000 km offshore (30–48˚N, 134–115.5˚W). Model resolution is 0.1 degrees (~7–11 km) in the horizontal, with 42 terrain-following levels in the vertical [ 22 ]. This model and its variants have been used extensively to evaluate the CCS physical and biogeochemical ocean state [ 23 – 25 ], its variability and response to climate forcing [ 26 – 28 ], and its influence on marine biological communities ranging from phytoplankton to top predators [ 29 – 32 ].

Surface forcing and lateral ocean boundary conditions for the downscaled forecasts were derived from the fourth version of the Canadian Centre for Climate Modelling and Analysis's coupled climate model (hereafter CanCM4 [ 18 ]). Forecasts were initialized every month, with each forecast extending 12 months into the future (i.e., with lead times up to 12 months). To capture uncertainty due to internal climate variability, each forecast includes 10 ensemble members initialized from separate data assimilation runs such that their initial conditions are slightly different [ 18 ]. CanCM4 output is available for all atmospheric and oceanic variables needed to force the regional ocean model over the 1982–2010 period; surface atmospheric conditions and radiative forcing are available at daily frequency, while three-dimensional ocean boundary conditions are available as monthly averages. CanCM4 is one of the models in the North American Multimodel Ensemble (NMME [ 19 ]), and has been shown to exhibit considerable forecast skill for SST in the CCS [ 20 ]. Other global climate forecast systems exhibit similar patterns in skill to those of CanCM4 [ 17 , 20 ], and for seasonal forecasts the intramodel spread is much larger than the intermodel spread [e.g., 21 ], so we ran multiple ensemble members from CanCM4 rather than running members from multiple models.

(left) California Current System ROMS domain, with color indicating mean SST for the 1982–2010 period of the study. Black contours indicate 75 and 300 km from shore, as well as divisions between north, central, and south regions. (right) Output from downscaled forecasts for (top) SST and (bottom) SST anomaly averaged within the 75 km coastal band outlined on the map. Forecast conditions are shown for two sets of downscaled runs, which differ in whether the forcing derived from the CanCM4 global models was bias corrected (CanCM4-ROMS-BC) or not (CanCM4-ROMS) prior to downscaling. The forecast initialization (colored circles), ensemble mean (colored lines) and ensemble spread (colored shading) are shown for each forecast, with observed conditions in black. For clarity, only forecasts initialized in January for 2000–2009 are shown.

The regional ocean forecast system employed here consists of a regional ocean model ( Fig 1 ) forced by output from a global climate forecast system. In order to thoroughly evaluate the performance of the system, we have run a set of retrospective forecasts (reforecasts) spanning approximately three decades (1982–2010). Details of the forecast system, including the global climate forecasts, the regional ocean model, and the downscaling procedure, are described below.

For each of the potential forecast applications summarized in Table 1 , box plots show the range of anomaly correlation coefficients for the variables, months, and regions relevant to that application (see methods ). In the reduced models (light green), the same approach is applied but variables are limited to SST, SSH, and buoyancy frequency. Box edges are defined by the 25 th and 75 th percentiles. Whisker lengths are based on the 1.5 IQR value, i.e., the maximum whisker length is 1.5 times the interquartile range (IQR; difference between 25 th and 75 th percentiles) and whiskers extend to the most extreme data point within that distance. Medians and means are indicated by vertical lines and filled circles, respectively.

Ultimately, the motivation for developing and evaluating the ocean forecasts described here is to support their broader use, particularly in ecological applications to improve living marine resource management. While we don’t explicitly evaluate ecological forecasts here, we can tailor the physical forecast evaluation to specific applications as an initial evaluation of their promise. For a set of potential applications in the CCS ( Table 1 ), the complexity of the required physical forecasts varies widely, with some needing only spatially-averaged SST forecasts for a given season and others requiring a large suite of spatially-explicit oceanographic and atmospheric variables for much of the year. These varying levels of complexity are reflected in our application-specific skill assessment. Forecasts of sardine distribution and recruitment, TOTAL, and the HCI show more potential as they depend solely on SST and/or SSH, which have higher mean skill (ACC = 0.4–0.6) as well as lower spread across the requisite physical forecasts ( Fig 12 ). Mean skill is lower (ACC = 0.2–0.3) and spread is higher for the broad suite of variables, including stratification, mixed layer depth, winds, and currents, required by the albacore distribution, AEI, EcoCast, and WhaleWatch applications. For these applications, some of the underlying forecasts are very skillful (ACC>0.8) while others are no better than a random guess (ACC<0). The forecast potential for these complex models would increase if they could be parametrized to use a reduced set of more skillful variables (SSH, SST, buoyancy frequency; Fig 12 ), though in some cases a more parsimonious model may not retain sufficient ecological skill [ 57 ].

Differences in skill metrics calculated from forecasts run using bias corrected forcing (CanCM4-ROMS-BC) and those without bias correcting forcing (CanCM4-ROMS) are shown for multiple variables, lead times and initialization months (as in Fig 2 ). Improvements due to bias correction are indicated by increases (red) for ACC and forecast accuracy, and decreases (blue) for bias and RMSD. Note that bias in SSH can simply reflect differences in the reference for sea level and not necessarily model performance.

Not surprisingly, bias-correcting the global model output tends to reduce bias in the downscaled forecasts (e.g., Fig 1 ), with the most pronounced impacts seen in bottom temperature, wind stress curl, and meridional wind stress and currents ( Fig 11 ; note that biases in SSH can reflect different reference levels for sea level, not necessarily model performance). However, bias correction can also be performed after downscaling, provided that one can reliably quantify climatological ocean conditions (e.g., with observations or reanalyses). Thus, perhaps of more interest is how the forecast configuration impacts the representation of interannual variability. Comparing CanCM4-ROMS to CanCM4-ROMS-BC, we see that impacts on anomaly forecast skill are generally subtle, though in some cases there do appear to be improvements (see ΔACC and Δforecast accuracy in Fig 11 ). In particular, the bias-corrected runs benefit from being initialized with the near-real-time regional analysis, generating improved skill especially at shorter lead times. Most notable are improvements in SSH forecasts, as the narrow coastal band most influenced by cross-shore Ekman transport is not resolved in the global model (and therefore in the CanCM4-ROMS run for which initial conditions are interpolated directly from the global model). While these improvements may be inflated due to the “perfect” initialization of CanCM4-ROMS-BC relative to the CCSRA verification dataset, the improvements in SSH anomaly forecast skill are confirmed by evaluating both forecast configurations against tide gauge data (not shown).

A key methodological decision when forcing regional ocean models with output from global climate models is how to correct for biases in the global model output. One approach is to force the regional model with output that has not been adjusted, simply interpolating the forcing fields to the regional grid (e.g., [ 9 ]). Advantages of this method are its relative simplicity and that it maintains dynamical consistency of the forcing fields, with the tradeoff that the downscaled model will inherit biases that need to be accounted for. Alternatively, the global forcing fields can be corrected before running the regional model, with the aim of producing a more realistic ocean run that does not require post hoc bias correction. Here we employed both approaches to evaluate the implications of choosing one or the other.

By design, the three CanCM4 forecast members chosen for downscaling vary considerably in their SST forecast skill (see Methods and S1 Fig ), and the downscaled forecasts reflect the skill of the global ensemble members that forced them. Anomaly correlation coefficients for SST, averaged across all lead times and initializations, range from ~0.2 (member 2) to ~0.4 (member 10; Fig 10 ). The same pattern is also apparent for most other variables, indicating that a forecast that performs relatively well for one aspect of the ocean state (e.g., SST) is likely to also perform relatively well for others (e.g., SSH, bottom temperature, stratification). Furthermore, the skill of the ensemble mean tends to be as good as or better than that of the best individual ensemble member. This finding is common in assessments of forecast ensembles (e.g., [ 17 ]) and demonstrates the value of including even relatively poorly performing forecasts in the ensemble.

(top) Mean spread (i.e., max–min) of the three-member forecast ensemble, averaged across all lead times from the July initialized forecasts, for sea surface temperature (SST), sea surface height (SSH), bottom temperature (BT), and buoyancy frequency (BF). (bottom) Observed variability in these same variables, calculated as the standard deviation of monthly anomalies from the entire study period. Note that scales are different between top and bottom panels, and maps of BT and BF are zoomed into the coast where signals are strongest.

By comparing individual ensemble members, we can glean further information on the uncertainty of forecasts as well as relationships between forecast skill for different variables. Forecast spread (i.e., the difference between the highest and lowest forecasts for a given variable) varies widely throughout the CCS domain, consistent with patterns of variability in the region ( Fig 9 ). For SST, bottom temperature, and stratification, the greatest forecast spread occurs in a narrow coastal band, where high variability coincides with coastal upwelling and relatively shallow bottom depths. In contrast, forecast spread for SSH is greatest several hundred kilometers offshore of California where eddy kinetic energy is at a maximum [ 69 ] and forecast skill is relatively poor ( Fig 5 ).

(left) time series of observed SST and January initialized forecast SST for 1982–2010 at four locations along the California coast: Farallon Islands (37.7°N), Pacific Grove, CA (36.6°N), Newport Beach (33.6°N), and La Jolla (32.9°N). Forecast initializations are shown as colored circles. (right) Forecast skill (as measured by anomaly correlation coefficient) for each location as a function of initialization month and lead time. Dashed lines show the skill of persistence forecasts, which serve as a baseline against which model forecast skill can be compared.

Shore-based observations of SST and sea level provide additional data sets with which to verify our forecasts. Evaluating the forecasts with these in situ datasets leads to patterns in forecast skill similar to those described above (Figs 7 and 8 ). Specifically, forecasts initialized in January have higher skill at short lead times, forecasts initialized in July have higher skill at longer lead times, and there is a clear latitudinal gradient in SSH forecast skill (e.g., see the July-initialized, 6-month lead forecast in Fig 8 ). While these patterns are consistent across verification datasets, comparing forecasts to shore-based measurements gives reduced forecast skill (i.e., lower anomaly correlations) and highlights a difference in forecast and observed variance, with forecasts exhibiting lower variability than in situ measurements (Figs 7 and 8 ). These discrepancies likely arise from two key sources. First, model output (from both forecasts and the reanalysis) represents conditions averaged over a 0.1° (~10 km x 10 km) box, while shore stations are point measurements and thus can be influenced by very local processes and bathymetric features; these scale differences alone would produce discrepancies even with perfect forecasts. Second, since the forecasts use the same model configuration as the reanalysis (and are initialized from it for CanCM4-ROMS-BC), skill metrics calculated using the reanalysis are likely optimistic. For example, relative to the reanalysis, the CanCM4-ROMS-BC initialization is perfect, but in reality forecasts will be initialized with some errors. This second effect can be explored using alternate gridded data sets for verification, and we find that forecast skill changes little when using OISST or GLORYS for verification rather than CCSRA ( S5 and S6 Figs). Given these considerations, the true forecast skill likely falls between the estimates obtained from verification with CCSRA and with the shore-based time series.

In addition to temporal patterns in forecast skill, spatial structure is also apparent over the CCS domain in both the cross-shore and latitudinal directions (Figs 4 – 6 , S4 Fig ). For SST ( Fig 4 ), the nearshore environment is especially unpredictable in summer (e.g., 6.5-month forecasts initialized in January, 1.5-month forecasts initialized in July), while in winter it exhibits relatively high forecast skill (e.g., 6.5-month forecasts initialized in July). As noted above, this coastal predictability in winter is consistent with forcing from ENSO, and it is even more apparent in SSH ( Fig 5 ). The coastal SSH predictability can arise from ENSO’s atmospheric teleconnection (through changes in alongshore winds and associated Ekman transport [ 20 ]) as well as the oceanic teleconnection (poleward propagation of coastal trapped waves [ 66 ]). The latter is also consistent with an alongshore gradient in forecast skill, as the oceanic teleconnection is stronger in the southern CCS than it is farther north [ 67 , 68 ]. While [ 20 ] suggested that the atmospheric teleconnection was largely responsible for ENSO-related SST forecast skill in a 10-member global forecast ensemble, here we find that the oceanic teleconnection appears to dominate dynamical forecast skill for a smaller, higher-resolution forecast ensemble (see Section 4). For bottom temperature, short-lead forecast skill is generally higher at deeper depths, likely due to greater persistence, while winter forecasts initialized in July exhibit higher skill at shallower depths ( Fig 6 ).

In the nearshore zone, SSH is the most predictable of all variables evaluated here. When evaluated with the CCS ocean reanalysis, SSH forecasts exhibit significant skill at all lead times for both January and July initializations. For SST, bottom temperature, and stratification, forecast skill is dependent more on the time of year being predicted than on the time of year when forecasts are initialized; both January- and July-initialized forecasts exhibit elevated skill for predictions of winter and spring conditions. For the January forecasts this period is the first half of the forecast and much of the skill can be attributed to persistence of anomalies present at the time of initialization. In contrast, for forecasts initialized in July, skill in the winter-spring period (corresponding to lead times of >6 months) is significantly higher than that of a persistence forecast ( Fig 2 ). Elevated skill from the dynamical forecasts in the winter/spring period is consistent with the influence of the El Niño-Southern Oscillation (ENSO), whose teleconnections represent a predictable signal impacting the CCS in the winter and spring following the peak of El Niño and La Niña events [ 20 , 63 ]. The patterns in persistence forecast skill are consistent with mixed layer dynamics, where a deep winter mixed layer has greater thermal inertia than the shallow summer mixed layer. Persistence is much more pronounced for bottom temperature than for SST, with significant skill extending 9 months and 4 months for January- and July-initialized forecasts, respectively (compared to 5 months and 1 month for SST). This elevated skill for bottom conditions at lead times up to ~4 months was also noted by [ 9 ]. Additional mechanisms that may contribute to predictability along the U.S. west coast include advection and reemergence [ 6 , 64 , 65 ], though these mechanisms have not been assessed in ocean forecasts.

For a coastal band along the U.S. west coast (i.e., within 75 km of shore), the physical variables evaluated here can be grouped into two broad categories: those that can be forecast skillfully at relatively long lead times of several months or more, and those for which significant forecast skill is limited to lead times less than three months ( Fig 2 ). The former set of variables includes SSH, SST, bottom temperature, and to a lesser extent, upper ocean buoyancy frequency (stratification), while the latter includes mixed layer depth as well as surface wind stress and currents. Despite some spatial structure in forecast skill, the qualitative distinction between more and less skillful variables holds across the broader offshore area of the model domain ( Fig 3 , S2 and S3 Figs). Given that the atmosphere has much shorter decorrelation scales than the ocean, it is not surprising that there is limited seasonal forecast skill for surface wind variability, and by extension for mixed layer depth and surface currents, which are strongly driven by wind forcing. For these variables, a more thorough exploration of subseasonal (i.e., 45–60 day) forecasts is likely warranted, as there is a wide array of potential marine resource management applications on those shorter timescales [ 62 ]. For the remainder of this section, we focus on the variables that are more predictable on seasonal timescales.

4. Discussion

The multidecadal set of downscaled ocean reforecasts described herein enables an evaluation of CCS forecast skill, including its dependence on the variable being considered, initialization month, lead time, and location within the CCS. In general, the variables we examined can be separated into groups for which forecast skill was relatively good (SST, SSH, bottom temperature, stratification) or poor (MLD, surface winds, surface currents). Forecast skill is generally higher for the first half of the year than the latter half, with forecasts initialized in January benefitting from strong persistence and forecasts initialized in July benefitting from predictability for late winter/spring conditions that is consistent with the influence of ENSO.

The late winter/spring forecast skill above persistence is most pronounced near the coast, especially for SSH. This skill could arise from tropical-extratropical teleconnections through the atmosphere (i.e., alteration of local winds and heat fluxes [28, 70]) and through the ocean (i.e., poleward propagation of coastal trapped waves [66]). Additional exploratory analyses suggest that the latter mechanism is the dominant one driving skill in the regional forecasts. First, SST skill above persistence is more closely related to skill above persistence for SSH (a proxy for the oceanic teleconnection) than for surface wind stress (a proxy for the atmospheric teleconnection), especially in the southern CCS (Fig 13). Second, when evaluating forecasts against the “Wind” and “Ocean” sensitivity runs (see Methods), the total skill above persistence is more strongly associated with the Ocean forcing, indicating that the variability being predicted successfully is largely associated with the oceanic teleconnection (Fig 14).

Fig 13. Relationship of skill above persistence for SST, SSH, and surface winds. CanCM4-ROMS-BC minus persistence forecast skill is shown for July-initialized SST, SSH, and surface wind stress forecasts in nearshore (0–75 km) subregions of the CCS. Wind stress skill is the average of skill for the zonal and meridional components (sustr, svstr) and is plotted with one-month lead (i.e., the SST response lags wind by one month). Correlations are shown between SST and wind (blue; proxy for atmospheric teleconnection) and SSH (red; proxy for oceanic teleconnection). https://doi.org/10.1371/journal.pclm.0000245.g013

Fig 14. SST forecast skill relative to atmospheric and oceanic forcing. For July-initialized CanCM4-ROMS-BC forecasts at 7.5-month lead, (top) SSH and (bottom) SST forecast skill are shown using different model outputs for verification: the UCSC ocean reanalysis (CCSRA), the Hindcast (like CCSRA but without data assimilation), the Wind run (realistic wind forcing but climatological ocean boundary conditions), and the Ocean run (realistic ocean boundary conditions but climatological winds). Correspondence between the CCSRA and Hindcast shows that verification using the model run without data assimilation still captures forecast skill. Correspondence between the Hindcast and either the Wind or Ocean run shows which forcing mechanism is more closely tied to the model forecast skill. https://doi.org/10.1371/journal.pclm.0000245.g014

Beyond quantifying the potential for regional forecasts to predict ecologically-relevant ocean variability in the CCS, we can also shed light on additional considerations for operational implementation of such a system. There are now multiple examples of the general approach taken here–linking global climate forecasts to a regional ocean model, which can subsequently drive various types of ecological models [e.g., 9, 10]. Moving such a system from retrospective to real-time applications requires that the necessary forcing fields from the global climate forecast systems be made available in real time, either publicly or in direct collaboration between global and regional modelers. For most global forecast systems, the necessary fields are not readily available, which has been identified as a bottleneck for marine ecosystem prediction [7]. Once forcing fields are obtained, there are methodological questions regarding bias correction and initialization of the regional model. Our results show that bias correcting the global model output before downscaling greatly reduces bias in the downscaled forecasts. However, an alternate approach is to use the global model output as-is, and bias correct the downscaled forecast output. We found that both methods performed similarly in terms of predicting interannual variability, though slightly higher anomaly correlations for some variables were apparent in the forecasts with bias corrected forcing (Fig 11), which were also initialized from a high-resolution analysis rather than interpolating from the global model. When using this initialization approach, global model output should be bias corrected before running the regional model, so that the boundary conditions are consistent with the initial conditions.
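For reference, the alternative of correcting after downscaling is simple when a reliable observed climatology exists; a minimal sketch, with assumed array layouts and names, is:

```python
import numpy as np

def post_hoc_bias_correct(fcst, obs_clim):
    """fcst: (year, lead, ...) downscaled forecasts for one initialization month, run with
    uncorrected forcing. obs_clim: (lead, ...) observed climatology for the matching months."""
    fcst_clim = fcst.mean(axis=0)      # lead-dependent forecast climatology (includes drift)
    anom = fcst - fcst_clim            # forecast anomalies
    return obs_clim + anom             # re-center anomalies on the observed climatology
```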

We suggest that physical skill evaluations like the one presented here be used to assess a priori the prospects of living marine resource forecasts. For a given application, one can first identify the relevant physical variables, region, and time of year for which forecasts are needed. If these parameters of the ecological forecast align with physical forecast skill, a first condition has been satisfied for a successful prediction. Of course, this assessment is simplified (for example, not explicitly addressing whether an application requires high-resolution spatially resolved conditions or broader area averages; Table 1); however, when trying to identify forecast applications that are likely to be successful, assessments like these provide important insight with relatively little additional investment. According to this physical forecast assessment, the SSH- and SST-based applications (sardine distribution and recruitment, TOTAL, HCI) are the most promising of those explored here. However, when multiple physical variables are used in an application, not all necessarily need to be predicted skillfully in order to have a useful forecast. For example, a species distribution model built on SST and MLD may derive forecast skill from SST even if MLD anomalies are not predictable. Furthermore, while all variables have been weighted equally in our assessment, they do not contribute equally in the ecological models, where variables such as SST and buoyancy frequency often have relatively high importance. In the examples shown here, albacore distribution, AEI, EcoCast, and WhaleWatch may all derive viable forecast skill from a smaller suite of variables (SST, SSH, buoyancy frequency) despite being built with additional variables that are less predictable.

Of course, successfully predicting physical variability does not guarantee the skill of an ecological forecast, which also depends on the robustness of physical-biological relationships. Such a consideration is exemplified by sardine recruitment, for which the relevant SST forecasts are skillful but the relationship between SST and recruitment has not been robust over time [71]. In some cases, addition of biogeochemistry to the regional forecasts [9] may expand the list of potential ecological applications (e.g., with bottom oxygen concentration for groundfish models or chlorophyll concentration for pelagic species) and allow for more mechanistic underpinning of biological-environmental relationships.

Lastly, an important topic of exploration is the added value of regional downscaling relative to existing global forecasts, from climate models as well as statistical methods [e.g., 63]. While the potential of global forecasts for LMR management has been demonstrated [e.g., 50], there is a desire to resolve ocean variability on fine scales that are known to influence living marine resources. However, regional downscaling comes with considerable computational expense and labor. As a result, downscaled ensembles tend to be relatively small (e.g., 3–6 members) with sparse availability especially in real time, while global forecasts are operationally produced with ensemble sizes of order 100. Given the prominent role of internal climate variability on seasonal forecast timescales, the ability of large global ensembles to adequately represent uncertainty may outweigh the benefits of higher resolution, especially when forecasting rare extreme events [72, 73]. Regional downscaling will likely be more beneficial for fine-scale features like bottom temperature over the shelf or for nearshore biogeochemical processes, and less beneficial for variables like SST that are more strongly constrained by global model forcing. Indeed, for two of the SST-based forecast tools discussed in this paper (TOTAL and HCI), the skill added by downscaling a 3-member ensemble is outweighed by the benefit of using a much larger (~70-member) ensemble of global models [74]. Similarly, the tradeoffs between global and downscaled forecasts are likely to be regionally dependent, with skill in some regions being more sensitive to resolving fine-scale features, and downscaling is likely to improve the representation of some predictive mechanisms more than others (e.g., coastal trapped waves vs. large-scale atmospheric anomalies). Thus, when scoping potential regional forecast applications, and when exploring regional ocean prediction more generally, particular attention should be paid to quantifying the added value of downscaling. We argue that marine ecosystem forecasting will benefit from more widely available regional ocean forecasts, but that they are not a prerequisite for many living marine resource forecasts. Rather, the development of new regional forecasting capabilities should be undertaken in parallel with the development and implementation of LMR forecasts that leverage available resources.

[END]
---
[1] Url: https://journals.plos.org/climate/article?id=10.1371/journal.pclm.0000245

