solarforecastarbiter.metrics.probabilistic.continuous_ranked_probability_score

solarforecastarbiter.metrics.probabilistic.continuous_ranked_probability_score(obs, fx, fx_prob)

Continuous Ranked Probability Score (CRPS).

\[\text{CRPS} = \frac{1}{n} \sum_{i=1}^n \int_{-\infty}^{\infty} (F_i(x) - \mathbf{1} \{x \geq y_i \})^2 dx\]

where \(F_i(x)\) is the CDF of the forecast at time \(i\), \(y_i\) is the observation at time \(i\), and \(\mathbf{1}\) is the indicator function that transforms the observation into a step function (1 if \(x \geq y_i\), 0 if \(x < y_i\)). In other words, the CRPS measures the integrated squared difference between the forecast CDF and the step-function (empirical) CDF of the observation, averaged over all \(n\) times. The CRPS has the same units as the observation. Lower CRPS values indicate more accurate forecasts, and a CRPS of 0 indicates a perfect forecast. [1] [2] [3]
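
As a sanity check on the definition (a standard special case, not part of the original docstring), consider a single deterministic forecast \(f\), whose CDF is the step function \(F(x) = \mathbf{1} \{x \geq f \}\). The integrand then equals 1 between \(f\) and \(y\) and 0 elsewhere, so the score reduces to the absolute error:

\[\text{CRPS} = \int_{-\infty}^{\infty} (\mathbf{1} \{x \geq f \} - \mathbf{1} \{x \geq y \})^2 dx = |f - y|\]

This agrees with the CRPS carrying the units of the observation and vanishing when the forecast and the observation coincide.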

Parameters:
obs ((n,) array_like) – Observations (physical units).
fx ((n, d) array_like) – Forecasts (physical units) of the right-hand side of a CDF with d intervals (d >= 2), e.g., fx = [10 MW, 20 MW, 30 MW] is interpreted as <= 10 MW, <= 20 MW, <= 30 MW.
fx_prob ((n, d) array_like) – Probability [%] associated with the forecasts, i.e., the CDF value at the corresponding entry of fx.
Returns:	crps (float) – The Continuous Ranked Probability Score, with the same units as the observation.
Raises:	`ValueError` – If the forecasts have incorrect dimensions; either a) the forecasts are for a single sample (n=1) with d CDF intervals but are given as a 1D array with d values or b) the forecasts are given as 2D arrays (n,d) but do not contain at least 2 CDF intervals (i.e. d < 2).
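
The shapes described above can be illustrated with a short usage sketch. The data values are made up for illustration, and d = 3 is used only to mirror the example in the parameter description. The call uses the signature documented on this page, and the import is guarded so the snippet still runs where the package is not installed.

```python
import numpy as np

# Illustrative data (made up): n = 2 samples, d = 3 CDF thresholds in MW.
obs = np.array([12.0, 27.0])                  # shape (n,)
fx = np.array([[10.0, 20.0, 30.0],
               [10.0, 20.0, 30.0]])           # shape (n, d)
fx_prob = np.array([[20.0, 70.0, 100.0],
                    [5.0, 40.0, 100.0]])      # shape (n, d), in percent

# A single sample (n = 1) must still be 2D: reshape (d,) to (1, d),
# otherwise the documented ValueError applies.
obs_single = np.array([12.0])
fx_single = np.array([10.0, 20.0, 30.0])[np.newaxis, :]
fx_prob_single = np.array([20.0, 70.0, 100.0])[np.newaxis, :]

try:
    from solarforecastarbiter.metrics import probabilistic
except ImportError:
    probabilistic = None  # package not installed; arrays above still show the layout

if probabilistic is not None:
    crps = probabilistic.continuous_ranked_probability_score(obs, fx, fx_prob)
    print(f"CRPS = {crps:.3f} MW")
```

Passing `fx_single` without the added axis, i.e. as a 1D array of d values, is case (a) of the `ValueError` described above.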

Notes

The CRPS can be calculated analytically when the forecast CDF is of a continuous parametric distribution, e.g., Gaussian distribution. However, since the Solar Forecast Arbiter makes no assumptions regarding how a probabilistic forecast was generated, the CRPS is instead calculated using numerical integration of the discretized forecast CDF. Therefore, the accuracy of the CRPS calculation is limited by the precision of the forecast CDF. In practice, this means the forecast CDF should 1) consist of at least 10 intervals and 2) cover probabilities from 0% to 100%.
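
The numerical integration described in this note can be sketched as follows. This is a minimal illustration, not the library's source: `crps_sketch` is a hypothetical helper, it uses the trapezoidal rule as one possible integration scheme, and it assumes each observation lies within the range of its forecast thresholds and that the CDF runs from 0% to 100%.

```python
import numpy as np


def crps_sketch(obs, fx, fx_prob):
    """Hypothetical helper: CRPS via trapezoidal integration of a discretized CDF.

    Assumes fx[:, 0] <= obs <= fx[:, -1] and fx_prob spanning 0% to 100%.
    """
    obs = np.asarray(obs, dtype=float)
    fx = np.asarray(fx, dtype=float)
    f = np.asarray(fx_prob, dtype=float) / 100.0  # percent -> fraction

    if fx.ndim != 2 or fx.shape[1] < 2:
        raise ValueError("fx must have shape (n, d) with d >= 2")

    # Indicator step function: 1 where the threshold >= observation, else 0.
    step = (fx >= obs[:, np.newaxis]).astype(float)

    # Squared difference between forecast CDF and observation step function.
    integrand = (f - step) ** 2

    # Trapezoidal rule along the threshold axis, one integral per sample.
    widths = np.diff(fx, axis=1)
    heights = 0.5 * (integrand[:, 1:] + integrand[:, :-1])
    per_sample = np.sum(widths * heights, axis=1)

    # Average over the n samples.
    return float(np.mean(per_sample))


# Check against a case with a closed form: a uniform forecast CDF on [0, 1]
# and an observation of 0.5 gives an exact CRPS of 1/12.
x = np.linspace(0.0, 1.0, 11)
fx = np.tile(x, (2, 1))
fx_prob = np.tile(100.0 * x, (2, 1))
obs = np.array([0.5, 0.5])
approx = crps_sketch(obs, fx, fx_prob)
print(approx, 1.0 / 12.0)
```

With only 11 thresholds the numerical value differs slightly from the exact 1/12, which illustrates why the accuracy depends on how finely the forecast CDF is discretized.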

References

[1]	Matheson and Winkler (1976) “Scoring rules for continuous probability distributions.” Management Science, vol. 22, pp. 1087-1096. doi: 10.1287/mnsc.22.10.1087

[2]	Hersbach (2000) “Decomposition of the continuous ranked probability score for ensemble prediction systems.” Weather and Forecasting, vol. 15, pp. 559-570. doi: 10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2

[3]	Wilks (2019) “Statistical Methods in the Atmospheric Sciences”, 4th ed. Oxford; Waltham, MA; Academic Press.