solarforecastarbiter.metrics.preprocessing.process_forecast_observations

solarforecastarbiter.metrics.preprocessing.process_forecast_observations(forecast_observations, filters, forecast_fill_method, start, end, data, timezone, costs=(), outages=())

Convert ForecastObservations into ProcessedForecastObservations, applying any filters and resampling to align each forecast with its observation.

Parameters:
  • forecast_observations (list of solarforecastarbiter.datamodel.ForecastObservation, solarforecastarbiter.datamodel.ForecastAggregate) – Pairs to process.
  • filters (list of solarforecastarbiter.datamodel.BaseFilter) – Filters to apply to each pair.
  • forecast_fill_method (str) – Method for handling missing forecast values. Currently supports ‘drop’, ‘forward’, or a bool or numeric value.
  • start (pandas.Timestamp) – Start date and time for assessing forecast performance.
  • end (pandas.Timestamp) – End date and time for assessing forecast performance.
  • data (dict) – Dict with keys that are the Forecast/Observation/Aggregate object and values that are the corresponding pandas.Series/DataFrame for the object. Keys must also include all Forecast objects assigned to the reference_forecast attributes of the forecast_observations.
  • timezone (str) – Timezone that data should be converted to.
  • costs (tuple of solarforecastarbiter.datamodel.Cost) – Costs that are referenced by any pairs. Pairs and costs are matched by the Cost name.
  • outages (tuple of solarforecastarbiter.datamodel.TimePeriod) – Tuple of time periods during which forecast submissions will be excluded from analysis.
Returns:

tuple of ProcessedForecastObservation
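
The forecast_fill_method options listed under Parameters can be illustrated with a minimal sketch. This is not the library's implementation: fill_forecast is a hypothetical helper, plain dicts stand in for pandas.Series, and the handling of a gap before the first valid point under ‘forward’ is an assumption (here it is left missing).

```python
from datetime import datetime, timedelta


def fill_forecast(values, expected_times, method):
    """Fill missing forecast points (hypothetical helper, not library code).

    values: dict mapping datetime -> float for the points that exist.
    expected_times: every timestamp the forecast should cover, in order.
    method: 'drop', 'forward', or a bool/numeric fill value.
    """
    if method == "drop":
        # Keep only the points that are actually present.
        return {t: values[t] for t in expected_times if t in values}
    filled = {}
    last = None
    for t in expected_times:
        if t in values:
            last = values[t]
            filled[t] = last
        elif method == "forward":
            # Assumption: a gap before the first valid point stays None.
            filled[t] = last
        else:
            # Any bool or numeric value is used as a constant fill.
            filled[t] = float(method)
    return filled


start = datetime(2020, 1, 1, 12)
times = [start + timedelta(hours=h) for h in range(4)]
fx = {times[0]: 10.0, times[2]: 30.0}  # points at 13:00 and 15:00 missing

dropped = fill_forecast(fx, times, "drop")     # two points remain
forward = fill_forecast(fx, times, "forward")  # gaps repeat the last value
constant = fill_forecast(fx, times, 0)         # gaps become 0.0
```

With ‘drop’ the result is shorter than the expected index, so later alignment discards the matching observation times; the other two options keep every expected timestamp.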

Notes

When the interval_label of the observation and the forecast do not match, this function currently returns a ProcessedForecastObservation whose interval_label is the same as the forecast's, regardless of whether the interval_length of the forecast and observation are the same or different.

The processing logic is as follows. For each (forecast, observation) pair in forecast_observations:

  1. Fill missing forecast data points according to forecast_fill_method.
  2. Remove any forecast points associated with an outage.
  3. Fill missing reference forecast data points according to forecast_fill_method.
  4. Remove any reference forecast or observation points associated with an outage.
  5. Remove observation data points with quality_flag in filters. The remaining observation series is discontinuous.
  6. Resample observations to match forecast intervals. If at least 10% of the observation intervals within a forecast interval are valid (not missing or matching filters), the interval value is computed from all subintervals. Otherwise the resampled observation is NaN.
  7. Drop NaN observation values.
  8. Align observations to match forecast times. Observation times for which there is no matching forecast time are dropped on a forecast-by-forecast basis.
  9. Create ProcessedForecastObservation with resampled, aligned data and metadata.
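
Steps 4 through 8 above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's code: resample_observations and align are hypothetical helpers, dicts stand in for pandas.Series, quality flags are treated as simple labels, intervals are beginning-labelled, outage bounds are inclusive, and the interval value is taken as the mean of the valid points.

```python
from datetime import datetime, timedelta


def resample_observations(obs, fx_times, fx_length, obs_length, bad_flags,
                          outages=(), min_valid_fraction=0.1):
    """Average observations onto forecast intervals (hypothetical helper).

    obs: dict mapping datetime -> (value, quality_flag).
    fx_times: start time of each forecast interval.
    outages: (start, end) pairs; observations inside them are removed.
    Intervals with too few valid points are omitted, which combines the
    "set to NaN" and "drop NaN" steps.
    """
    expected = fx_length // obs_length  # observation points per interval
    resampled = {}
    for fx_t in fx_times:
        valid = []
        for t, (value, flag) in obs.items():
            if not (fx_t <= t < fx_t + fx_length):
                continue  # outside this forecast interval
            if any(o_start <= t <= o_end for o_start, o_end in outages):
                continue  # removed by an outage
            if flag in bad_flags:
                continue  # removed by a quality filter
            valid.append(value)
        if valid and len(valid) / expected >= min_valid_fraction:
            resampled[fx_t] = sum(valid) / len(valid)
    return resampled


def align(fx, obs_resampled):
    """Keep only the times present in both the forecast and observations."""
    common = sorted(set(fx) & set(obs_resampled))
    return [(t, fx[t], obs_resampled[t]) for t in common]


hour = timedelta(hours=1)
five_min = timedelta(minutes=5)
t0 = datetime(2020, 6, 1, 12)

# 12:00 hour: all 12 points valid. 13:00 hour: only 1 of 12 valid (under 10%).
# The flag labels "ok" and "nighttime" are illustrative only.
obs = {t0 + i * five_min: (100.0, "ok") for i in range(12)}
obs.update({t0 + hour + i * five_min: (50.0, "ok" if i == 0 else "nighttime")
            for i in range(12)})

fx = {t0: 90.0, t0 + hour: 40.0, t0 + 2 * hour: 10.0}
resampled = resample_observations(obs, list(fx), hour, five_min,
                                  bad_flags={"nighttime"})
pairs = align(fx, resampled)  # only the 12:00 interval survives
```

In this example the 13:00 interval is discarded because 1 valid point out of 12 falls below the 10% threshold, and the 14:00 forecast is dropped during alignment because it has no observations at all.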