solarforecastarbiter.metrics.preprocessing.process_forecast_observations
solarforecastarbiter.metrics.preprocessing.process_forecast_observations(forecast_observations, filters, forecast_fill_method, start, end, data, timezone, costs=(), outages=())
Convert ForecastObservations into ProcessedForecastObservations, applying any filters and resampling to align forecast and observation.
Parameters:
- forecast_observations (list of solarforecastarbiter.datamodel.ForecastObservation, solarforecastarbiter.datamodel.ForecastAggregate) – Pairs to process.
- filters (list of solarforecastarbiter.datamodel.BaseFilter) – Filters to apply to each pair.
- forecast_fill_method (str) – Indicates what process to use for handling missing forecasts. Currently supports ‘drop’, ‘forward’, and a bool or numeric value.
- start (pandas.Timestamp) – Start date and time for assessing forecast performance.
- end (pandas.Timestamp) – End date and time for assessing forecast performance.
- data (dict) – Dict with keys that are the Forecast/Observation/Aggregate object and values that are the corresponding pandas.Series/DataFrame for the object. Keys must also include all Forecast objects assigned to the reference_forecast attributes of the forecast_observations.
- timezone (str) – Timezone that data should be converted to.
- costs (tuple of solarforecastarbiter.datamodel.Cost) – Costs that are referenced by any pairs. Pairs and costs are matched by the Cost name.
- outages (tuple of solarforecastarbiter.datamodel.TimePeriod) – Tuple of time periods during which forecast submissions will be excluded from analysis.
Returns: tuple of ProcessedForecastObservation
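The forecast_fill_method options listed under Parameters can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the helper name fill_forecast is hypothetical, and the meaning given to each option here (drop missing points, carry the last value forward, or substitute a constant) is an assumption based only on the option names.

```python
import numpy as np
import pandas as pd


def fill_forecast(fx, forecast_fill_method):
    """Hypothetical helper: fill or drop missing forecast values.

    Assumed semantics (not confirmed by the library docs):
    'drop' removes NaN points, 'forward' carries the last valid
    value forward, anything else is used as a constant fill value.
    """
    if forecast_fill_method == "drop":
        return fx.dropna()
    if forecast_fill_method == "forward":
        return fx.ffill()
    # bool or numeric value: use it as a constant replacement
    return fx.fillna(float(forecast_fill_method))


index = pd.date_range("2020-01-01 00:00", periods=4, freq="1h", tz="UTC")
fx = pd.Series([1.0, np.nan, 3.0, np.nan], index=index)

dropped = fill_forecast(fx, "drop")        # two points remain
forwarded = fill_forecast(fx, "forward")   # gaps take the previous value
constant = fill_forecast(fx, 0)            # gaps become 0.0
```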
Notes
In the case where the interval_label of the observation (obs) and forecast (fx) do not match, this function currently returns a ProcessedForecastObservation object with an interval_label the same as the fx, regardless of whether the interval_length of the fx and obs are the same or different.
The processing logic is as follows. For each forecast, observation pair in forecast_observations:
- Fill missing forecast data points according to forecast_fill_method.
- Remove any forecast points associated with an outage.
- Fill missing reference forecast data points according to forecast_fill_method.
- Remove any reference forecast or observation points associated with an outage.
- Remove observation data points with quality_flag in filters. The remaining observation series is discontinuous.
- Resample observations to match forecast intervals. If at least 10% of the observation intervals within a forecast interval are valid (not missing or matching filters), the interval value is computed from all subintervals. Otherwise the resampled observation is NaN.
- Drop NaN observation values.
- Align observations to match forecast times. Observation times for which there is not a matching forecast time are dropped on a forecast by forecast basis.
- Create ProcessedForecastObservation with resampled, aligned data and metadata.
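The resample, drop, and align steps above can be sketched with plain pandas. This is an illustration under stated assumptions, not the library's code: the helper resample_observation and its parameters are hypothetical, the interval value is taken to be the mean of the valid subintervals, and both series are assumed to use beginning-labelled intervals.

```python
import numpy as np
import pandas as pd


def resample_observation(obs, fx_interval, obs_interval, min_valid_fraction=0.1):
    """Hypothetical helper: resample obs onto forecast intervals.

    An interval keeps its mean only if at least min_valid_fraction
    of its expected subintervals are valid (non-NaN).
    """
    expected = pd.Timedelta(fx_interval) / pd.Timedelta(obs_interval)
    grouped = obs.resample(fx_interval, closed="left", label="left")
    valid_fraction = grouped.count() / expected
    # where() replaces intervals below the threshold with NaN
    return grouped.mean().where(valid_fraction >= min_valid_fraction)


# Three hours of 5-minute observations (12 subintervals per hour).
obs_index = pd.date_range("2020-01-01 00:00", periods=36, freq="5min", tz="UTC")
values = np.full(36, np.nan)
values[0:12] = 1.0               # hour 0: all 12 valid
values[12], values[13] = 2.0, 4.0  # hour 1: 2 of 12 valid (about 17%)
values[24] = 5.0                 # hour 2: 1 of 12 valid (about 8%)
obs = pd.Series(values, index=obs_index)

resampled = resample_observation(obs, "1h", "5min")

# Drop NaN observation values.
obs_valid = resampled.dropna()

# Align: keep only times present in both the forecast and the observation.
fx_index = pd.date_range("2020-01-01 01:00", periods=3, freq="1h", tz="UTC")
fx = pd.Series([2.5, 5.5, 7.0], index=fx_index)
common = obs_valid.index.intersection(fx.index)
aligned_obs = obs_valid.loc[common]
aligned_fx = fx.loc[common]
```

In this example the third hour falls below the 10% threshold and is discarded, and only the 01:00 interval survives alignment because it is the only time with both a forecast and a valid resampled observation.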