solarforecastarbiter.metrics.preprocessing.process_forecast_observations

solarforecastarbiter.metrics.preprocessing.process_forecast_observations(forecast_observations, filters, forecast_fill_method, start, end, data, timezone, costs=(), outages=())

Convert ForecastObservations into ProcessedForecastObservations, applying any filters and resampling to align forecast and observation.
Parameters:
 forecast_observations (list of solarforecastarbiter.datamodel.ForecastObservation, solarforecastarbiter.datamodel.ForecastAggregate) – Pairs to process.
 filters (list of solarforecastarbiter.datamodel.BaseFilter) – Filters to apply to each pair.
 forecast_fill_method (str) – Indicates what process to use for handling missing forecasts. Currently supports ‘drop’, ‘forward’, and a bool or numeric value.
 start (pandas.Timestamp) – Start date and time for assessing forecast performance.
 end (pandas.Timestamp) – End date and time for assessing forecast performance.
 data (dict) – Dict with keys that are the Forecast/Observation/Aggregate object and values that are the corresponding pandas.Series/DataFrame for the object. Keys must also include all Forecast objects assigned to the reference_forecast attributes of the forecast_observations.
 timezone (str) – Timezone that data should be converted to.
 costs (tuple of solarforecastarbiter.datamodel.Cost) – Costs that are referenced by any pairs. Pairs and costs are matched by the Cost name.
 outages (tuple of solarforecastarbiter.datamodel.TimePeriod) – Tuple of time periods during which forecast submissions will be excluded from analysis.
Returns: tuple of ProcessedForecastObservation
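The three kinds of forecast_fill_method value can be illustrated with plain pandas. This is a minimal sketch and not the library's implementation: the helper name fill_forecast is hypothetical, and the exact behavior of ‘forward’ (for example, how leading gaps are handled) is an assumption here.

```python
import numpy as np
import pandas as pd


def fill_forecast(fx, method):
    """Hypothetical helper illustrating the documented fill options."""
    if method == 'drop':
        # Discard missing forecast points entirely.
        return fx.dropna()
    if method == 'forward':
        # Carry the last available forecast value forward (assumed behavior).
        return fx.ffill()
    # Otherwise treat the argument as a bool or numeric fill value.
    return fx.fillna(float(method))


index = pd.date_range('2020-01-01 00:00', periods=4, freq='60min', tz='UTC')
fx = pd.Series([1.0, np.nan, 3.0, np.nan], index=index)

dropped = fill_forecast(fx, 'drop')
forward = fill_forecast(fx, 'forward')
constant = fill_forecast(fx, 0)
```

With ‘drop’ the series shrinks to the points that were actually present, whereas the other two options keep every forecast timestamp.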
Notes
In the case where the interval_label of the obs and fx do not match, this function currently returns a ProcessedForecastObservation with the same interval_label as the fx, regardless of whether the interval_length of the fx and obs are the same or different.
The processing logic is as follows. For each forecast, observation pair in forecast_observations:
 1. Fill missing forecast data points according to forecast_fill_method.
 2. Remove any forecast points associated with an outage.
 3. Fill missing reference forecast data points according to forecast_fill_method.
 4. Remove any reference forecast or observation points associated with an outage.
 5. Remove observation data points with quality_flag in filters. The remaining observation series is discontinuous.
 6. Resample observations to match forecast intervals. If at least 10% of the observation intervals within a forecast interval are valid (not missing or matching filters), the interval value is computed from all subintervals. Otherwise the resampled observation is NaN.
 7. Drop NaN observation values.
 8. Align observations to match forecast times. Observation times for which there is not a matching forecast time are dropped on a forecast by forecast basis.
 9. Create ProcessedForecastObservation with resampled, aligned data and metadata.
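The observation filtering, resampling, NaN-dropping, and alignment steps above can be sketched with plain pandas. This is an illustration under stated assumptions, not the library's code: the column names value and flagged, the 15-minute and 60-minute interval lengths, the use of the mean as the aggregate, and beginning-labeled intervals are all made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical 15-minute observations; `flagged` marks points that match a filter.
obs_index = pd.date_range('2020-01-01 00:00', periods=12, freq='15min', tz='UTC')
obs = pd.DataFrame({
    'value': np.arange(12, dtype=float),
    'flagged': [False] * 4 + [True] * 4 + [False, True, True, True],
}, index=obs_index)

# Hypothetical 60-minute forecast covering one more hour than the observations.
fx_index = pd.date_range('2020-01-01 00:00', periods=4, freq='60min', tz='UTC')
fx = pd.Series([1.5, 5.0, 8.0, 10.0], index=fx_index)

# Remove flagged observation points; the remaining series is discontinuous.
valid = obs.loc[~obs['flagged'], 'value']

# Resample to the forecast interval. An interval is kept only if at least
# 10% of its expected subintervals (4 per hour here) are valid.
expected_per_interval = 4
resampler = valid.resample('60min')
fraction_valid = resampler.count() / expected_per_interval
obs_resampled = resampler.mean().where(fraction_valid >= 0.1)

# Drop intervals that did not meet the threshold.
obs_resampled = obs_resampled.dropna()

# Keep only the times present in both the forecast and the observations.
common = fx.index.intersection(obs_resampled.index)
fx_aligned = fx.loc[common]
obs_aligned = obs_resampled.loc[common]
```

In this example the 01:00 interval has no valid observations and is dropped, the 02:00 interval is kept because one of four subintervals (25%) is valid, and the 03:00 forecast has no matching observation, so only the 00:00 and 02:00 pairs remain.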