solarforecastarbiter.utils.compute_aggregate

solarforecastarbiter.utils.compute_aggregate(data, interval_length, interval_label, timezone, agg_func, aggregate_observations, new_index=None)

Computes an aggregate of the data according to agg_func. This function assumes that the data has an interval_value_type of interval_mean or instantaneous and that the interval_length of the data is less than or equal to the interval_length of the aggregate. NaNs in the output are the result of missing data from an underlying observation of the aggregate.

Parameters:
  • data (dict of pandas.DataFrames) – Keyed by ‘observation_id’, with each key corresponding to an observation in aggregate_observations. DataFrames must have ‘value’ and ‘quality_flag’ columns.
  • interval_length (str or pandas.Timedelta) – The time between timesteps in the aggregate result.
  • interval_label (str) – Whether the timestamps in the aggregated output represent the beginning or ending of the interval.
  • timezone (str) – The IANA timezone for the output index.
  • agg_func (str) – The aggregation function (e.g. ‘sum’, ‘mean’, ‘min’) used to create the aggregate.
  • aggregate_observations (tuple of dicts) – Each dict should have ‘observation_id’ (string), ‘effective_from’ (timestamp), ‘effective_until’ (timestamp or None), and ‘observation_deleted_at’ (timestamp or None) fields.
  • new_index (pandas.DatetimeIndex) – The index to resample data to. Will attempt to infer an index if not provided.
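
The parameter shapes described above can be sketched as follows. This is only an illustration of the documented input structure: the observation IDs ("obs-a", "obs-b"), the timestamps, the values, and the timezone-aware 15-minute index are all assumptions made for the example, and the call itself is left commented out so the snippet runs without the package installed.

```python
import pandas as pd

# Hypothetical 15-minute input data covering one hour.
index = pd.date_range(
    "2020-01-01T00:15", periods=4, freq=pd.Timedelta("15min"), tz="UTC"
)

# data: one DataFrame per observation_id, each with 'value' and
# 'quality_flag' columns, as the parameter description requires.
data = {
    "obs-a": pd.DataFrame(
        {"value": [1.0, 2.0, 3.0, 4.0], "quality_flag": [0, 0, 0, 0]},
        index=index,
    ),
    "obs-b": pd.DataFrame(
        {"value": [5.0, 6.0, 7.0, 8.0], "quality_flag": [0, 2, 0, 0]},
        index=index,
    ),
}

# aggregate_observations: a tuple of dicts with the four documented fields.
aggregate_observations = tuple(
    {
        "observation_id": obs_id,
        "effective_from": pd.Timestamp("2020-01-01T00:00Z"),
        "effective_until": None,
        "observation_deleted_at": None,
    }
    for obs_id in data
)

interval_length = pd.Timedelta("1h")  # a str such as "1h" is also documented
interval_label = "ending"
timezone = "UTC"
agg_func = "mean"

# With the package installed, the documented call would look like:
# from solarforecastarbiter.utils import compute_aggregate
# result = compute_aggregate(
#     data, interval_length, interval_label, timezone, agg_func,
#     aggregate_observations,
# )
```

Every key in data matches an ‘observation_id’ in aggregate_observations, which is the condition the KeyError case under Raises guards against.
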
Returns:

pandas.DataFrame

  • Index is a DatetimeIndex that adheres to interval_length and interval_label.
  • Columns are ‘value’, the aggregated value according to agg_func, and ‘quality_flag’, the bitwise OR of all flags in the aggregate for the interval.
  • A ‘value’ of NaN means that data from one or more observations was missing in that interval.
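
The output semantics listed above (NaN when any observation is missing, and a bitwise OR of the quality flags) can be illustrated with plain pandas. This is a simplified sketch, not the library's implementation: it assumes two hypothetical observations that are already aligned on the output index, uses ‘mean’ as the aggregation function, and skips the resampling step entirely.

```python
import numpy as np
import pandas as pd

index = pd.date_range(
    "2020-01-01T01:00", periods=3, freq=pd.Timedelta("1h"), tz="UTC"
)

# Two hypothetical observations; obs_b has no value for the second interval.
obs_a = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0], "quality_flag": [0, 1, 0]}, index=index
)
obs_b = pd.DataFrame(
    {"value": [3.0, np.nan, 5.0], "quality_flag": [2, 0, 4]}, index=index
)

values = pd.concat([obs_a["value"], obs_b["value"]], axis=1)
flags = pd.concat([obs_a["quality_flag"], obs_b["quality_flag"]], axis=1)

# skipna=False: one missing observation makes the whole interval NaN.
agg_value = values.mean(axis=1, skipna=False)

# Bitwise OR of the flags across observations for each interval.
agg_flag = np.bitwise_or.reduce(flags.to_numpy(dtype=int), axis=1)

result = pd.DataFrame(
    {"value": agg_value, "quality_flag": agg_flag}, index=index
)
```

Here the second interval has a NaN ‘value’ because obs_b is missing there, while each ‘quality_flag’ combines the bits set by either observation.
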

Raises:
  • KeyError – If data is missing a key for an observation in aggregate_observations
    • Or, if any DataFrames in data do not have ‘value’ or ‘quality_flag’ columns
  • ValueError – If interval_length is not a divisor of one day and an index is not provided.
    • Or, if an observation has been deleted but the data is required for the aggregate
    • Or, if interval_label is not ‘beginning’ or ‘ending’
    • Or, if data is empty and an index is provided.
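
Two of the ValueError conditions above can be checked before calling the function. The helper below, check_aggregate_args, is hypothetical and not part of solarforecastarbiter; it is a minimal sketch of how "a divisor of one day" and the allowed interval_label values might be tested, assuming interval_length is anything pandas.Timedelta accepts.

```python
import pandas as pd


def check_aggregate_args(interval_length, interval_label, new_index=None):
    """Hypothetical pre-flight check for two documented ValueError cases."""
    if interval_label not in ("beginning", "ending"):
        raise ValueError("interval_label must be 'beginning' or 'ending'")

    length = pd.Timedelta(interval_length)
    # Only relevant when no index is supplied and one must be inferred.
    if new_index is None and pd.Timedelta("1D") % length != pd.Timedelta(0):
        raise ValueError(
            "interval_length must divide one day evenly when new_index is None"
        )
    return length


# One hour divides a day evenly, so this passes.
check_aggregate_args("1h", "ending")

# Seven hours does not, so without an index this raises ValueError.
try:
    check_aggregate_args("7h", "ending")
    seven_hours_rejected = False
except ValueError:
    seven_hours_rejected = True
```

Supplying new_index skips the divisor check in this sketch, mirroring the documented condition that the error applies only when an index is not provided.
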