solarforecastarbiter.utils.compute_aggregate

solarforecastarbiter.utils.compute_aggregate(data, interval_length, interval_label, timezone, agg_func, aggregate_observations, new_index=None)

Computes an aggregate of the data according to agg_func. This function assumes that the data has an interval_value_type of interval_mean or instantaneous, and that the interval_length of the data is less than or equal to the interval_length of the aggregate. NaNs in the output are the result of missing data from an underlying observation of the aggregate.

Parameters:

data (dict of pandas.DataFrames) – Keyed by the ‘observation_id’ of each observation in aggregate_observations. Each DataFrame must have ‘value’ and ‘quality_flag’ columns.
interval_length (str or pandas.Timedelta) – The time between timesteps in the aggregate result.
interval_label (str) – Whether the timestamps in the aggregated output represent the beginning or ending of the interval (‘beginning’ or ‘ending’).
timezone (str) – The IANA timezone for the output index.
agg_func (str) – The aggregation function (e.g. ‘sum’, ‘mean’, ‘min’) used to create the aggregate.
aggregate_observations (tuple of dicts) – Each dict should have ‘observation_id’ (string), ‘effective_from’ (timestamp), ‘effective_until’ (timestamp or None), and ‘observation_deleted_at’ (timestamp or None) fields.
new_index (pandas.DatetimeIndex) – The index to resample data to. Will attempt to infer an index if not provided.
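
As a rough illustration of the argument shapes described above, the following sketch builds a data dict and a matching aggregate_observations tuple with pandas. The observation ids ("obs-a", "obs-b"), the timestamps, and the values are invented for the example, and the call itself is left as a comment because it requires solarforecastarbiter to be installed.

```python
import pandas as pd

# Five-minute data for one hour, shared by two hypothetical observations.
index = pd.date_range(
    "2020-01-01T00:00", "2020-01-01T01:00", freq="5min", tz="UTC"
)


def make_observation_frame(scale):
    """Build a DataFrame with the required 'value' and 'quality_flag' columns."""
    return pd.DataFrame(
        {
            "value": [scale * i for i in range(len(index))],
            "quality_flag": [0] * len(index),
        },
        index=index,
    )


# Keys must match the 'observation_id' entries in aggregate_observations.
data = {
    "obs-a": make_observation_frame(1.0),
    "obs-b": make_observation_frame(2.0),
}

aggregate_observations = tuple(
    {
        "observation_id": obs_id,
        "effective_from": pd.Timestamp("2020-01-01T00:00Z"),
        "effective_until": None,
        "observation_deleted_at": None,
    }
    for obs_id in data
)

kwargs = dict(
    data=data,
    interval_length="1h",
    interval_label="beginning",
    timezone="America/Denver",
    agg_func="sum",
    aggregate_observations=aggregate_observations,
)

# With the library installed, the call would look like:
# from solarforecastarbiter.utils import compute_aggregate
# result = compute_aggregate(**kwargs)
```

Here new_index is omitted, so the function would have to infer the output index from the data.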

Returns:

pandas.DataFrame –

Index is a DatetimeIndex that adheres to interval_length and interval_label.
Columns are ‘value’, the aggregated value according to agg_func, and ‘quality_flag’, the bitwise OR of all flags in the aggregate for the interval.
A ‘value’ of NaN means that data from one or more observations was missing in that interval.
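
The output semantics described above can be sketched with plain pandas and NumPy. This is not the library's implementation: it only combines two made-up observations that already share the same timestamps, using a sum that propagates NaN and a bitwise OR of the flags. The flag values are arbitrary bit patterns chosen for the example.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2020-01-01T00:00", periods=4, freq="15min", tz="UTC")

# Two hypothetical observations; obs_b is missing its second value.
obs_a = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0], "quality_flag": [0, 2, 0, 0]}, index=index
)
obs_b = pd.DataFrame(
    {"value": [10.0, np.nan, 30.0, 40.0], "quality_flag": [1, 0, 0, 4]},
    index=index,
)

values = pd.concat({"a": obs_a["value"], "b": obs_b["value"]}, axis=1)
flags = pd.concat(
    {"a": obs_a["quality_flag"], "b": obs_b["quality_flag"]}, axis=1
)

result = pd.DataFrame(
    {
        # skipna=False makes a missing input yield NaN in the aggregate.
        "value": values.sum(axis=1, skipna=False),
        # Bitwise OR across observations keeps every flag bit that was set.
        "quality_flag": np.bitwise_or.reduce(flags.to_numpy(), axis=1),
    },
    index=index,
)

print(result)
# 'value' is 11.0, NaN, 33.0, 44.0 and 'quality_flag' is 1, 2, 0, 4.
```

The second row shows both behaviours at once: the value is NaN because one observation is missing, while the flag from the other observation is still carried through.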

Raises:

KeyError – If data is missing a key for an observation in aggregate_observations
- Or, if any DataFrames in data do not have ‘value’ or ‘quality_flag’ columns
ValueError – If interval_length is not a divisor of one day and an index is not provided.
- Or, if an observation has been deleted but the data is required for the aggregate
- Or, if interval_label is not ‘beginning’ or ‘ending’
- Or, if data is empty and an index is provided.
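
To make the "divisor of one day" condition concrete, here is a minimal sketch of such a check. The helper name divides_one_day is hypothetical and the library's own validation may be written differently; the sketch only shows which interval lengths would satisfy the condition when no new_index is supplied.

```python
import pandas as pd


def divides_one_day(interval_length):
    """Return True if interval_length (str or Timedelta) evenly divides 24 hours."""
    step = pd.Timedelta(interval_length)
    if step <= pd.Timedelta(0):
        return False
    # .value is the duration in integer nanoseconds, so the modulo is exact.
    return pd.Timedelta("1D").value % step.value == 0


for candidate in ("5min", "1h", "7h", pd.Timedelta("90min")):
    print(candidate, divides_one_day(candidate))
```

Under this check, ‘5min’, ‘1h’, and 90 minutes pass (288, 24, and 16 intervals per day), while ‘7h’ does not.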