# Calculating Costs¶

## Overview¶

The Solar Forecast Arbiter includes functionality to calculate the cost of forecast errors. Error in this context refers to the deviation of forecasted values from observed values (possible including a deadband) and not a specific error metric e.g. MBE, RMSE. This page explains the motivation for and structure of the cost calculation functionality.

Basic costs can be specified as a `constant`

cost per unit error, a cost per unit error that varies by
`time of day`

, or a cost per per unit error
that varies by `date-time`

. Additionally, an
`error band`

cost that specifies one of the
aforementioned basic costs depending on the size of the error is
implemented. This banded cost allows one to specify a cost similar to
charges from transmission generator imbalance service as described in
FERC Order 890-B.
Examples are provided below.

Most cost models allow the specification of an aggregation and net
parameters. The aggregation parameter controls how the cost for each
error value in the timeseries are aggregated (e.g. summed or averaged) into a single cost
number. The net parameter is a boolean that indicates if the
aggregation should keep the sign of the error, or take the absolute
value of the error before aggregating. Note that when ```
net ==
True
```

and the cost per unit error is positive, it is possible to
calculate a final cost that is negative.

## Basic Cost Models¶

### Constant¶

The constant cost model parameters are defined by
`ConstantCost`

and implemented for deterministic forecasts
by `solarforecastarbiter.metrics.deterministic.constant_cost()`

.
This model expects a cost parameter with units of $ per unit error.
Thus, if one were comparing an AC power forecasts to observations, this
cost would be assumed to have units of $/MW. One could scale this cost
based on the forecast interval length to mimic a cost per MWh.

`ConstantCost`

also expects aggregation and net
parameters. aggregation defines how costs are aggregated for a given
analysis time period (as specified in the report). Options include
sum and mean. The net parameter is a boolean that specifies
whether or not the the sum/mean should be taken without (True) or with
(False) the absolute value of the error.

An example of a cost that would take the mean of error values after taking the absolute value and applying a cost of $2.5/unit error is

```
from solarforecastarbiter import datamodel
cost_model = solarforecastarbiter.datamodel.ConstantCost(
cost=2.5,
aggregation='mean',
net=False
)
```

### Time of Day¶

The time-of-day cost model parameters are defined using
`TimeOfDayCost`

and implemented for deterministic forecasts
by
`solarforecastarbiter.metrics.deterministic.time_of_day_cost()`

.
Similar to the constant cost, the datamodel expects aggregation and
net parameters. In this case, cost is an iterable of cost values
that are paired with each time given by times. The fill parameter
specifies how the costs should be extended to times that are not
included in times. Options for fill include ‘forward’ and
‘backward’. This filling parameter also controls how values “wrap
around” midnight. For example, for a cost describing different costs
depending on a evening peak,

```
import datetime
from solarforecastarbiter import datamodel
cost_model = datamodel.TimeOfDayCost(
cost=[3.3, 1.2],
times=[datetime.time(hour=15), datetime.time(hour=20)],
net=True,
aggregation='sum',
fill='forward',
)
```

the value of $3.3 / unit error applies from 15:00 to just before
20:00, and the value of $1.2 / unit error applies for all other times
in the day *except* 15:00 to 20:00. The timezone parameter defines
the timezone the times are referenced in. If timezone is None,
times is assumed to be in same timezone as the errors.

### Date-time Cost¶

The date-time cost model is defined using `DatetimeCost`

and implemented for deterministic forecasts by
`solarforecastarbiter.metrics.deterministic.datetime_cost()`

. Similar
to the time of day cost, the datamodel expects aggregation, net,
and fill parameters. In this case cost values are associated with
each date-time specified in datetimes. The timezone parameter
defines the timezone if datetimes are not localized, and if
timezone is None, the timezone of the errors is used.

The minimum/maximum bounds of datetimes should cover the range of date-times that one wants to evaluate. For example, when evaluating the cost defined by

```
import datetime
from solarforecastarbiter import datamodel
cost_model = datamodel.DatetimeCost(
cost=[1.3, 1.9, 0.9, 2.0],
times=[datetime.datetime(2020, 5, 1, 12, 0),
datetime.datetime(2020, 5, 2, 12, 0),
datetime.datetime(2020, 5, 3, 12, 0),
datetime.datetime(2020, 5, 4, 12, 0)],
net=True,
aggregation='sum',
fill='forward',
timezone='UTC'
)
```

errors in the timeseries before 2020-05-01T12:00 are not included in the final calculation.

## Error Band Cost¶

The error band cost model is defined using `ErrorBandCost`

and implemented for deterministic forecasts by
`solarforecastarbiter.metrics.deterministic.error_band_cost()`

.
Each of bands is a `CostBand`

that describes the range of
errors the band applies to and one of the cost models
above. For example,

```
import datetime
from solarforecastarbiter import datamodel
cost_model = datamodel.ErrorBandCost(
bands=[
datamodel.CostBand(
error_range=(-5.0, 20.5),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=33.0,
net=True,
aggregation='sum'
)
),
datamodel.CostBand(
error_range=(20.5, float('inf')),
cost_function='timeofday'
cost_function_parameters=datamodel.TimeOfDayCost(
cost=[3.3, 1.2],
times=[datetime.time(hour=15), datetime.time(hour=20)],
net=True,
aggregation='sum',
fill='forward'
)
)
]
)
```

defines a cost that will apply a constant cost of $33.0 / unit error for all errors in the range [-5.0, 20.5]. For errors > 20.5, the time of day cost applies. The errors within each band are aggregated according to the aggregation and net parameter of the band parameters, but the total cost is the sum of all error bands.

Band error ranges are evaluated in the order specified and any errors
outside the list of ranges *are not evaluated*. Thus, for the model
described by

```
from solarforecastarbiter import datamodel
cost_model = datamodel.ErrorBandCost(
bands=[
datamodel.CostBand(
error_range=(-5.0, 5.0),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=2.0,
net=True,
aggregation='mean'
)
),
datamodel.CostBand(
error_range=(-10.0, 10.0),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=4.0,
net=True,
aggregation='sum'
)
)
]
)
```

errors in the range [-5, 5] have cost of $2.0 / unit error. Errors that are outside [-5, 5] but within [-10, 10], that is errors in the range [-10, 5) or (5, 10] have a cost of $4.0 / unit error. Errors outside the range of [-10, 10] are not evaluated at all and have an effective cost of $0 / unit error. Therefore, most use cases should specify -Inf and Inf in the error ranges to ensure all errors have some cost assigned to them.

The above model is equivalent to

```
from solarforecastarbiter import datamodel
cost_model = datamodel.ErrorBandCost(
bands=[
datamodel.CostBand(
error_range=(-5.0, 5.0),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=2.0,
net=True,
aggregation='mean'
)
),
datamodel.CostBand(
error_range=(-10.0, 5.0),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=4.0,
net=True,
aggregation='sum'
)
),
datamodel.CostBand(
error_range=(5.0, 10.0),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=4.0,
net=True,
aggregation='sum'
)
)
]
)
```

It is especially important to consider the sign of the cost parameter and the value of net when using the error band cost. For example,

```
from solarforecastarbiter import datamodel
cost_model = datamodel.ErrorBandCost(
bands=[
datamodel.CostBand(
error_range=(float('-inf'), 0),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=2.0,
net=True,
aggregation='sum'
)
),
datamodel.CostBand(
error_range=(0, float(inf)),
cost_function='constant'
cost_function_parameters=datamodel.ConstantCost(
cost=0,
net=True,
aggregation='sum'
)
)
]
)
```

will always result in a negative (or 0) cost because the net parameter of the first error band is True (so no absolute value is taken) and the cost factor 2.0 will therefore multiply negative values that are summed. This model is consistent with a contract where a generator is paid some additional amount if it overproduces and is not penalized for underproducing. A negative cost value in the first error band in this case would penalize the producer for overproducing compared to the forecast.

Finally, to implement a cost similar to charges from transmission generator imbalance service as described in FERC Order 890-B, one might define a cost model like

```
import datetime
from solarforecastarbiter import datamodel
cost_model = datamodel.ErrorBandCost(
bands=[
datamodel.CostBand(
error_range=(-2, 2),
cost_function='constant',
cost_function_parameters=datamodel.ConstantCost(
cost=1.0,
net=True,
aggregation='sum'
)
),
datamodel.CostBand(
error_range=(float('-inf'), -2),
cost_function='timeofday'
cost_function_parameters=datamodel.TimeOfDayCost(
cost=[5.1, 0.3], # decremental cost
times=[datetime.time(16, 0), datetime.time(19, 0)],
net=False,
aggregation='sum',
fill='forward'
)
),
datamodel.CostBand(
error_range=(2, float('inf')),
cost_function='timeofday'
cost_function_parameters=datamodel.TimeOfDayCost(
cost=[7.1, 1.4], # incremental cost
times=[datetime.time(16, 0), datetime.time(19, 0)],
net=False,
aggregation='sum',
fill='forward'
)
)
]
)
```

If this cost model is used to evaluate an hourly, mean AC power forecast, errors between \(\pm 2\) MW are netted over the evaluation time period and assigned a value of $1 / MWh error. For overproduction errors over 2 MW, a decremental cost is charged/refunded based on a time of day cost. Underproduction errors over 2 MW are charged an incremental cost depending on the time of the infraction. Therefore, the total cost over the evaluation time period is the net cost of errors within \(\pm 2\) MW plus the cost of each error over \(\pm 2\) MW charged at the time the error occured and summed over the evaluation time period.