Metrics¶
Prediction Metrics Manager¶
This module implements MetricsManager, a helper class that handles pooling of multiple record and field prediction metrics calculators.
-
class nupic.frameworks.opf.prediction_metrics_manager.MetricsManager(metricSpecs, fieldInfo, inferenceType)¶
This class handles the computation of metrics. It takes in an inferenceType and assumes that it is associated with a single model.
Parameters:
- metricSpecs – (list) of MetricSpec objects that specify which metrics should be calculated.
- fieldInfo – (list) of FieldMetaInfo objects.
- inferenceType – (InferenceType) value that specifies the inference type of the associated model. This affects how metrics are calculated. For example, temporal models save the inference from the previous timestep to match it to the ground truth value in the current timestep.
-
getMetricDetails(metricLabel)¶
Gets detailed info about a given metric, in addition to its value. This may include any statistics or auxiliary data that are computed for a given metric.
Parameters: metricLabel – (string) label of the given metric (see MetricSpec)
Returns: (dict) of metric information, as returned by nupic.frameworks.opf.metrics.MetricsIface.getMetric().
-
getMetricLabels()¶
Returns: (list) of labels for the metrics that are being calculated
-
getMetrics()¶
Gets the current metric values.
Returns: (dict) where each key is the metric name and each value is its scalar value. Same as the output of update().
-
update(results)¶
Computes the new metric values, given the next inference/ground-truth values.
Parameters: results – (ModelResult) object that was computed during the last iteration of the model.
Returns: (dict) where each key is the metric name and each value is its scalar value.
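As a quick sketch of how these accessors fit together (assuming a metricsManager built as in the usage example below):

for label in metricsManager.getMetricLabels():
    # getMetrics() maps each label to its current scalar value;
    # getMetricDetails() adds implementation-defined stats for one label.
    value = metricsManager.getMetrics()[label]
    details = metricsManager.getMetricDetails(label)
    print("%s = %s (%s)" % (label, value, details))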
Interface¶
Metrics take the predicted and actual values and compute some score (lower is better), which is used in the OPF for swarming (and, more generally, as part of the output).
One non-obvious thing is that they are computed over a fixed window size, typically something like 1000 records. So each output record will have a metric score computed over the 1000 records prior.
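To make the windowing concrete, here is a minimal self-contained sketch (not NuPIC code) of a lower-is-better error score maintained over a fixed window:

from collections import deque

errorWindow = deque(maxlen=1000)  # keeps only the 1000 most recent errors

def windowedScore(groundTruth, prediction):
    errorWindow.append(abs(groundTruth - prediction))
    # Average absolute error over (at most) the last 1000 records
    return sum(errorWindow) / float(len(errorWindow))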
Example usage (hot gym example):¶
Where:
- aae: average absolute error
- altMAPE: mean absolute percentage error, modified so you never divide by zero
from nupic.frameworks.opf.metrics import MetricSpec
from nupic.frameworks.opf.prediction_metrics_manager import MetricsManager

model = createOpfModel()  # assuming this is done elsewhere

metricSpecs = (
    MetricSpec(field='kw_energy_consumption', metric='multiStep',
               inferenceElement='multiStepBestPredictions',
               params={'errorMetric': 'aae', 'window': 1000, 'steps': 1}),
    MetricSpec(field='kw_energy_consumption', metric='trivial',
               inferenceElement='prediction',
               params={'errorMetric': 'aae', 'window': 1000, 'steps': 1}),
    MetricSpec(field='kw_energy_consumption', metric='multiStep',
               inferenceElement='multiStepBestPredictions',
               params={'errorMetric': 'altMAPE', 'window': 1000, 'steps': 1}),
    MetricSpec(field='kw_energy_consumption', metric='trivial',
               inferenceElement='prediction',
               params={'errorMetric': 'altMAPE', 'window': 1000, 'steps': 1}),
)

metricsManager = MetricsManager(metricSpecs,
                                model.getFieldInfo(),
                                model.getInferenceType())

for row in inputData:  # this is just pseudocode
    result = model.run(row)
    metrics = metricsManager.update(result)
    # You can collect metrics here, or attach them to your result object.
    result.metrics = metrics
See getModule() for a mapping of available metric identifiers to their implementation classes.
-
class nupic.frameworks.opf.metrics.MetricsIface(metricSpec)¶
A Metrics module compares a prediction Y to corresponding ground truth X and returns a single measure representing the "goodness" of the prediction. It is up to the implementation to determine how this comparison is made.
Parameters: metricSpec – (MetricSpec) spec used to create the metric
-
addInstance(groundTruth, prediction, record=None, result=None)¶
Add one instance consisting of ground truth and a prediction.
Parameters:
- groundTruth – The actual measured value at the current timestep
- prediction – The value predicted by the network at the current timestep
- record – the raw input record as fed to run() by the user. The typical usage is to feed a record to that method and get a ModelResult. Then you pass ModelResult.rawInput into this function as the record parameter.
- result – (ModelResult) the result of running a row of data through an OPF model
Returns: The average error as computed over the metric's window size
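A sketch of that record-passing pattern (model, metric, inputRecord, actualValue, and predictedValue are placeholders for objects built elsewhere):

result = model.run(inputRecord)  # returns a ModelResult
metric.addInstance(groundTruth=actualValue,
                   prediction=predictedValue,
                   record=result.rawInput,
                   result=result)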
-
getMetric()¶
Returns: (dict) representing data from the metric: {"value": <current measurement>, "stats": {<stat>: <value>, ...}}. stats is expected to contain further information relevant to the given metric, for example the number of timesteps represented in the current measurement. All stats are implementation defined, and stats can be None.
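For illustration only (this class is not part of NuPIC), a minimal implementation honoring this interface might look like:

class IllustrativeAbsErrorMetric(object):
    """Sketch of a MetricsIface-style metric: running average absolute error."""

    def __init__(self, metricSpec):
        self.spec = metricSpec
        self.total = 0.0
        self.steps = 0

    def addInstance(self, groundTruth, prediction, record=None, result=None):
        self.total += abs(groundTruth - prediction)
        self.steps += 1
        return self.total / self.steps

    def getMetric(self):
        value = self.total / self.steps if self.steps else None
        return {"value": value, "stats": {"steps": self.steps}}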
-
class nupic.frameworks.opf.metrics.MetricSpec(metric, inferenceElement, field=None, params=None)¶
This class represents a single Metrics specification in the TaskControl block.
Parameters:
- metric – (string) A metric type name that identifies which metrics module is to be constructed by nupic.frameworks.opf.metrics.getModule(); e.g., rmse
- inferenceElement – (InferenceElement) Some inference types (such as classification) can output more than one type of inference (i.e. the predicted class AND the predicted next step). This field specifies which of these inferences to compute the metrics on.
- field – (string) Field name on which this metric is to be collected
- params – (dict) Custom parameters for the metrics module's constructor
-
classmethod getInferenceTypeFromLabel(label)¶
Extracts the PredictionKind (temporal vs. nontemporal) from the given metric label.
Parameters: label – (string) for a metric spec generated by getMetricLabel()
Returns: (InferenceType)
-
getLabel(inferenceType=None)¶
Helper method that generates a unique label for a MetricSpec/InferenceType pair. The label is formatted as follows:
<predictionKind>:<metric type>:(paramName=value)*:field=<fieldname>
For example:
classification:aae:paramA=10.2:paramB=20:window=100:field=pounds
Returns: (string) label for inference type
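For example (a sketch; the exact label prefix depends on the inference type supplied):

from nupic.frameworks.opf.metrics import MetricSpec

spec = MetricSpec(field='pounds', metric='aae',
                  inferenceElement='prediction',
                  params={'window': 100})
print(spec.getLabel())  # something like "...:aae:window=100:field=pounds"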
-
class nupic.frameworks.opf.metrics.CustomErrorMetric(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.MetricsIface
Custom Error Metric class that handles user defined error metrics.
-
class CircularBuffer(length)¶
Implementation of a fixed-size circular buffer with constant-time random access.
-
CustomErrorMetric.expValue(pred)¶
Helper function to return a scalar value representing the expected value of a probability distribution.
-
CustomErrorMetric.mostLikely(pred)¶
Helper function to return a scalar value representing the most likely outcome given a probability distribution.
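As a sketch of the idea behind CircularBuffer (not the actual NuPIC implementation), a fixed-size buffer with constant-time random access can be built on a preallocated list and a wrapping write index:

class IllustrativeCircularBuffer(object):
    """Sketch: fixed-size circular buffer with O(1) append and indexing."""

    def __init__(self, length):
        self._data = [None] * length
        self._length = length
        self._next = 0  # next write position

    def append(self, value):
        self._data[self._next] = value
        self._next = (self._next + 1) % self._length

    def __getitem__(self, index):
        # index 0 is the most recently appended value
        return self._data[(self._next - 1 - index) % self._length]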
-
class nupic.frameworks.opf.metrics.AggregateMetric(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.MetricsIface
Partial implementation of the Metrics interface for metrics that accumulate an error and compute an aggregate score, potentially over some window of previous data. This is a convenience class that can serve as the base class for a wide variety of metrics.
-
accumulate(groundTruth, prediction, accumulatedError, historyBuffer, result)¶
Updates the accumulated error given the prediction and the ground truth.
Parameters:
- groundTruth – Actual value that is observed for the current timestep
- prediction – Value predicted by the network for the given timestep
- accumulatedError – The total accumulated score from the previous predictions (possibly over some finite window)
- historyBuffer – A buffer of the last <self.window> ground truth values that have been observed. If historyBuffer = None, it means that no history is being kept.
- result – A ModelResult object (see opf_utils.py), used for advanced metric calculation (e.g., MetricNegativeLogLikelihood)
Returns: The new accumulated error. That is:
self.accumulatedError = self.accumulate(groundTruth, predictions, accumulatedError)
historyBuffer should also be updated in this method. self.spec.params["window"] indicates the maximum size of the window.
-
aggregate(accumulatedError, historyBuffer, steps)¶
Updates the final aggregated error score given the prediction and the ground truth.
Parameters:
- accumulatedError – The total accumulated score from the previous predictions (possibly over some finite window)
- historyBuffer – A buffer of the last <self.window> ground truth values that have been observed. If historyBuffer = None, it means that no history is being kept.
- steps – (int) The total number of (groundTruth, prediction) pairs that have been passed to the metric. This does not include pairs where groundTruth = SENTINEL_VALUE_FOR_MISSING_DATA
Returns: The new aggregate (final) error measure.
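As a schematic example (a sketch, not one of the shipped classes; here the buffer is used to hold per-step errors rather than ground truth values), an average-absolute-error metric could divide the work between the two methods like this:

from nupic.frameworks.opf.metrics import AggregateMetric

class IllustrativeAAE(AggregateMetric):
    """Sketch: average absolute error via accumulate()/aggregate()."""

    def accumulate(self, groundTruth, prediction, accumulatedError,
                   historyBuffer, result=None):
        error = abs(groundTruth - prediction)
        accumulatedError += error
        if historyBuffer is not None:
            historyBuffer.append(error)
            if len(historyBuffer) > self.spec.params["window"]:
                # Window full: retire the oldest error from the running total
                accumulatedError -= historyBuffer.popleft()
        return accumulatedError

    def aggregate(self, accumulatedError, historyBuffer, steps):
        n = len(historyBuffer) if historyBuffer is not None else steps
        return accumulatedError / float(n) if n else 0.0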
Helpers¶
-
metrics.getModule(metricSpec)¶
Factory method to return an appropriate MetricsIface module.
- rmse: MetricRMSE
- nrmse: MetricNRMSE
- aae: MetricAAE
- acc: MetricAccuracy
- avg_err: MetricAveError
- trivial: MetricTrivial
- two_gram: MetricTwoGram
- moving_mean: MetricMovingMean
- moving_mode: MetricMovingMode
- neg_auc: MetricNegAUC
- custom_error_metric: CustomErrorMetric
- multiStep: MetricMultiStep
- ms_aae: MetricMultiStepAAE
- ms_avg_err: MetricMultiStepAveError
- passThruPrediction: MetricPassThruPrediction
- altMAPE: MetricAltMAPE
- MAPE: MetricMAPE
- multi: MetricMulti
- negativeLogLikelihood: MetricNegativeLogLikelihood
Parameters: metricSpec – (MetricSpec) metric to find module for. metricSpec.metric must be in the list above.
Returns: (AggregateMetric) an appropriate metric module
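For example (a minimal sketch; the returned stats are implementation defined):

from nupic.frameworks.opf.metrics import MetricSpec, getModule

spec = MetricSpec(field='kw_energy_consumption', metric='aae',
                  inferenceElement='prediction',
                  params={'window': 1000})
aaeMetric = getModule(spec)  # returns a MetricAAE instance for this spec
aaeMetric.addInstance(groundTruth=5.0, prediction=4.0)
print(aaeMetric.getMetric())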
Available Metrics¶
-
class nupic.frameworks.opf.metrics.MetricNegativeLogLikelihood(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes negative log-likelihood. Likelihood is the predicted probability of the true data from a model. It is more powerful than metrics that only consider the single best prediction (e.g. MSE), as it considers the entire probability distribution predicted by a model.
It is more appropriate to use likelihood as the error metric when multiple predictions are possible.
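Schematically (a sketch, not the NuPIC implementation), if the model assigns probability p to the value that actually occurs, the per-step score is -log(p):

import math

def negativeLogLikelihood(predictedDist, groundTruth, tiny=1e-10):
    # predictedDist maps each candidate value to its predicted probability
    p = predictedDist.get(groundTruth, 0.0)
    return -math.log(max(p, tiny))  # clip to avoid log(0)

# A confident correct prediction scores lower (better) than a vague one:
negativeLogLikelihood({'a': 0.9, 'b': 0.1}, 'a')  # ~0.105
negativeLogLikelihood({'a': 0.5, 'b': 0.5}, 'a')  # ~0.693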
-
class nupic.frameworks.opf.metrics.MetricRMSE(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes root-mean-square error.
-
class nupic.frameworks.opf.metrics.MetricNRMSE(*args, **kwargs)¶
Bases: nupic.frameworks.opf.metrics.MetricRMSE
Computes normalized root-mean-square error.
-
class nupic.frameworks.opf.metrics.MetricAAE(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes average absolute error.
-
class nupic.frameworks.opf.metrics.MetricAltMAPE(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes the "Alternative" Mean Absolute Percent Error.
A generic MAPE computes the percent error for each sample and then takes the average. This can suffer from samples where the actual value is very small or zero: that one sample can drastically alter the mean.
This metric, on the other hand, first computes the average of the actual values and the average of the errors before dividing. This washes out the effect of a small number of samples with very small actual values.
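A small worked comparison (a sketch ignoring windowing; the shipped metrics may also scale by 100 to report a percentage) shows how one near-zero actual value dominates classic MAPE but barely moves altMAPE:

actuals = [100.0, 100.0, 0.001]
predictions = [110.0, 90.0, 1.0]

# Classic MAPE: average the per-sample percent errors
mape = sum(abs(a - p) / abs(a)
           for a, p in zip(actuals, predictions)) / len(actuals)

# altMAPE: total absolute error divided by total absolute actual value
altMape = (sum(abs(a - p) for a, p in zip(actuals, predictions))
           / sum(abs(a) for a in actuals))

print(mape)     # ~333.07 -- dominated by the 0.001 sample
print(altMape)  # ~0.105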
-
class nupic.frameworks.opf.metrics.MetricMAPE(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes the "Classic" Mean Absolute Percent Error.
This computes the percent error for each sample and then takes the average. Note that this can suffer from samples where the actual value is very small or zero: that one sample can drastically alter the mean. To avoid this potential issue, use 'altMAPE' instead.
This metric is provided mainly as a convenience when comparing results against other investigations that have also used MAPE.
-
class nupic.frameworks.opf.metrics.MetricPassThruPrediction(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.MetricsIface
This is not a metric, but rather a facility for passing the predictions generated by a baseline metric through to the prediction output cache produced by a model.
For example, if you wanted to see the predictions generated for the TwoGram metric, you would specify 'PassThruPredictions' as the 'errorMetric' parameter.
This metric class simply takes the prediction and outputs that as the aggregateMetric value.
-
class nupic.frameworks.opf.metrics.MetricMovingMean(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes error metric based on moving mean prediction.
-
class nupic.frameworks.opf.metrics.MetricMovingMode(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes error metric based on moving mode prediction.
-
class nupic.frameworks.opf.metrics.MetricTrivial(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes a metric against the ground truth N steps ago. The metric to compute is designated by the errorMetric entry in the metric params.
-
class nupic.frameworks.opf.metrics.MetricTwoGram(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes error metric based on one-grams. The groundTruth passed into this metric is the encoded output of the field (an array of 1's and 0's).
-
class nupic.frameworks.opf.metrics.MetricAccuracy(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes simple accuracy for an enumerated type. All inputs are treated as discrete members of a set; for example, 0.5 is only a correct response if the ground truth is exactly 0.5. Inputs can be strings, integers, or reals.
-
class nupic.frameworks.opf.metrics.MetricAveError(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Simply the inverse of the Accuracy metric. More consistent with scalar metrics because they all report an error to be minimized.
-
class nupic.frameworks.opf.metrics.MetricNegAUC(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
Computes -1 * AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic) curve. We compute -1 * AUC because metrics are optimized to be LOWER when swarming.
For this, we assume that category 1 is the "positive" category and we are generating an ROC curve with the TPR (True Positive Rate) of category 1 on the y-axis and the FPR (False Positive Rate) on the x-axis.
-
accumulate(groundTruth, prediction, accumulatedError, historyBuffer, result=None)¶
Accumulate history of groundTruth and "prediction" values.
For this metric, groundTruth is the actual category and "prediction" is a dict containing one top-level item with a key of 0 (meaning this is the 0-step classification) and a value which is another dict, which contains the probability for each category as output from the classifier. For example, this is what the classifier output would look like if it said that category 0 had a 0.6 probability and category 1 had a 0.4 probability: {0: 0.6, 1: 0.4}
-
class nupic.frameworks.opf.metrics.MetricMultiStep(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
This is an "uber" metric that is used to apply one of the other basic metrics to a specific step in a multi-step prediction.
The specParams are expected to contain:
- errorMetric: name of the basic metric to apply
- steps: compare prediction['steps'] to the current ground truth.
Note that the metrics manager has already performed the time shifting for us: it passes us the prediction element from 'steps' steps ago and asks us to compare that to the current ground truth.
When multiple steps of prediction are requested, we average the results of the underlying metric for each step.
-
class nupic.frameworks.opf.metrics.MetricMultiStepProbability(metricSpec)¶
Bases: nupic.frameworks.opf.metrics.AggregateMetric
This is an "uber" metric that is used to apply one of the other basic metrics to a specific step in a multi-step prediction.
The specParams are expected to contain:
- errorMetric: name of the basic metric to apply
- steps: compare prediction['steps'] to the current ground truth.
Note that the metrics manager has already performed the time shifting for us: it passes us the prediction element from 'steps' steps ago and asks us to compare that to the current ground truth.
-
class nupic.frameworks.opf.metrics.MetricMulti(weights, metrics, window=None)¶
Bases: nupic.frameworks.opf.metrics.MetricsIface
The Multi metric combines multiple other (sub)metrics, weighting them to produce a combined score.
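A sketch of how the constructor might be fed (the metric choices and weights here are hypothetical), using getModule() to build the submetrics:

from nupic.frameworks.opf.metrics import MetricSpec, MetricMulti, getModule

rmseSpec = MetricSpec(field='kw_energy_consumption', metric='rmse',
                      inferenceElement='prediction', params={'window': 1000})
aaeSpec = MetricSpec(field='kw_energy_consumption', metric='aae',
                     inferenceElement='prediction', params={'window': 1000})

# Weight RMSE at 0.7 and AAE at 0.3 in the combined score
combined = MetricMulti(weights=[0.7, 0.3],
                       metrics=[getModule(rmseSpec), getModule(aaeSpec)])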