Analytics plugin API
The analytics plugin API is defined in the orcm/mca/analytics/analytics\_interface.h
header file. The Analytics
class is the abstract base class that should be extended and implemented by all the newly-developed plugins. The only "pure virtual" API function is the analyze
function that is defined as follows:
int analyze(DataSet& data_set);
Synopsis:
Conduct data analytics for the input data: data_set. Typical analytics are data aggregation computations, such as average, max, min, and standard deviation. These computations may be done for all the data samples from the beginning of sampling (i.e. running), or in a sliding window period of sampling, or in a spatial rack of all the compute nodes. Another data analytics would be low/up bound thresholds checking. The DataSet
is a data struct that is defined as follows:
struct DataSet {
DataSet(dataContainer& _results, std::list<Event>& _events) :
results(_results), events(_events) {
}
dataContainer attributes;
dataContainer key;
dataContainer non_compute;
dataContainer compute;
dataContainer &results;
std::list<Event> &events;
};
The dataContainer
is a hash map of <key, value> pairs. The key
of an item in the hash map uniquely identifies the data item, and the value
is the actual data with an unit associated with the key
. For instance, one data item could be the coretemp of core 1 of node 1 in celsius (C).
The specification of the DataSet
structure is as follows:
-
attributes: an input parameter that specifies the plugin's measurement criteria, such as "window size" and "high" threshold value.
-
key, noncompute, compute: input parameters that correspond to the three lists of analytics data format.
-
results, events: output parameters that represent the computational results (e.g. average value) and the generated events (e.g. core temperature is higher than the threshold), respectively. The
results
andevents
might be empty parameters. For instance, in a computational plugin, there maybe no events.
The Event
data structure is defined as follows:
struct Event {
const std::string severity;
const std::string action;
const std::string msg;
const double value;
const std::string unit;
};
-
severity: indicates how severe the event is, such as "crit" and "warn".
-
action: specifies the action to be taken for the event. Possible actions may be launching a script, sending an email, or writing to syslog.
-
msg: is the descriptive information of the event. An example would be "the temperature of core 1 of node 1 is higher than 100C degree".
-
value: the numeric value that triggers the event.
-
unit: the unit associated with the value.