XAISuite

dataHandler

explainableModel

insightGenerator

class xaisuite.dataHandler.XAIData(X_train, X_test, y_train, y_test)

Bases: object

Class to represent data.

Parameters:
  • X_train (numpy.ndarray) – The training features

  • X_test (numpy.ndarray) – The testing features

  • y_train (numpy.ndarray) – The training targets

  • y_test (numpy.ndarray) – The testing targets

class xaisuite.dataHandler.DataLoader(data, source='auto', type='Tabular', variable_names='auto', target_names='auto', cut=None, categorical=None, **dataGenerationArgs)

Bases: object

Class that loads data from a given source

Parameters:
  • data (Union[str, Callable, numpy.ndarray, pd.DataFrame, tuple]) – The data identifier, a function that returns the data, or the data itself in the form of a numpy array, pandas DataFrame, or tuple

  • source (str, optional) – The source of the data. Either “auto”, “system”, “preloaded”, “generated”, or “url”. If “auto”, the source will be inferred based on data. By default, “auto”

  • type (str, optional) – The type of data. Either “Tabular”, “Image”, or “Text”. By default, “Tabular”. If “Text” or “Image”, only one feature is allowed.

  • variable_names (Union[str, list], optional) – The variables in the dataset excluding cut, in the order that they appear in the data. By default, set to “auto” and inferred.

  • target_names (Union[str, int], optional) – The target variable(s). By default, set to “auto” and inferred

  • cut (Union[str, list], optional) – Variables to drop from the data. By default, None.

  • categorical (Union[str, list], optional) – If type == “Tabular”, variables that contain categorical data.

  • **dataGenerationArgs – Additional arguments to pass in if data is Callable.

Raises:

ValueError – if data cannot be resolved or if invalid arguments are passed.

Class constructor
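As a rough illustration of what inferring the source from data could look like, the sketch below maps an identifier to one of the documented source labels. The helper name `infer_source` and its rules (callable first, then URL prefix, then existing file, then preloaded name) are assumptions made for this example; they are not taken from the XAISuite implementation.

```python
import os
from typing import Callable, Union


def infer_source(data: Union[str, Callable]) -> str:
    """Guess a source label for `data`. Hypothetical rules, not XAISuite's."""
    if callable(data):
        # A function that returns the data is treated as a generator.
        return "generated"
    if not isinstance(data, str) or not data:
        # In-memory arrays, DataFrames and tuples are out of scope here.
        raise ValueError("expected a non-empty string or a callable")
    if data.startswith(("http://", "https://")):
        return "url"
    if os.path.isfile(data):
        return "system"
    # Assumption: any other string names a preloaded dataset.
    return "preloaded"
```

Under these assumed rules, `infer_source("https://example.com/data.csv")` returns `"url"`, and a string that is neither a URL nor an existing file falls through to `"preloaded"`.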

initializeDataFromSystem(id)

Initializes data from user’s system

Parameters:

id (str) – The data file path

Raises:

ValueError – if provided file path is not found or is not a file
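The two failure cases named above (path not found, path is not a file) can be told apart with the standard library. The following is a minimal sketch; `check_data_path` is a hypothetical helper written for this example, not a function exposed by XAISuite.

```python
import os


def check_data_path(path: str) -> str:
    """Return `path` if it points to an existing regular file."""
    if not os.path.exists(path):
        raise ValueError(f"data file not found: {path!r}")
    if not os.path.isfile(path):
        # For example, the path points to a directory.
        raise ValueError(f"not a file: {path!r}")
    return path
```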

initializeDataFromPreloaded(id)

Initializes data from preloaded sklearn datasets

Parameters:

id (str) – The name of the preloaded dataset

Raises:

ValueError – if provided preloaded data name is not found

initializeDataFromGenerated(id, **generateArgs)

Initializes data from string commands

Parameters:
  • id (str) – The string generation command

  • **generateArgs – Arguments to pass to the function represented by id

initializeDataFromUrl(id)

Initializes data from URL

Parameters:

id (str) – The url of the data

plot()

Plots loaded data.

class xaisuite.dataHandler.DataProcessor(forDataLoader, test_size=0.2, processor=None, **processorArgs)

Bases: object

Class that processes data

Parameters:
  • forDataLoader (DataLoader) – The DataLoader that will be associated with this processor.

  • test_size (float, optional) – The proportion of data that will be used to test and score the machine learning model. By default, 0.2

  • processor (Any, optional) – The data processor: either a function given as a string, or an object with fit() and transform() methods. By default, None.

  • **processorArgs – Arguments to be passed into the processor. If an argument is itself a function, such as a component of a composite processor, pass it in as shown in this example: DataProcessor(…, target_transform="component: KBins(n_bins = 5)", ratio=0.1)
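The processor parameter accepts any object with fit() and transform() methods. The sketch below shows what such an object might look like, using only the standard library. The class name `MeanCenterer` is hypothetical, and the assumption that the processor receives rows of numeric features is made for this example; how DataProcessor actually calls the two methods is not specified here.

```python
from typing import List, Sequence


class MeanCenterer:
    """Toy processor: subtracts each column's mean (learned in fit)."""

    def fit(self, rows: Sequence[Sequence[float]]) -> "MeanCenterer":
        n = len(rows)
        # zip(*rows) iterates over columns instead of rows.
        self.means_ = [sum(col) / n for col in zip(*rows)]
        return self

    def transform(self, rows: Sequence[Sequence[float]]) -> List[List[float]]:
        return [[v - m for v, m in zip(row, self.means_)] for row in rows]


centerer = MeanCenterer().fit([[1.0, 10.0], [3.0, 30.0]])
centered = centerer.transform([[1.0, 10.0]])  # [[-1.0, -10.0]]
```

Returning `self` from fit() allows the chained `fit(...).transform(...)` style that many fit/transform objects support.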

class xaisuite.explainableModel.ModelTrainer(model, withData, taskType='Tabular', task='regression', explainers=None, **modelArgs)

Bases: object

Class to train an explainable machine learning model.

Parameters:
  • model (Any) – The string name of the model, a function returning the model, or the model object itself. The model must have fit() and predict() methods. A score() method is optional.

  • withData (DataProcessor) – The data that will be used to train and test the model

  • taskType (str, optional) – The type of task that the model performs. By default, “Tabular”. Other options are “Vision” and “NLP”

  • task (str, optional) – The task that the model performs. By default, “regression”. The other option is “classification”.

  • explainers (Union[list, dict], optional) – A list of explainer names or, if specific parameters need to be passed to the explainers, a dict that maps explainer names to explainer arguments. For example, explainers = ["lime", "shap", "mace"] or explainers = {"lime": {"kernel_width": 3}, "shap": {"nsamples": 100}, "mace": None}

Class constructor
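The two accepted shapes of explainers (a list of names, or a dict of names to arguments) carry the same information. The sketch below converts both into one dict form and checks the fit()/predict() requirement on a model object. Both helper names are hypothetical and exist only for this example; ModelTrainer may handle these inputs differently internally.

```python
from typing import Any, Dict, List, Optional, Union

ExplainerSpec = Union[List[str], Dict[str, Optional[dict]], None]


def normalize_explainers(explainers: ExplainerSpec) -> Dict[str, dict]:
    """Map every explainer name to a (possibly empty) argument dict."""
    if explainers is None:
        return {}
    if isinstance(explainers, dict):
        # A value of None means "no extra arguments" for that explainer.
        return {name: dict(args) if args else {} for name, args in explainers.items()}
    return {name: {} for name in explainers}


def looks_like_model(model: Any) -> bool:
    """True if `model` exposes callable fit() and predict() attributes."""
    return all(callable(getattr(model, name, None)) for name in ("fit", "predict"))
```

With this sketch, `normalize_explainers(["lime", "shap"])` gives `{"lime": {}, "shap": {}}`, the same shape as the dict form with empty arguments.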

getExplanationsFor(testIndex=None, feature_values=None)

Function to get the local explanations for a particular testing instance.

Parameters:
  • testIndex (Union[int, list], optional) – The indices of the testing data for which to fetch local explanations. If empty, local explanations for all instances are returned. If None, feature_values is used.

  • feature_values (dict, optional) – The values of the features corresponding to a particular index. If None, testIndex is used.

Returns:

The requested explanations

Raises:

ValueError – If neither testIndex nor feature_values is passed

Return type:

dict
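The interaction between testIndex and feature_values described above can be summarized as a small decision function. This is a sketch of the documented rules only: `resolve_request` and its return format are hypothetical, and the choice to let testIndex take precedence when both arguments are given is an assumption, since that case is not documented.

```python
from typing import List, Optional, Tuple, Union


def resolve_request(
    n_test: int,
    testIndex: Optional[Union[int, List[int]]] = None,
    feature_values: Optional[dict] = None,
) -> Tuple[str, Union[List[int], dict]]:
    """Decide which instances a getExplanationsFor-style call refers to."""
    if testIndex is None and feature_values is None:
        raise ValueError("pass testIndex or feature_values")
    if testIndex is None:
        # Only feature values were given.
        return ("features", feature_values)
    if isinstance(testIndex, int):
        return ("indices", [testIndex])
    if len(testIndex) == 0:
        # An empty list selects every test instance.
        return ("indices", list(range(n_test)))
    return ("indices", list(testIndex))
```

For a test set of three instances, `resolve_request(3, testIndex=[])` yields `("indices", [0, 1, 2])`, while calling it with neither argument raises ValueError.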

getSummaryExplanations()

Returns global explanations

Returns:

The requested global explanations

Return type:

dict

getAllExplanations()

Return type:

dict

plotExplanations(explainer=None, index=0)

Plot explanations

Parameters:
  • explainer (str) – The explainer for which to plot explanations

  • index (int) – The instance for which to plot explanations, in the numerical order returned by the explanation retrieval function. By default, 0.

class xaisuite.insightGenerator.InsightGenerator(explanations)

Bases: object

Class to generate insights based on explanation results. This was first released in version 2.0.0 and is a work in progress.

Parameters:

explanations (collections.OrderedDict) – Local explanation results, as generated by the ModelTrainer.getExplanationsFor() function

calculateExplainerSimilarity(explainer1, explainer2)

Calculates explainer similarity based on the average Shreyan distance.

Parameters:
  • explainer1 (str) – The name of the first explainer. Ex. “shap”, “lime”

  • explainer2 (str) – The name of the second explainer

Return type:

float
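The averaging step can be sketched independently of the distance itself. In the example below, each explainer contributes one ordered list of feature names per test instance, and a per-instance distance is averaged over all instances. The Shreyan distance is not defined in this reference, so `mismatch_fraction` is a placeholder metric chosen only because it also ranges from 0 to 1; it is not the Shreyan distance, and both function names are hypothetical.

```python
from typing import Callable, Sequence

Ranking = Sequence[str]


def mismatch_fraction(vec1: Ranking, vec2: Ranking) -> float:
    """Placeholder distance in [0, 1]: share of positions that differ."""
    if len(vec1) != len(vec2) or not vec1:
        raise ValueError("rankings must be non-empty and of equal length")
    return sum(a != b for a, b in zip(vec1, vec2)) / len(vec1)


def average_distance(
    rankings1: Sequence[Ranking],
    rankings2: Sequence[Ranking],
    distance: Callable[[Ranking, Ranking], float] = mismatch_fraction,
) -> float:
    """Average a per-instance distance over paired rankings."""
    if len(rankings1) != len(rankings2) or not rankings1:
        raise ValueError("need one ranking per instance from each explainer")
    pairs = zip(rankings1, rankings2)
    return sum(distance(a, b) for a, b in pairs) / len(rankings1)


shap_ranks = [["a", "b", "c"], ["a", "b", "c"]]
lime_ranks = [["a", "b", "c"], ["a", "c", "b"]]
avg = average_distance(shap_ranks, lime_ranks)  # (0 + 2/3) / 2
```

Passing the distance as a parameter keeps the averaging logic separate, so a different metric can be substituted without changing `average_distance`.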

getShreyanDistance(vec1, vec2)

Calculates the distance between two ordered vectors. The result ranges from 0 (no distance) to 1 (maximum distance).

Parameters:
  • vec1 (list) – The pattern vector

  • vec2 (list) – The disorder vector

Returns:

The Shreyan distance between the two ordered vectors

Return type:

float