XAISuite
- class xaisuite.dataHandler.XAIData(X_train, X_test, y_train, y_test)
Bases: object
Class to represent data.
- Parameters:
X_train (numpy.ndarray) – The training features
X_test (numpy.ndarray) – The testing features
y_train (numpy.ndarray) – The training targets
y_test (numpy.ndarray) – The testing targets
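As an illustration of the four-way split this class holds, here is a minimal sketch of a comparable container. The name `SplitData` and the length check are assumptions made for this example and are not part of XAISuite. Plain lists stand in for numpy arrays so that the sketch has no dependencies.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SplitData:
    """Hypothetical stand-in mirroring XAIData's four constructor arguments."""
    X_train: Sequence
    X_test: Sequence
    y_train: Sequence
    y_test: Sequence

    def __post_init__(self):
        # Assumption for this sketch: features and targets must have
        # the same number of rows within each split.
        if len(self.X_train) != len(self.y_train):
            raise ValueError("X_train and y_train differ in length")
        if len(self.X_test) != len(self.y_test):
            raise ValueError("X_test and y_test differ in length")


split = SplitData(
    X_train=[[0.1, 1.0], [0.4, 2.0], [0.9, 3.0]],
    X_test=[[0.5, 2.5]],
    y_train=[1.0, 2.0, 3.0],
    y_test=[2.4],
)
```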
- class xaisuite.dataHandler.DataLoader(data, source='auto', type='Tabular', variable_names='auto', target_names='auto', cut=None, categorical=None, **dataGenerationArgs)
Bases: object
Class that loads data from a given source.
- Parameters:
data (Union[str, Callable, numpy.ndarray, pd.DataFrame, tuple]) – The data identifier, a function that returns the data, or the data itself in the form of a numpy array, pandas DataFrame, or tuple
source (str, optional) – The source of the data. Either “auto”, “system”, “preloaded”, “generated”, or “url”. If “auto”, the source will be inferred based on data. By default, “auto”.
type (str, optional) – The type of data. Either “Tabular”, “Image”, or “Text”. By default, “Tabular”. If “Text” or “Image”, only one feature is allowed.
variable_names (Union[str, list], optional) – The variables in the dataset excluding cut, in the order that they appear in the data. By default, set to “auto” and inferred.
target_names (Union[str, int], optional) – The target variable(s). By default, set to “auto” and inferred.
cut (Union[str, list], optional) – Variables to drop from the data. By default, None.
categorical (Union[str, list], optional) – If type == “Tabular”, variables that contain categorical data.
**dataGenerationArgs – Additional arguments to pass in if data is Callable.
- Raises:
ValueError – if data cannot be resolved or if invalid arguments are passed.
Class constructor
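The accepted values for source and type listed above can be checked before a loader is constructed. The following is a minimal sketch of such a check; `check_loader_args` is a hypothetical helper written for this example, and the library's own validation may differ.

```python
ALLOWED_SOURCES = {"auto", "system", "preloaded", "generated", "url"}
ALLOWED_TYPES = {"Tabular", "Image", "Text"}


def check_loader_args(source="auto", type="Tabular"):
    """Hypothetical pre-check using the values documented for DataLoader.

    The parameter name `type` mirrors the documented signature, which is
    why it shadows the Python builtin inside this function.
    """
    if source not in ALLOWED_SOURCES:
        raise ValueError(
            f"source must be one of {sorted(ALLOWED_SOURCES)}, got {source!r}"
        )
    if type not in ALLOWED_TYPES:
        raise ValueError(
            f"type must be one of {sorted(ALLOWED_TYPES)}, got {type!r}"
        )
    return source, type


checked = check_loader_args(source="url", type="Tabular")
```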
- initializeDataFromSystem(id)
Initializes data from user’s system
- Parameters:
id (str) – The data file path
- Raises:
ValueError – if provided file path is not found or is not a file
- initializeDataFromPreloaded(id)
Initializes data from preloaded sklearn datasets
- Parameters:
id (str) – The name of the preloaded dataset
- Raises:
ValueError – if provided preloaded data name is not found
- initializeDataFromGenerated(id, **generateArgs)
Initializes data from string commands
- Parameters:
id (str) – The string generation command
**generateArgs – Arguments to pass to the function represented by id
- initializeDataFromUrl(id)
Initializes data from URL
- Parameters:
id (str) – The URL of the data
- plot()
Plots loaded data.
- class xaisuite.dataHandler.DataProcessor(forDataLoader, test_size=0.2, processor=None, **processorArgs)
Bases: object
Class that processes data.
- Parameters:
forDataLoader (DataLoader) – The dataloader that will be associated with this processor.
test_size (float, optional) – The proportion of data that will be used to test and score the machine learning model. By default, 0.2
processor (Any, optional) – The data processor, either a string naming the processor function or an object with fit() and transform() methods.
**processorArgs – Arguments to be passed into the processor. If the argument is a function, like a component of a composite processor, pass it in as shown in this example: DataProcessor(…, target_transform = “component: KBins(n_bins = 5)”, ratio = 0.1)
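To make the fit()/transform() requirement and the meaning of test_size concrete, here is a dependency-free sketch. `MinMaxScalerSketch` and `split_rows` are hypothetical names invented for this example. The split shown here takes the last rows without shuffling, which is an assumption and not necessarily how DataProcessor splits data.

```python
class MinMaxScalerSketch:
    """Hypothetical processor exposing the fit()/transform() pair."""

    def fit(self, rows):
        # Learn per-column minimum and maximum from the training rows.
        columns = list(zip(*rows))
        self.lo_ = [min(c) for c in columns]
        self.hi_ = [max(c) for c in columns]
        return self

    def transform(self, rows):
        # Rescale each value using the statistics learned in fit().
        return [
            [
                (v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, self.lo_, self.hi_)
            ]
            for row in rows
        ]


def split_rows(rows, test_size=0.2):
    """Hold out the last test_size fraction of rows (no shuffling here)."""
    n_test = max(1, round(len(rows) * test_size))
    return rows[:-n_test], rows[-n_test:]


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
train, test = split_rows(rows, test_size=0.2)

# Fit on the training rows only, then apply the same scaling to both splits.
scaler = MinMaxScalerSketch().fit(train)
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)
```

Fitting on the training rows alone keeps information from the test rows out of the learned statistics, which is why test values can fall outside the range [0, 1] after scaling.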
- class xaisuite.explainableModel.ModelTrainer(model, withData, taskType='Tabular', task='regression', explainers=None, **modelArgs)
Bases: object
Class to train an explainable machine learning model.
- Parameters:
model (Any) – The string name of the model, a function returning the model, or the model object itself. The model must have fit() and predict() methods. A score() method is optional.
withData (DataProcessor) – The data that will be used to train and test the model
taskType (str, optional) – The type of task that the model performs. By default, “Tabular”. Other options are “Vision” and “NLP”.
task (str, optional) – The task that the model performs. By default, “regression”. The other option is “classification”.
explainers (Union[list, dict], optional) – A list of explainer names or, if specific parameters need to be passed to the explainers, a dict that maps explainer names to explainer arguments. For example, explainers = [“lime”, “shap”, “mace”] or explainers = {“lime”: {“kernel_width”: 3}, “shap”: {“nsamples”: 100}, “mace”: None}
Class constructor
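The list and dict forms of explainers carry the same information, with the dict form adding per-explainer arguments. The sketch below shows one way the two forms could be reduced to a single mapping. `normalize_explainers` is a hypothetical helper, and treating None as "no arguments" is an assumption based on the “mace”: None example above.

```python
def normalize_explainers(explainers):
    """Return {name: kwargs} for either accepted form (hypothetical helper)."""
    if explainers is None:
        return {}
    if isinstance(explainers, dict):
        # Assumption: a None value means the explainer takes no extra arguments.
        return {name: (args or {}) for name, args in explainers.items()}
    # A plain list of names carries no arguments at all.
    return {name: {} for name in explainers}


from_list = normalize_explainers(["lime", "shap", "mace"])
from_dict = normalize_explainers(
    {"lime": {"kernel_width": 3}, "shap": {"nsamples": 100}, "mace": None}
)
```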
- getExplanationsFor(testIndex=None, feature_values=None)
Function to get the local explanations for a particular testing instance.
- Parameters:
testIndex (Union[int, list], optional) – The indices of the testing data for which to fetch local explanations. If empty, local explanations for all instances are returned. If None, feature_values is used.
feature_values (dict, optional) – The values of the features corresponding to a particular index. If None, testIndex is used.
- Returns:
explanations (dict) – The requested explanations
- Raises:
ValueError – If neither testIndex nor feature_values is passed
- Return type:
dict
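The interplay between testIndex and feature_values described above can be sketched as a small dispatch function. `resolve_request` and its `n_test` argument are hypothetical and exist only for this illustration. Preferring testIndex when both arguments are given is an assumption, since the behavior in that case is not documented here.

```python
def resolve_request(n_test, testIndex=None, feature_values=None):
    """Hypothetical helper: decide which instances a call refers to.

    n_test is the number of testing instances available.
    """
    if testIndex is None and feature_values is None:
        raise ValueError("Pass either testIndex or feature_values")
    if testIndex is None:
        # Only feature values were supplied.
        return ("features", feature_values)
    if isinstance(testIndex, int):
        return ("indices", [testIndex])
    if len(testIndex) == 0:
        # An empty list requests explanations for every testing instance.
        return ("indices", list(range(n_test)))
    return ("indices", list(testIndex))


single = resolve_request(5, testIndex=2)
everything = resolve_request(5, testIndex=[])
by_values = resolve_request(5, feature_values={"age": 40})
```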
- getSummaryExplanations()
Returns global explanations
- Returns:
explanations (dict) – The requested global explanations
- Return type:
dict
- getAllExplanations()
- Return type:
dict
- plotExplanations(explainer=None, index=0)
Plot explanations
- Parameters:
explainer (str, optional) – The explainer for which to plot explanations. By default, None.
index (int, optional) – The instance for which to plot explanations, in the numerical order returned by the explanation retrieval function. By default, 0.
- class xaisuite.insightGenerator.InsightGenerator(explanations)
Bases: object
Class to generate insights based on explanation results. This was first released in version 2.0.0 and is a work in progress.
- Parameters:
explanations (collections.OrderedDict) – Local explanation results, as generated by the ModelTrainer.getExplanationsFor() method
- calculateExplainerSimilarity(explainer1, explainer2)
Calculates explainer similarity based on the average Shreyan distance.
- Parameters:
explainer1 (str) – The name of the first explainer. Ex. “shap”, “lime”
explainer2 (str) – The name of the second explainer
- Return type:
float
- getShreyanDistance(vec1, vec2)
Calculates the distance between two ordered vectors. A value of 1 is the maximum distance and 0 means no distance.
- Parameters:
vec1 (list) – The pattern vector
vec2 (list) – The disorder vector
- Returns:
shreyan_distance (float) – The Shreyan distance between the two ordered vectors
- Return type:
float
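The formula for the Shreyan distance is not given in this reference, so the sketch below does not implement it. Instead, it uses the normalized Kendall tau distance, a different and well-known measure, purely to illustrate what a distance between two orderings on a 0 to 1 scale looks like. The function name and the requirement that both vectors hold the same unique items are assumptions of this example.

```python
from itertools import combinations


def normalized_kendall_tau_distance(pattern, disorder):
    """Fraction of item pairs that the two orderings rank in opposite order.

    This is NOT the Shreyan distance; it is a stand-in with the same range:
    0.0 for identical orderings and 1.0 for fully reversed ones.
    """
    same_items = (
        len(pattern) == len(disorder)
        and len(set(pattern)) == len(pattern)
        and set(pattern) == set(disorder)
    )
    if not same_items:
        raise ValueError("Both vectors must contain the same unique items")
    if len(pattern) < 2:
        return 0.0
    # Position of every item in the second ordering.
    position = {item: i for i, item in enumerate(disorder)}
    # combinations() yields (a, b) with a ranked before b in `pattern`.
    pairs = list(combinations(pattern, 2))
    discordant = sum(1 for a, b in pairs if position[a] > position[b])
    return discordant / len(pairs)


identical = normalized_kendall_tau_distance(["a", "b", "c"], ["a", "b", "c"])
reversed_order = normalized_kendall_tau_distance(["a", "b", "c"], ["c", "b", "a"])
one_swap = normalized_kendall_tau_distance(["a", "b", "c"], ["a", "c", "b"])
```

Averaging such a per-instance distance over all explained instances gives a single similarity-style score for a pair of explainers, which is the general shape of what calculateExplainerSimilarity is described as doing with the Shreyan distance.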