How to Leverage Xplainable and ChatGPT to Automate Business Intelligence & Optimisation

Explainable ML is more than just a window into a model's decision-making process — it's a blueprint for optimisation and business insights

Tim Huntley, 18 July 2023

This article is for anyone from novice data scientists to advanced machine learning engineers who want to leverage explainable machine learning and large language models to generate automated business intelligence.

When discussing explainability in machine learning, we typically understand it to be about providing visibility over black-box models. To deliver this visibility, Data Scientists worldwide often leverage surrogate models to snapshot ML algorithm decision processes. The past six years have seen SHAP dominate this field. Following its release in 2017, SHAP rose to prominence, using game theory to produce explainability estimates that can provide insight into how any machine learning model makes predictions.

In this article, I will demonstrate an alternative to SHAP: an approach that opens the doors to real-time model explainability, business optimisation, and automated insights using a single package, xplainable. I'll also showcase how xplainable integrates seamlessly with ChatGPT to provide comprehensive explainability and BI reports that connect data science workflows with business decision-makers.

Insights First: Changing How We Approach Tabular Machine Learning Problems

Despite the rise of deep learning (DL) and large language models (LLMs) in recent years, most business problems requiring machine learning are tabular, and in many cases, these models must be able to be explained.

Explainability is crucial in most data science projects to ensure models are robust, fair, and free of bias.

The (simplified) workflow usually goes something like this:

A person within an organisation identifies the need for a predictive model to automate a complex decision.
A data scientist develops a machine learning model to predict the decision outcome.
A good data scientist will use explainability techniques to explain the model and share the results with business leaders.
The model is deployed, and the decision is automated.
The model is monitored and retrained when required, and the project is put into maintenance mode.

Such workflows do a great job of decision automation, but they leave a lot of strategic and financial potential on the table.

These models are filled with untapped business intelligence that goes unnoticed and unexploited.

Meanwhile, we have data analysts scrambling to manually uncover what is contained right there inside these models: statistical patterns and insights.

With this in mind, I would argue the process should look more like this:

A person within an organisation identifies the need for a predictive model to automate, optimise, and understand a complex decision.
A data scientist develops an explainable machine learning model to predict the decision outcome and recommend how the outcome can be optimised.
The data scientist, analysts, and business stakeholders combine their skills and knowledge to continuously collaborate, utilising the business intelligence generated by the model.
Business leaders use the model explanations to drive more informed strategies.
The model is deployed, and individual decisions are optimised.
The model is actively monitored and retrained when required, and decision-making is continuously optimised by tracking changes to model explanations.

Explainable machine learning is more than just a window into a model's decision-making process — it's a way to automate problem-specific business intelligence and optimise complex decision-making.

The process I outlined is based on the following philosophy:

If you know how data informs a machine learning model, you can understand what drives your target variable and what levers can be pulled to improve a predicted outcome. If you can enhance a predicted result, you may increase the probability of improving the actual outcome.

We can categorise our features into contextual features and levers to take advantage of this idea.

Contextual Features are features that cannot be controlled but play a role in predictive decisions. Examples include State/Province, Age, Tenure, etc.
Levers are features that can be controlled and therefore optimised. Examples include Contact Time, Contact Method, Support Type, etc.

When you structure a training dataset with optimisation as the objective, it is critical to understand and apply this difference. The idea is to make a prediction and then identify opportunities to set the levers to make the predicted outcome more favourable. Our purpose is not just to predict our target variable as accurately as possible but to understand what levers can be pulled on individual observations to maximise or minimise the likelihood of an event occurring.
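
The idea of holding contextual features fixed while searching over lever settings can be sketched in a few lines. This is a minimal illustration, not xplainable's optimiser: `predict_churn` is a hypothetical hand-written scoring function standing in for a trained model, and the feature names, lever values, and weights are all made up.

```python
from itertools import product

# Hypothetical stand-in for a trained model: returns a churn probability.
def predict_churn(row):
    score = 0.30                                             # illustrative base rate
    score += 0.15 if row["tenure_months"] < 12 else -0.05    # contextual feature
    score += {"phone": 0.05, "email": 0.00, "sms": -0.03}[row["contact_method"]]
    score += {"morning": -0.02, "evening": 0.04}[row["contact_time"]]
    return min(max(score, 0.0), 1.0)

# Levers we control, with the values each may take (illustrative only).
LEVERS = {
    "contact_method": ["phone", "email", "sms"],
    "contact_time": ["morning", "evening"],
}

def best_lever_settings(customer):
    """Try every lever combination while keeping contextual features fixed."""
    best_row, best_p = None, float("inf")
    for values in product(*LEVERS.values()):
        candidate = {**customer, **dict(zip(LEVERS.keys(), values))}
        p = predict_churn(candidate)
        if p < best_p:
            best_row, best_p = candidate, p
    return best_row, best_p

customer = {"tenure_months": 6, "contact_method": "phone", "contact_time": "evening"}
current_p = predict_churn(customer)
optimised, optimised_p = best_lever_settings(customer)
print(f"current={current_p:.2f}, optimised={optimised_p:.2f}, settings={optimised}")
```

An exhaustive search like this is only practical for a handful of levers with a few values each; it is here to show the shape of the problem, not a recommended search strategy.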

Once we categorise our features, we can train a model to understand the relationship between our features and our target variable. We can then use global explainers to learn how our model works at a macro level and a local explainer to understand how single predictions were made for individual observations.

This is the crux of automated insights, and Xplainable excels at it. Xplainable is not a surrogate model like SHAP or LIME — it is a stand-alone machine-learning model that was purpose-built to be fully transparent and highly portable. A fitted xplainable model has global explanations built into its architecture and can offer real-time local explanations alongside predictions with predictive accuracy that matches the likes of XGBoost and LightGBM. The concept of surrogate models, in this instance, is redundant.

The Limitations of SHAP For Insights & Optimisation Tasks

SHAP is an excellent tool that explains complex models well, but it was not designed to handle real-time explanations, making it unsuitable for production-grade optimisation tasks. In fact, SHAP has several drawbacks that make extracting tangible value in real-world environments challenging:

Time Complexity

Calculating Shapley values is highly compute-intensive. When working with industry-sized datasets, the computation time can blow out to unreasonable lengths, making it impossible to adapt to changing conditions on the fly. This leaves only two options: wait longer, or subsample your data, which makes explanations less accurate.
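
To see where the cost comes from, here is a minimal sketch of exact Shapley values computed straight from their definition. It does not use the SHAP library (which relies on approximations and model-specific shortcuts), and the three-feature `value` function is a hypothetical payoff I made up for illustration. Each feature's value averages its marginal contribution over every coalition of the other features, so the number of coalitions grows as 2^n.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values by enumerating every coalition of other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Hypothetical payoff: additive effects plus one interaction term.
def value(coalition):
    v = 0.0
    if "tenure" in coalition:
        v += 0.10
    if "charges" in coalition:
        v += 0.20
    if "contract" in coalition:
        v += 0.05
    if {"tenure", "charges"} <= coalition:
        v += 0.06  # interaction, split evenly between the two features
    return v

features = ["tenure", "charges", "contract"]
phi = shapley_values(features, value)
print(phi)

# The number of coalitions doubles with every extra feature.
for n in (10, 20, 30):
    print(f"{n} features -> {2 ** n:,} coalitions")
```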

Handling of New Data

SHAP was not designed to handle new data on the fly dynamically. Traditional methods of calculating SHAP values involve re-training the model each time new data comes in, which is computationally costly and time-consuming. Additionally, there is the issue of “cold start,” where the model struggles to provide meaningful explanations for predictions on data points that are significantly different from the training set. This restricts the utility of SHAP in real-world scenarios, where conditions may change rapidly, and the model needs to adapt and explain its predictions in near real time.

Misleading Explanations Through Feature Obscurity

Data preprocessing in machine learning pipelines often involves techniques like log transformations and one-hot encoding to improve model performance. However, these transformations can significantly obscure original features, making the explanations generated by SHAP challenging for people to comprehend. For instance, interpreting the contribution of a log-transformed feature might not provide straightforward business insights and may even be misleading. The synthetic attributes created during class balancing and one-hot encoding can also influence SHAP's explanations, leading to misinterpretations.
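
A small worked example shows why a log-transformed feature is awkward to reason about in business terms. The dollar amounts and the step size below are made up; the point is only that the same move along the transformed feature corresponds to very different amounts in the original units.

```python
import math

def dollars_for_log_step(monthly_charges, log_step):
    """Dollar change that corresponds to moving `log_step` in log space."""
    return math.exp(math.log(monthly_charges) + log_step) - monthly_charges

step = 0.5  # the same move along the log-transformed feature
low = dollars_for_log_step(20.0, step)
high = dollars_for_log_step(100.0, step)
print(f"+{step} in log space: ${low:.2f} at $20, ${high:.2f} at $100")
```

A decision-maker asking "what happens if we raise the price by $10?" cannot answer that from an explanation expressed in log units without first undoing the transformation.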

Ease of Deployment

Deploying SHAP in a production system can be challenging. First is the computational cost: calculating SHAP values for each prediction in real-time is highly resource-intensive. This could become a bottleneck, particularly in large-scale applications where models need to make a high volume of predictions quickly. Second, the use of surrogate models can add an additional layer of complexity to the deployment process, leading to potential issues with model management, version control, and system compatibility. This is further complicated when SHAP is used with complex models. These complications can result in a considerable gap between developing and deploying models into production.

Unlocking Xplainable — A Complete Walkthrough

A Dive Into Customer Churn

To showcase xplainable, we'll walk through an example using an open-source churn dataset of a fictional Telco released by IBM. The dataset is available on Kaggle.

You can read more about the dataset via the link, but at a high level, it is made up of demographic, service, and account information of the Telco's customers with a binary flag indicating if the customer left (churned) within the last month.

Generally, the intention is to develop a predictive model to predict customers who will churn. But, by taking the approach I outlined earlier, we can frame the objective as follows:

Identify customers who are at high risk of churning and learn the underlying factors that drive customers to leave the business for a better retention strategy.

The topic of this post is model explainability and optimisation, so I will skim past some of the important steps taken prior to modelling. Feel free to email any questions about these steps to [email protected].

Getting Started

For those of you comfortable with machine learning in Python, you can leverage Xplainable with a typical Pythonic API, and you can install it from PyPI with pip install xplainable, or pip install xplainable[gui] if you prefer the low-code option. This post won't cover the low-code process, but you can learn more in the xplainable docs.

The Data

Let's load the data, hold aside a test set, and take a look at the first few rows.

python

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('telco_customer_churn.csv')

# Take a look at the first few rows
print(df.head())

# Create a training and test set
X, y = df.drop('target', axis=1), df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

Preprocessing

Before modelling, we must preprocess our data so it is in a suitable condition. When modelling for explainability and optimisation with xplainable, there are a few key considerations:

1. Keep the Features Explainable

Avoid applying log transformations and other preprocessing techniques that obscure data interpretability. The intention is to develop highly explainable models for improved decision-making. How can you make a decision about, say, Monthly Charges when the model's decision process is expressed in log-transformed units?

2. Don't Balance the Classes (No SMOTE!)

This may go against what you've learnt about imbalanced datasets, but there is a good reason for it. For starters, xplainable is robust to class imbalance, at least with the right parameter set. More importantly, balancing classes can muddy your explanations. When we interpret results, we do so relative to the average churn rate, or base value. By balancing the classes, you hide the true base value from the model, so it cannot distinguish in any helpful sense which features matter more than others, which makes interpretation difficult and misleading.
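
Here is a quick sketch of how balancing hides the base value. It uses naive random oversampling rather than SMOTE to keep the example dependency-free, and the labels are toy data, but the effect on the base rate is the same: the churn rate the model sees no longer matches the churn rate in the business.

```python
import pandas as pd

# Toy labels: 1 = churned. One in four customers churns.
y = pd.Series([1] * 25 + [0] * 75, name="churn")
true_base_rate = y.mean()

# Naive random oversampling of the minority class up to a 50/50 split.
minority = y[y == 1]
n_extra = int((y == 0).sum() - len(minority))
extra = minority.sample(n=n_extra, replace=True, random_state=0)
y_balanced = pd.concat([y, extra], ignore_index=True)
balanced_base_rate = y_balanced.mean()

print(f"true base rate: {true_base_rate:.2f}")
print(f"base rate after balancing: {balanced_base_rate:.2f}")
```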

3. Avoid Mean Imputation

Similar to class balancing, mean imputation muddies the waters of explainability, particularly as the missing percentage increases. For categorical features, impute with the string 'missing'. For numeric features, consider dropping missing rows if there are only a few, or apply more advanced techniques to ensure bias is not introduced into the feature.
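
In pandas, the two simple cases above might look like the following. The column names and values are a toy example, and dropping rows is only sensible when very few are missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Contract": ["Month-to-month", None, "Two year", "One year"],
    "Monthly Charges": [70.5, 20.0, np.nan, 55.2],
})

# Categorical: make missingness an explicit category.
df["Contract"] = df["Contract"].fillna("missing")

# Numeric: only one row is missing here, so drop it rather than impute.
df = df.dropna(subset=["Monthly Charges"]).reset_index(drop=True)
print(df)
```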

4. Condense Categorical Features

By condense, I mean reduce the number of unique categories. High-cardinality categorical features can be challenging to interpret and often slow down training times. One approach is to choose the categories that make up x% of observations and group the rest as “other”. Xplainable offers a preprocessing stage that does this for you:

python

from xplainable.preprocessing.pipeline import XPipeline
import xplainable.preprocessing.transformers as xtf

X_train['City'].nunique()
# Result: 1127

pipeline = XPipeline()

stages = [
    # <-- your other stages here, 
    {"feature": "City", "transformer": xtf.Condense(pct=0.5)}
]

pipeline.add_stages(stages)

X_train_transformed = pipeline.fit_transform(X_train)

X_train_transformed['City'].nunique()
# Result: 256
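
If you are curious what condensing does to a column, here is a plain pandas sketch of the "keep the categories that cover x% of observations" idea. This is my own illustration and not the implementation behind `xtf.Condense`; the `condense` helper and its exact cut-off rule are assumptions.

```python
import pandas as pd

def condense(series, pct=0.5, other="other"):
    """Keep the most frequent categories that together cover `pct` of rows."""
    freq = series.value_counts(normalize=True)   # sorted, most frequent first
    covered_before = freq.cumsum() - freq        # coverage before each category
    keep = freq.index[covered_before < pct]
    return series.where(series.isin(keep), other)

cities = pd.Series(["LA"] * 5 + ["SF"] * 3 + ["SD"] * 1 + ["Napa"] * 1)
condensed = condense(cities, pct=0.6)
print(condensed.value_counts())
```

Here LA and SF together cover 80% of rows, which passes the 60% target, so the two rare cities are grouped as "other".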

Feature Selection

When it comes to modelling for explainability and insights, feature selection is one of the most critical steps. We must aim to maximise our accuracy metrics while maintaining the integrity of our explanations.

The two sins of feature selection in explainable modelling are:

1. Multicollinearity

Features that are too closely related can hurt your model explanations. You MUST deal with multicollinearity if you want to get the most out of your features. Multicollinearity won't always affect the predictive performance per se, but it will hurt model interpretability, especially if you intend on using model explanations for decision-making.

2. Too Many Features

Xplainable can handle many features. The human brain, however, cannot. To get the most out of explainable modelling, you should select only the most important features. This will ensure you can maintain a high degree of accuracy while keeping things simple and understandable so decision-making becomes easier.

To deal with both cases, xplainable has built-in feature selection classes that shortlist your features and explain why certain features were dropped. Here's how it looks for our Telco example:

GraphSelector

This feature selector aims to deal with multicollinearity by iteratively identifying the most highly-correlated features and dropping the weakest one. GraphSelector creates a correlation matrix and cross-maps all the features together using NetworkX to identify which features are problematic. All of this can be achieved with just a few lines of code:

python

from xplainable.feature_selection.graph import GraphSelector

# Fit the network
graph_selector = GraphSelector(min_feature_corr=0.5, min_target_corr=0.01)
graph_selector.fit(X_train, y_train, start_threshold=0.75)

# Access the remaining features
graph_selector.selected

To visualise how GraphSelector chose our features, we can run the following line:

python
graph_selector.plot_graph()

You can see in the initial state a group of features that are highly correlated with one another. As we hit 'play', we can see each iteration at work dropping out the most troublesome feature each time.

To explain the decision process, we can run the following:

python

for feature in graph_selector.dropped:
    print(f"Dropped {feature['feature']} because of {feature['reason']}")

You can visually see the decision process of the feature selector and, using our intuition, confirm that it is removing features that are highly likely to be causing multicollinearity. This helps us understand our data better and allows us to set the right parameters for the feature selector to ensure no key features are dropped.
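
To build some intuition for the "find the most correlated pair, drop the weakest" loop, here is a stripped-down sketch in pandas. It is not GraphSelector's implementation: it skips the graph step and NetworkX entirely, the `drop_correlated` helper is hypothetical, and the data is a tiny toy set where one column is nearly a copy of another.

```python
import numpy as np
import pandas as pd

def drop_correlated(X, y, threshold=0.75):
    """Greedy loop: drop the weaker feature of the most correlated pair."""
    X = X.copy()
    dropped = []
    while X.shape[1] > 1:
        corr = X.corr().abs().to_numpy(copy=True)
        np.fill_diagonal(corr, 0.0)               # ignore self-correlation
        i, j = np.unravel_index(corr.argmax(), corr.shape)
        if corr[i, j] < threshold:
            break                                  # nothing problematic left
        pair = [X.columns[i], X.columns[j]]
        # Keep whichever of the pair is more correlated with the target.
        weakest = X[pair].corrwith(y).abs().idxmin()
        partner = pair[0] if weakest == pair[1] else pair[1]
        dropped.append({
            "feature": weakest,
            "reason": f"correlation of {corr[i, j]:.2f} with {partner}",
        })
        X = X.drop(columns=weakest)
    return list(X.columns), dropped

# Toy data: tenure_copy is almost a duplicate of tenure; noise is unrelated.
tenure = np.arange(1.0, 9.0)
noise = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
X = pd.DataFrame({"tenure": tenure, "noise": noise, "tenure_copy": tenure + 0.1 * noise})
y = pd.Series([0, 0, 0, 0, 0, 1, 1, 1])

selected, dropped = drop_correlated(X, y)
for d in dropped:
    print(f"Dropped {d['feature']} because of {d['reason']}")
```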

XClfFeatureSelector

This feature selector finds the most predictive features for xplainable classification models (XClassifier). It does this by iteratively selecting feature samples and training a model for each feature set. Following training, the feature importance scores are multiplied by the specified accuracy metric and stored. The process repeats until all iterations have finished, and at the end, the features are ranked from most important to least important based on their total cumulative scores. This makes it simpler to identify which features to drop.

The following is the output of running XClfFeatureSelector on the Telco dataset (before running GraphSelector):

python

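# XClfFeatureSelector must be imported from xplainable first
# (the import path is not shown in this post; see the xplainable docs)
fs = XClfFeatureSelector(n_samples=100)
feature_info = fs.fit(X_train, y_train)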

The resulting chart shows the summation of each feature's importance multiplied by the AUC (default metric) across all iterations. You can see that Gender did not come into play in any of the iterations, so it should be dropped, as should Phone Service and Multiple Lines. On the other hand, Tenure Months, Monthly Charges, and Total Charges each played a highly influential role each time they were selected. The other features also weigh into model decision-making to varying degrees.

Even without delving into detailed explainers, we know that Monthly Charges, Total Charges, and Tenure Months are likely to be heavily related. This may introduce multicollinearity into the model and should be handled appropriately using GraphSelector or other methods.
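
The scoring loop described for XClfFeatureSelector (sample a feature set, train, multiply importances by the metric, accumulate) can be sketched with scikit-learn. This is an illustration of the idea only: a `DecisionTreeClassifier` stands in for XClassifier, the data is synthetic, and the subset size and iteration count are arbitrary choices of mine.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: with shuffle=False the first 3 columns are the informative ones.
X, y = make_classification(n_samples=600, n_features=8, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
totals = np.zeros(X.shape[1])   # cumulative score per feature

for _ in range(50):
    # Sample a feature set and train a model on it.
    subset = rng.choice(X.shape[1], size=4, replace=False)
    clf = DecisionTreeClassifier(max_depth=3, random_state=0)
    clf.fit(X_tr[:, subset], y_tr)
    # Weight this model's importances by how well it performed.
    auc = roc_auc_score(y_va, clf.predict_proba(X_va[:, subset])[:, 1])
    totals[subset] += clf.feature_importances_ * auc

ranking = np.argsort(totals)[::-1]   # feature indices, highest score first
print("ranking:", ranking)
print("scores:", np.round(totals[ranking], 2))
```

Features that rarely earn any importance across iterations, like Gender in the Telco chart, end up at the bottom of the ranking.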

Feature selection is most powerful when these approaches are combined. Each problem is different, so you can experiment with the order in which you run them and the parameters you set for each to see what yields the best results. I'll be publishing another post down the track that covers this topic in more detail.

Modelling

Now, for the really interesting stuff. Modelling with xplainable is no different to most machine learning libraries in Python: a simple fit/predict combo is all you need to apply. Parameter optimisation in xplainable, however, is a little different. It's incredibly easy and lightning-fast (see rapid refitting).

python


from xplainable.core.models import XClassifier
from xplainable.core.optimisation.bayesian import XParamOptimiser

# Optimise hyperparameters
opt = XParamOptimiser(metric='roc-auc')
params = opt.optimise(X_train, y_train)

# Train model
model = XClassifier(**params)
model.fit(X_train, y_train)

That's it. Your model is trained, and your insights are available for analysis and optimisation.

Fine Tuning

If you want to extract more predictive power out of xplainable models, they can be fine-tuned by setting parameters at the feature level rather than the dataset level. This can be done by specifying a new set of parameters for one or more features and applying the following:

python

params = {
  "max_depth": 7,
  "min_info_gain": 0.03,
  # ...
}

model.update_feature_params(features=['Tenure Months'], **params)

This method of parameter refitting can be thousands of times faster than retraining the model from scratch — this is what enables rapid parameter optimisation and feature selection.

Evaluation

To evaluate a model, you can run this single line that outputs most of the key metrics relevant to the model type. In this case, classification:

python

model.evaluate(X_test, y_test)
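
The post doesn't list exactly what `evaluate` reports, so as a point of comparison, here is how you might compute the usual classification metrics yourself with scikit-learn. The labels and probabilities are toy values, and the 0.5 decision threshold is an assumption.

```python
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_prob = [0.1, 0.4, 0.8, 0.35, 0.2, 0.9, 0.6, 0.05]
y_pred = [int(p >= 0.5) for p in y_prob]   # assumed decision threshold

auc = roc_auc_score(y_true, y_prob)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"ROC AUC: {auc:.3f}")
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(classification_report(y_true, y_pred))
```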

Explaining the Model

To explain the Telco Churn model and extract its global-level insights, you can simply run:

python

model.explain()

Note that if you want to render the interactive chart above, you'll need to install the correct dependencies with pip install xplainable[plotting].

As you can see, global explanations are instantly available because the metadata is computed and stored during training. This allows us to understand the model decision-making process without using a surrogate model and makes the portability of models with explanations infinitely easier.

In fact, the metadata is so light that xplainable models can be deployed to predict in-browser, without a backend, and even on microprocessors (think TensorFlow Lite with real-time explainability). This will be covered in another blog post.

Local explanations can be run on one or more observations and can be calculated immediately after model training. This can be done either on observations from a dataframe or in the scenario analysis tool.

Explain Observations as a DataFrame

python

local_explanations = model.predict_explain(X_test)
local_explanations.sample(5)

The returned dataframe contains the contribution of each feature towards the prediction, along with the base_value, score, probability, multiplier, and support.
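
To make the structure of such a dataframe concrete, here is a toy version built by hand. It assumes the score is simply the base value plus the sum of the feature contributions, which is my reading of an additive explanation rather than something this post spells out. The numbers are invented, and the probability, multiplier, and support columns are left out because their calculation isn't described here.

```python
import pandas as pd

base_value = 0.27  # illustrative average churn rate

# Per-feature contributions for two hypothetical customers.
contributions = pd.DataFrame(
    {
        "Tenure Months": [0.12, -0.03],
        "Monthly Charges": [0.05, -0.02],
        "Contract": [0.08, -0.13],
    },
    index=["customer_a", "customer_b"],
)

explained = contributions.copy()
explained["base_value"] = base_value
# Assumption: contributions are additive on top of the base value.
explained["score"] = base_value + contributions.sum(axis=1)

# The feature that moved each prediction furthest from the base value.
top_driver = contributions.abs().idxmax(axis=1)

print(explained)
print(top_driver)
```

Reading a row this way is what makes levers actionable: for customer_b, Contract is the biggest driver pulling the score below the base value.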

Explain Observations as a Waterfall Plot

python

model.local_explainer(X_test, subsample=15)

The local explainer provides the same information as the predict_explain() method but allows us to visualise each observation nicely, one at a time.

Create Scenarios

When you want to simulate future events and understand the impact of each feature, you can utilise the scenario analysis tool. Here is what it looks like with the Telco Churn data:

python

from xplainable.gui.screens.scenario import ScenarioClassification

scenario = ScenarioClassification(model)
scenario.run()

This screen allows you to run scenarios quickly and understand how predicted event outcomes are derived.

GPT Explainer

When you have a valid xplainable API key or access to the xplainable web app, you can generate automated, comprehensive text insights and explanations of the model using ChatGPT.

Here is an example of a report generated on the Telco Churn dataset:

python

import xplainable as xp
from xplainable.gpt import gpt_explainer
from IPython.display import Markdown

# This detects any pre-defined xplainable API keys
xp.initialise()

# Generate a report
report = gpt_explainer(
    model_id="",
    version_id="",
    target_description="If the customer churned or not",
    other_details="Identify customers who are at high risk of churning and learn the underlying factors that drive customers to leave the business for a better retention strategy."
)

# Display markdown
Markdown(report)

Xplainable is able to integrate with ChatGPT seamlessly, and honestly, the analyses are pretty impressive. Here's what we're able to extract from it:

A summary of the problem and data
A breakdown of the feature importances
A detailed analysis of each key feature and the driving factors of each
Key patterns and insights in the data
Bias detection and analysis
Conclusion and recommendations

This is a game changer when it comes to automated insights and model explainability, and it is enabled by the light architecture of xplainable models.

API Deployments With Embedded Explainers

Deploying xplainable models is easy. Once a model has been trained, you can spin up an endpoint that handles prediction and explanations in less than a second. All you have to do is have an initialised API key and run the following:

python

xp.client.deploy(
  hostname="your-xplainable-host",
  model_id="model_id",
  version_id="version_id",
  partition_id="partition_id"
)

In under a second, this model will be deployed on an Xplainable Cloud server and ready for inference — you can manage the deployment security and access keys via the xplainable web app.

Here is a real test of the deployed model using Postman:

Notice how the response time is less than 50ms, even with a prediction breakdown included? This is what opens the doors to real-time optimisation — something unachievable with surrogate models like SHAP.

Optimisation with xplainable is an enormous topic and one that deserves its own blog post. It's on the list, so stay tuned for more on real-time optimisation.

Key Takeaways

Explainable machine learning is more than just explaining models — it's about extracting meaningful insights and understanding the whole data science lifecycle from data preprocessing to model deployment to drive better decision-making.

Xplainable offers a means of achieving both data science and business intelligence objectives in one workflow that bridges the gap between data professionals and business decision-makers.

If you've come this far — thank you.

How to Support Xplainable

Star us on GitHub.
Follow us on LinkedIn.

Our Commitment to Open Source — Calling on Contributors

We actively encourage collaboration and contribution from the open-source community. We believe that diverse perspectives and collective expertise foster high-quality software development. By making Xplainable open source, we aim to build a vibrant community of users, contributors, and enthusiasts who can collectively shape the future of explainable machine learning.