Uncovering the Power of Explainable Machine Learning in Natural Language Processing

Discover how explainable machine learning can help you identify and filter out spam SMS messages with greater transparency and understanding.

Jamie Tuppack, 25 Sept 2023

OVERVIEW

Spam classification is a common application of natural language processing (NLP), with the goal of identifying and filtering out unwanted SMS messages from a user's inbox. Traditional machine learning methods achieve high accuracy in spam classification but often lack transparency and interpretability, making it difficult to understand how and why a model made a particular decision.

Explainable machine learning is an increasingly important field within NLP that aims to address this issue by providing more transparent and interpretable models. In this article, we will explore the importance of explainable machine learning in spam classification and demonstrate its benefits through a case study on an open-source dataset. We will use the xplainable natural language capability to classify SMS messages as spam or not spam, and to understand the factors that contribute to a model's decision.

THE DATA

The SMS Spam Collection is a dataset of SMS messages that has been compiled for research on SMS spam. It includes a set of 5,574 SMS messages in English, each labeled as either legitimate or "spam." You can find a sample of the raw data below:

TEXT	TARGET
Ok i've sent u da latest version of da project.	not spam
K I'm leaving soon, be there a little after 9	not spam
This is wishing you a great day. Moji told me about your offer and as always i was speechless. You offer so easily to go to great lengths on my behalf and its stunning. My exam is next friday. After that i will keep in touch more. Sorry.	not spam
Hey!!! I almost forgot ... Happy B-day babe ! I love ya!!	not spam
For real tho this sucks. I can't even cook my whole electricity is out. And I'm hungry.	not spam
Text PASS to 69669 to collect your polyphonic ringtones. Normal gprs charges apply only. Enjoy your tones	spam
Free entry to the gr8prizes wkly comp 4 a chance to win the latest Nokia 8800, PSP or £250 cash every wk.TXT GREAT to 80878 http//www.gr8prizes.com 08715705022	spam
Hi im having the most relaxing time ever! we have to get up at 7am every day! was the party good the other night? I get home tomorrow at 5ish.	not spam
Please CALL 08712402578 immediately as there is an urgent message waiting for you	spam
Hmmm... Guess we can go 4 kb n power yoga... Haha, dunno we can tahan power yoga anot... Thk got lo oso, forgot liao...	not spam
---	---

It includes 425 manually extracted spam messages from the Grumbletext website, a UK forum where users share experiences of receiving SMS spam. The dataset also includes 3,375 randomly chosen legitimate messages from the NUS SMS Corpus, a collection of around 10,000 messages gathered by the National University of Singapore, and 450 legitimate messages from a PhD thesis. Additionally, the dataset includes 1,002 legitimate and 322 spam messages from the SMS Spam Corpus v.0.1 Big, which has been used in academic research. The original dataset can be found here.
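As a quick consistency check, the per-source counts quoted above can be added up. The short sketch below uses only the figures stated in this section; the dictionary layout and variable names are illustrative.

```python
# Message counts per source, as stated in the text above.
sources = {
    "Grumbletext": {"spam": 425, "not spam": 0},
    "NUS SMS Corpus": {"spam": 0, "not spam": 3375},
    "PhD thesis": {"spam": 0, "not spam": 450},
    "SMS Spam Corpus v.0.1 Big": {"spam": 322, "not spam": 1002},
}

spam_total = sum(s["spam"] for s in sources.values())
ham_total = sum(s["not spam"] for s in sources.values())
total = spam_total + ham_total
spam_rate = spam_total / total

print(f"spam: {spam_total}, not spam: {ham_total}, total: {total}")
print(f"spam rate: {spam_rate:.1%}")
```

The totals come to 747 spam and 4,827 legitimate messages, which matches the 5,574 messages quoted earlier and means roughly 13.4% of the dataset is spam. That imbalance is worth keeping in mind when reading accuracy figures.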

FEATURE EXTRACTION

Feature extraction is a crucial step in natural language processing (NLP) that involves identifying and extracting relevant information from raw text data. This process allows NLP models to more effectively analyze and interpret the text, leading to improved performance and accuracy. The feature extraction methods chosen for the spam dataset were:

The count of characters

This method involves counting the total number of characters in a document or text. This could be useful for tasks such as identifying short versus long texts, or analyzing the readability of a text.

The count of words

This method returns the count of words in a text. Word count is another measure of the length of a text. For spam detection it can be a valuable feature: spammers often keep messages short, because they want to grab the recipient's attention and prompt quick action before the whole message has been read.

The sum of positive contributing words

This method involves scoring words by their association with spam, mapping each word in the document or text to its score, and summing the scores that are greater than 0.

The sum of negatively contributing words

This method involves scoring words by their association with spam, mapping each word in the document or text to its score, and summing the scores that are less than 0.
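The two word-score features can be sketched in plain Python. The article does not say how xplainable scores words, so the smoothed log-odds score below is an assumption chosen for illustration, and the function names (`tokenize`, `fit_word_scores`, `contribution_sums`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def fit_word_scores(texts, labels, smoothing=1.0):
    """Score each word by its smoothed log-odds of appearing in spam.

    Assumption: log-odds is one common scoring choice; it is not
    necessarily the scoring used by xplainable.
    """
    spam_counts, ham_counts = Counter(), Counter()
    for text, label in zip(texts, labels):
        target = spam_counts if label == "spam" else ham_counts
        target.update(tokenize(text))
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values()) + smoothing * len(vocab)
    ham_total = sum(ham_counts.values()) + smoothing * len(vocab)
    return {
        word: math.log((spam_counts[word] + smoothing) / spam_total)
        - math.log((ham_counts[word] + smoothing) / ham_total)
        for word in vocab
    }


def contribution_sums(text, scores):
    """Return (sum of positive word scores, sum of negative word scores)."""
    # Words never seen during fitting contribute nothing.
    word_scores = [scores.get(word, 0.0) for word in tokenize(text)]
    positive = sum(s for s in word_scores if s > 0)
    negative = sum(s for s in word_scores if s < 0)
    return positive, negative


# Two messages from the sample table, used as a toy training set.
texts = [
    "Please CALL 08712402578 immediately as there is an urgent message waiting for you",
    "K I'm leaving soon, be there a little after 9",
]
labels = ["spam", "not spam"]
scores = fit_word_scores(texts, labels)
positive, negative = contribution_sums("urgent message, call now", scores)
```

With this toy training set, "urgent", "message" and "call" all carry positive scores, so the positive sum is greater than zero and the negative sum stays at zero. A real model would fit the scores on the full training data.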

The count of upper case words

This method returns the count of uppercase words in a text. Proper nouns, names, and emphasis are usually written with uppercase letters, so a higher number of uppercase words may indicate that the text is more formal or is stressing its own importance.

The count of numbers

This method returns the count of numbers in a text. Numbers can indicate the presence of numerical values, measurements, or quantities. For spam detection, the count of numbers can be a valuable feature, as spammers tend to include phone numbers, money figures, and other numerical values to make the message seem more exciting or important and grab the recipient's attention.

The count of punctuation

This method returns the count of punctuation characters in a text. Spammers tend to use more punctuation than regular users because they want to grab the recipient's attention by making the text look more exciting or important. A higher punctuation count may therefore indicate that a text is spam.

The average word length

This method returns the average word length in a text. Average word length can be used to measure the complexity of a text, as a higher average generally indicates a more complex text. It is also a rough indicator of readability, as shorter words are generally easier to read and understand. For spam detection, average word length can be a valuable feature: spammers tend to use shorter words and simple language to make the message appeal to a wider audience and to increase the chances that recipients will open and read it. A lower average word length may therefore indicate that a text is more likely to be spam, and including this feature in a machine learning model can help to improve the accuracy of spam identification.

Each feature was automatically extracted using xplainable's NLP preprocessor.
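To make the simpler features concrete, here is a minimal plain-Python sketch of the character, word, uppercase-word, number, and punctuation counts and the average word length. It is not xplainable's preprocessor: the function name, the feature names, and the exact definitions (for example, what counts as a "number") are assumptions made for illustration.

```python
import string


def extract_features(text):
    """Compute simple surface features for one message."""
    words = text.split()
    # Strip surrounding punctuation so "CALL," still counts as "CALL".
    stripped = [w.strip(string.punctuation) for w in words]
    stripped = [w for w in stripped if w]
    # Words of two or more letters written entirely in capitals.
    uppercase = [w for w in stripped if len(w) > 1 and w.isalpha() and w.isupper()]
    # Whitespace-separated tokens made up only of digits.
    numbers = [w for w in stripped if w.isdigit()]
    return {
        "char_count": len(text),
        "word_count": len(words),
        "uppercase_word_count": len(uppercase),
        "number_count": len(numbers),
        "punctuation_count": sum(1 for ch in text if ch in string.punctuation),
        "avg_word_length": (
            sum(len(w) for w in stripped) / len(stripped) if stripped else 0.0
        ),
    }


features = extract_features(
    "Please CALL 08712402578 immediately as there is an urgent message waiting for you"
)
for name, value in features.items():
    print(name, value)
```

Single capital letters such as "I" are deliberately excluded from the uppercase count here, since they appear in most legitimate messages; a different definition would be equally defensible.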

THE MODEL

xplainable combines modern machine learning techniques to produce an entirely transparent model without needing external methods such as Shapley values. This approach allows us to understand the critical textual features of spam and to obtain accurate probabilities of a text being spam.

We can assess the model's effectiveness by calculating several metrics: accuracy, precision, recall, and the F1 score. In short, these measure the balance of true positives, true negatives, false positives, and false negatives.
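For reference, the four metrics follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the counts passed in at the end are hypothetical and are not results from the model in this article.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of the messages flagged as spam, how many really were spam.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real spam messages, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=950, fp=10, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```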

To optimize the model for our objective, we can adjust the balance between precision and recall by choosing a decision threshold on the x-axis of the score probability plot. This allows us to understand how many false positives and false negatives we can expect at different thresholds and choose the threshold that best meets our needs.

In the context of spam classification, precision is generally more important than recall. This is because it is more important to minimize the number of legitimate SMS messages (or emails) that are mistakenly marked as spam (false positives) than it is to maximize the number of spam messages that are correctly identified (true positives). Mistakenly blocking a valid SMS message can have adverse outcomes, whereas missing some spam is less critical, as most users can easily find and remove spam messages from their inbox.

Therefore, we can set the decision threshold to maximize precision and minimize the number of false positives. This helps to ensure that important messages are not blocked while the model still identifies spam SMS messages accurately.
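One way to turn this into a procedure is to scan candidate thresholds and keep the lowest one that meets a target precision, which preserves as much recall as possible. The sketch below assumes scores on a 0-100 scale and labels where 1 means spam; the function names and the toy data are made up for illustration.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when scores >= threshold are flagged as spam."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def lowest_threshold_for_precision(scores, labels, target_precision):
    """Return (threshold, precision, recall) for the lowest qualifying threshold.

    Returns None if no candidate threshold reaches the target precision.
    """
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall_at(scores, labels, threshold)
        if precision >= target_precision:
            return threshold, precision, recall
    return None


# Made-up scores (0-100) and labels (1 = spam), for illustration only.
scores = [5, 12, 20, 35, 48, 55, 62, 70, 88, 95]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
result = lowest_threshold_for_precision(scores, labels, target_precision=1.0)
print(result)
```

On this toy data, demanding perfect precision pushes the threshold up to 62 and lets one spam message (score 48) through, which illustrates the trade-off: fewer blocked legitimate messages at the cost of some missed spam.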

[Figure: score probability plot. Legend: Bins, True Negative, False Positive, False Negative, True Positive.]

INTERPRETING THE DECISION THRESHOLD

The score (x-axis) is a number between 0 and 100 that acts as a proxy for the probability of the positive class, in this case the text being classified as spam. The score then maps to an actual probability (y-axis), calculated by binning the scores and observing the real spam rate within the training data for each bin. The y-axis is therefore a better representation of the expected probability. The diagonal line running from the bottom-left to the top-right shows how each score maps to a probability. The coloured bars represent the counts of spam and not-spam messages in the training data.
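The binning step described above can be sketched as follows. This is an illustrative reconstruction, not xplainable's implementation: the bin width of 10, the function names, and the toy data are all assumptions, and the bin width is assumed to divide 100 evenly.

```python
def bin_start(score, bin_width=10):
    """Lower edge of the bin a 0-100 score falls into."""
    # A score of exactly 100 joins the top bin rather than starting a new one.
    return min(int(score // bin_width) * bin_width, 100 - bin_width)


def bin_spam_rates(scores, labels, bin_width=10):
    """Map each score bin to the observed spam rate among its messages."""
    counts = {}
    for score, label in zip(scores, labels):
        start = bin_start(score, bin_width)
        spam, total = counts.get(start, (0, 0))
        counts[start] = (spam + label, total + 1)
    return {start: spam / total for start, (spam, total) in sorted(counts.items())}


def score_to_probability(score, rates, bin_width=10):
    """Look up the observed spam rate for a score; None if the bin is empty."""
    return rates.get(bin_start(score, bin_width))


# Made-up training scores (0-100) and labels (1 = spam), for illustration.
scores = [3, 8, 15, 42, 47, 49, 81, 86, 93, 100]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
rates = bin_spam_rates(scores, labels)
print(rates)
print(score_to_probability(45, rates))
```

In this toy example a score of 45 falls in the 40-50 bin, where one of three training messages was spam, so it maps to a probability of about 0.33 rather than 0.45. Bins with no training messages return `None`, which in practice would call for wider bins or interpolation.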

KEY INFLUENCERS

One of the most significant benefits of using an xplainable model is the insights it produces - in this case, it allows us to understand the key drivers of spam without spending considerable time wrangling data and generating plots.

Following the automated model training process, xplainable produced a pair of cross-filtering charts to help us visualise the model profile and understand how different text features contribute to the likelihood of spam:

[Figure: cross-filtering model profile charts, shown for the nlp_uppercase_words feature.]

The main driver for a text to be classified as spam is the sum of "positive contributing words," which are words that are typically associated with spam (claim, call, free, etc.). This factor is closely followed by the total count of numbers within the text, which is only marginally (~2%) less influential. The number of characters and the count of punctuation characters also play a large role: the greater the value of each, the more likely the text is to be classified as spam. Interestingly, there is a "maximum" length for spam texts at 24.5-26.5 words, beyond which the spam likelihood decreases. The "average word length" result is also counter-intuitive, suggesting that texts with longer words tend to be more indicative of spam. One of the main drivers for this could be the inclusion of long strings of numbers followed by exclamation marks.

Overall, a higher number and total score of positive contributing words increases the likelihood of a text being classified as spam, while a higher number of negative contributing words decreases the likelihood.

These findings match what one might expect from a spam message. Spam messages frequently include language that is urgent or insistent in nature, and they may try to sell a product or service by using exaggerated or false claims. They may also try to pressure the recipient into taking an action, such as clicking a link or providing personal information.

These are great insights for determining the general structure of spam messages, but what if we want to investigate the likelihood on a case-by-case basis? The xplainable natural language processing (NLP) scenario analysis tool helps us achieve this:

If you have read any of the other walkthroughs you'll immediately notice this is different to previous examples.

To use this feature, type a message into the text input field. The "score" and "probability" will be displayed to show the likelihood of the message being spam. If the score is over 0.5, the message will be classified as spam. To see a breakdown of the individual contributing words and the overall sentence structure, hover over the segments of the bar chart.

A possible real-world use case for the scenario tool would be a browser extension that uses the xplainable machine learning algorithm to distinguish spam messages from legitimate ones. Not only could it distinguish spam from non-spam accurately, it could also provide a rationale for each prediction, which is paramount in understanding how it arrived at a particular conclusion.

FURTHER USE CASES

In this section, we will delve deeper and examine some further use cases where xplainable can make a significant impact. Whether it's improving customer service, enhancing machine translation, or streamlining document classification, explainable machine learning has the potential to drive significant improvements across a wide range of NLP applications.

Sentiment analysis

xplainable can be used to identify the sentiment of a piece of text, such as a customer review or social media post. These models can provide explanations for why they classified a particular text as positive, negative, or neutral, which can be useful for understanding customer sentiment and making business decisions.

Chatbots

xplainable can be used to power chatbots and provide explanations for their responses to user queries. This can improve the trustworthiness and transparency of chatbots, and make them more useful for tasks like customer service.

Document Classification

xplainable can be used to classify documents, such as legal documents or medical records, into different categories. It can provide explanations for how it arrived at each classification, which can be useful for understanding the content of a document and making decisions based on it.

Question answering

xplainable can be used to answer questions posed in natural language and to provide explanations for how it arrived at its answers. This can be useful for tasks like customer support or knowledge base management, where it is important to understand the basis for a model's answer.

Work Order Classification

xplainable NLP can extract a variety of features from work orders to aid in the classification process. These features include the type of work, location, urgency, equipment, skill level required, due date, priority, and departments. For example, the model can extract information such as "maintenance," "repair," or "installation" from the text of the work order to understand the nature of the work that needs to be done. The model can also extract information such as the specific equipment or machinery that the work order pertains to, specific skills or certifications required to complete the work, due date, priority, and department responsible for completing the work order.

In conclusion, explainable machine learning is an important field within natural language processing (NLP) that aims to provide more transparent and interpretable models that can better explain their decision-making process. Whether you're looking to understand customer sentiment, improve the accuracy of your spam filter, or enhance the performance of your chatbot, xplainable can provide valuable insights and help you make more informed, trustworthy decisions.
