
Explainable AI (XAI) makes AI systems transparent, interpretable, and credible. It ensures users understand why a given model arrived at a specific decision, so that the system is easier to verify, debug, and trust. Let’s look at all that XAI entails, followed by a real-world example of how it works.

Agentic AI represents the next evolution of intelligent systems—digital entities that not only follow instructions but also set goals, make decisions, and learn independently. These agents will soon touch every aspect of human life, from finance and energy to robotics and personalised courses. But with this exciting potential comes a critical challenge: understanding what these agents are thinking. Agentic AI does not simply follow predefined rules the way conventional systems do; it engages in self-directed behaviour shaped by constantly changing goals, interaction with the environment, and acquired knowledge, giving rise to complicated and often opaque decision processes. For example, a fully autonomous vehicle may suddenly make an unexpected manoeuvre, or a healthcare agent may recommend a treatment that falls outside standard protocol. Knowing what an AI agent has done is no longer sufficient; we need to know why it did it.

But why is this understanding so vital?

First, we are more likely to trust technologies we understand. If agentic systems are black boxes making decisions without a clear rationale, public trust will erode, hindering the adoption of such systems and the benefits derived from them. Would you trust an entity whose reasoning is opaque to control your health or finances?

Second, accountability demands it. As agents take on new roles, responsibility becomes paramount. If an autonomous agent causes harm or makes a serious blunder, who is held accountable when no one can see into the system’s reasoning?

Third, knowing why an agent made a certain decision allows its logic to be debugged and risks to be mitigated, ensuring it operates safely and reliably.

Finally, we need this understanding for further growth and improvement. Knowledge of the strengths and weaknesses of an agent’s reasoning enables algorithm refinement and helps build better systems. We learn from their successes and failures.

As intelligent agents take on greater roles in society, understanding their decision-making is no longer optional—it’s essential for trust, accountability, and safety. To fully harness their potential, we must ensure their actions are transparent and explainable. This is the critical role of explainable AI (XAI), which enables us to interpret and trust these increasingly autonomous systems.

AI, especially deep learning, often works as a ‘black box’, producing results without revealing how it arrived at them. Explainable AI is designed to render AI decisions evident and understandable to humans — it helps us understand why an AI system came to a particular conclusion.

The methodologies explainable AI uses are briefly outlined below.

Intrinsic explainability (white-box models)

This approach focuses on using inherently interpretable model architectures. These ‘white-box’ models are designed from the ground up to be transparent in their decision-making process. Examples include:

  • Decision trees: These models represent decisions as a series of hierarchical rules, making the path to a prediction easily traceable.
  • Linear regression: The relationship between input features and the output is clearly defined by coefficients, indicating the direction and magnitude of each feature’s influence.
  • Rule-based systems: Decisions are made based on a set of explicit ‘if-then’ rules that are directly understandable by humans.

While these models offer high interpretability, they often come with limitations in terms of the complexity of the patterns they can learn and their predictive power compared to black-box models.
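
To make this concrete, here is a minimal sketch of a white-box model, assuming scikit-learn is available and using its bundled Iris dataset purely for illustration. The learned if-then rules can be printed directly, so the path to any prediction stays traceable:

# White-box sketch: a shallow decision tree whose learned rules are directly readable.
# Assumes scikit-learn is installed; the Iris dataset is illustrative only.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(iris.data, iris.target)

# Print the tree as human-readable if-then rules
print(export_text(tree, feature_names=list(iris.feature_names)))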

Post-hoc explainability (black-box models)

These methods illuminate ‘the black box’ without modifying its internal workings. Some important post-hoc methods include:

  • Feature importance: These methods identify which input features had the most significant influence on a model’s prediction. Techniques like SHAP (SHapley Additive exPlanations) and permutation importance fall into this category.
  • Saliency maps: Primarily used in computer vision, these techniques highlight the regions in an input image that were most important for the model’s classification decision.
  • Local Interpretable Model-agnostic Explanations (LIME): LIME approximates the decision boundary of a complex model locally around a specific instance by training a simpler, interpretable model on perturbed versions of that instance.
  • Counterfactual explanations: These explanations identify the minimal changes to an input that would lead to a different prediction, helping users understand ‘what if’ scenarios.
  • Attention mechanisms: In models like transformers used in natural language processing, attention weights can provide insights into which parts of the input sequence the model focused on when making a prediction.
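
As a small illustration of one post-hoc technique, the sketch below applies permutation importance from scikit-learn to a black-box random forest trained on synthetic data (both the dataset and the model are assumptions made purely for this example): each feature is shuffled in turn, and the drop in test score indicates how much the model relied on it.

# Post-hoc sketch: permutation importance for a black-box model.
# Assumes scikit-learn is installed; the synthetic dataset is illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature and measure the drop in test accuracy;
# a large drop means the model leaned heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature_{i}: {importance:.3f}")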

Selecting the appropriate explainability technique depends on the specific use case and data modality. Table 1 maps common use cases to recommended XAI methods along with their rationale.

Table 1: Mapping common use cases to recommended XAI methods

Use case | Recommended technique | Why?
-------- | --------------------- | ----
Global interpretability | SHAP (SHapley Additive exPlanations) | Offers additive, consistent global explanations
Local interpretability | LIME (Local Interpretable Model-agnostic Explanations) | Explains individual predictions using local linearity
Image feature attribution | Saliency maps | Highlights important pixels influencing the decision
Tabular data with correlated features | SHAP | Handles feature interaction better than LIME

Table 2: Comparison of intrinsic vs post-hoc explainability

Aspect | Intrinsic explainability | Post-hoc explainability
------ | ------------------------ | -----------------------
Definition | Built into the model architecture itself | Interpretation applied after training
Model type | Transparent (e.g., decision trees, linear models) | Black-box (e.g., neural networks, ensemble models)
Interpretability level | High, as the logic is human-understandable | Varies, depends on technique used
Performance trade-off | May sacrifice performance for interpretability | Typically maintains model performance
Example | Logistic regression showing feature coefficients | SHAP explaining a CNN’s image classification

XAI promotes ethical use and responsibility in AI applications. As agentic AI grows, XAI will become even more important. Since these systems make decisions based on complex learning and interactions, understanding their reasoning is crucial. The key objectives of XAI are shown in Figure 1.

Figure 1: Key objectives of XAI
Figure 2: XAI case studies

The challenge of explaining agent autonomy

The shift from traditional AI to agentic AI adds complexity to explainability.

Autonomous agents are computer programs that can make decisions and act on their own without overt human intervention. As agents become more sophisticated and pervasive—self-driving vehicles, trading algorithms, and medical diagnosis systems—the need to explain and understand their actions becomes critical. The key challenges are:

  • Opaque decision logic: Most autonomous systems are powered by complex models like deep neural networks, which are ‘black boxes’. It is often extremely difficult to explain why a specific decision was made.
  • Dynamic environments: Autonomous agents routinely interact with dynamic and unstable environments, so explanations must be more context-dependent and less generalizable.
  • Real-time constraints: Decisions must be made in milliseconds. Generating explanations that are both helpful and delivered in real time adds immense complexity.
  • Multi-agent interactions: In systems with large numbers of interacting agents (e.g., swarm robotics or AI for games), an agent’s action may depend on others’ intentions as well.
  • User trust and accountability: Without explanations, users cannot trust the system, and developers cannot verify its behaviour or meet regulatory requirements.

Methods such as SHAP, LIME, counterfactual reasoning, and rule extraction assist in solving the puzzle, but complete transparency in agentic AI is still an active research frontier.

The importance of explainable AI for agentic systems becomes even clearer when we examine real-world applications and potential case studies. The case studies in Figure 2 cover a wide range of real-world scenarios, illustrating how XAI techniques can provide crucial insights, build trust, and address ethical concerns in the domains where autonomous agents are being deployed or are on the horizon.

Understanding why an agent behaves the way it does is not only a theoretical benefit but a practical necessity for achieving the full potential of agentic AI while reducing its dangers.

XAI in code: A developer’s hands-on approach

Let’s now look at how XAI can be applied in early sepsis detection using Python libraries.

Sepsis is a life-threatening medical condition affecting over 1.7 million adults in the US annually, causing more than 270,000 deaths. Early detection is vital but challenging due to its subtle symptoms. AI and machine learning can identify sepsis risk early, but clinical adoption depends on trust and understanding of model predictions. Explainable AI (XAI), particularly using SHAP (SHapley Additive exPlanations), enables doctors to interpret these predictions.

Explainability is crucial in sepsis prediction because even though AI models trained on electronic health records can outperform traditional alerts, clinicians may not trust or act on alerts they don’t understand. To gain their confidence, it’s essential to clarify which features contributed to a high sepsis risk, ensure the insights are clinically relevant, and show how reliable the predictions are based on a patient’s current and historical vitals.

There are several open source libraries for interpreting and explaining machine learning models, including LIME, ELI5, Anchor Explanations, InterpretML, DALEX, AIX360, and PyCaret, to name a few. Each has something distinct to offer, from local model interpretability to high-level visualisation and debugging capabilities. Covering all of these tools is beyond the scope of this article, so for simplicity we will demonstrate a straightforward example with the help of SHAP.

SHAP is an open source Python library for model interpretability, based on the concept of Shapley values from cooperative game theory. It explains the output of any machine learning model by computing the contribution of each feature to a specific prediction. SHAP was developed by Scott Lundberg, a researcher at Microsoft Research, and the methodology was introduced in a paper he wrote in 2017. Here’s the Python program:

!pip install shap xgboost pandas matplotlib scikit-learn --quiet

import shap
import xgboost
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Optional visual theme (the old 'seaborn-whitegrid' name was renamed in newer
# matplotlib releases, so skip the theme gracefully if it is unavailable)
try:
    plt.style.use('seaborn-v0_8-whitegrid')
except OSError:
    pass

# ------------------------
# Simulated Dataset
# ------------------------
np.random.seed(42)
n = 1000
X = pd.DataFrame({
    'heart_rate': np.random.normal(90, 15, n),
    'resp_rate': np.random.normal(22, 5, n),
    'temp': np.random.normal(98.6, 1, n),
    'wbc': np.random.normal(11, 4, n),
    'lactate': np.random.normal(1.5, 0.8, n),
    'age': np.random.normal(65, 20, n),
})

# Target Label Logic: high risk when at least two vitals are abnormal
X['sepsis_risk'] = (
    (X['heart_rate'] > 100).astype(int) +
    (X['resp_rate'] > 24).astype(int) +
    (X['lactate'] > 2.5).astype(int) +
    (X['wbc'] > 12).astype(int)
)
y = (X['sepsis_risk'] >= 2).astype(int)
X.drop(columns='sepsis_risk', inplace=True)

# ------------------------
# Train-Test Split
# ------------------------
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# use_label_encoder is no longer needed in recent xgboost releases
model = xgboost.XGBClassifier(eval_metric='logloss')
model.fit(X_train, y_train)

# Accuracy
preds = model.predict(X_test)
print(f"Test Accuracy: {accuracy_score(y_test, preds):.2f}")

# ------------------------
# Show Dataset Samples
# ------------------------
print("\nPreview of Dataset Features:")
print(X.head())

print("\nFirst Test Sample (used for SHAP waterfall/force):")
print(X_test.iloc[0])

# ------------------------
# SHAP Explainability
# ------------------------
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)

# 1. Waterfall plot (local explanation for the first test instance)
print("\n[1] Waterfall Plot: Local SHAP Explanation for First Instance")
plt.figure(figsize=(10, 5))
shap.plots.waterfall(shap_values[0], show=False)
plt.title("Waterfall Plot: SHAP Value Breakdown for First Prediction", fontsize=14)
plt.tight_layout()
plt.show()

# 2. Beeswarm plot (global feature importance)
print("[2] Beeswarm Plot: Global SHAP Feature Importance")
plt.figure(figsize=(10, 6))
shap.plots.beeswarm(shap_values, show=False)
plt.title("Beeswarm Plot: Global Feature Impact", fontsize=14)
plt.xlabel("SHAP Value (Effect on Prediction Output)", fontsize=12)
plt.ylabel("Features", fontsize=12)
plt.tight_layout()
plt.show()

# 3. Force plot (local push/pull view for the first test instance)
print("[3] Force Plot: Push and Pull Effect for First Instance")
shap.plots.force(shap_values[0], matplotlib=True)

Figure 3 shows the program output.

Figure 3: Program output
Figure 4: SHAP waterfall plot

Explainable AI with waterfall, beeswarm, and force plots

An XGBoost model is trained to predict whether a patient is at high risk of sepsis based on clinical vitals (heart rate, respiration, WBC count, etc). Since decisions in healthcare have high stakes, explainability is not optional — it is critical. SHAP (SHapley Additive exPlanations) provides transparent and mathematically grounded explanations that help:

  • Understand individual predictions (local interpretability)
  • Understand global model behaviour (global interpretability)

Waterfall plot: The SHAP waterfall plot gives a local explanation for the prediction for one individual patient. It visually breaks down how each feature pushed the prediction up or down from the model’s baseline.

Let’s dive into this plot in the context of explainable AI in healthcare (sepsis prediction).

  • The model starts with a baseline prediction (E[f(x)]) of –3.734, which is the average prediction across all patients.
  • Then, each feature’s SHAP value adds or subtracts from this base to reach the final prediction f(x) = –4.633.
  • Arrows show positive (red) or negative (blue) influence on the prediction.

The strong negative influences of WBC and heart rate outweigh the positive influence of respiratory rate, leading to an overall low predicted risk of sepsis (–4.633).
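
The same numbers can be read programmatically. Continuing with the shap_values object from the program above (and relying on the .base_values and .values attributes of SHAP’s Explanation objects), a short sketch can confirm the additivity behind the waterfall plot:

# Sketch: read the values behind the waterfall plot for the first test patient.
first = shap_values[0]
print("Base value E[f(x)]:", first.base_values)
for name, value in zip(X_test.columns, first.values):
    print(f"  {name}: {value:+.3f}")
# Additivity: base value + sum of contributions = the model's raw output f(x)
print("f(x) =", first.base_values + first.values.sum())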

Table 3: XAI significance of waterfall plot

XAI aspect | Explanation with waterfall plot
---------- | -------------------------------
Transparency | We see exactly which features drove the decision and in which direction.
Local interpretability | Tailored to this specific patient — not a general trend.
Consistency | Matches domain knowledge (e.g., very low WBC decreases risk).
Trust | Clinicians can understand the rationale — increasing confidence in model output.
Auditability | Useful in regulated environments to defend decisions on individual predictions.

Figure 5: Beeswarm plot

Beeswarm plot: Now let us see what the beeswarm plot shows.

  • Each dot = one patient in the dataset.
  • Y-axis = features used by the model.
  • X-axis = SHAP value (how much a feature pushed prediction up or down).

This plot provides a global view of feature importance across all patients in the test set.
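
For a more compact global summary than the beeswarm, the mean absolute SHAP value per feature can be computed directly. The hedged sketch below reuses shap_values from the program above (and assumes the standard .values attribute of SHAP’s Explanation objects):

# Sketch: global importance as mean |SHAP| per feature, sorted descending.
import numpy as np
mean_abs = np.abs(shap_values.values).mean(axis=0)
for name, value in sorted(zip(X_test.columns, mean_abs), key=lambda t: -t[1]):
    print(f"{name}: {value:.3f}")
# SHAP's built-in bar plot shows the same ranking visually:
shap.plots.bar(shap_values)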

Feature | Role | SHAP trend
------- | ---- | ----------
resp_rate | Most important overall | High values → higher sepsis risk
wbc | Strongly bi-directional | Low values → lower risk
heart_rate | Important signal | High values → higher risk
lactate | Moderate influence | Non-linear behaviour
temp, age | Minimal contribution | Weak or neutral
 

Feature | Insight
------- | -------
resp_rate | Most important feature. High values consistently push predictions strongly up → higher sepsis risk.
wbc | Bidirectional impact. High WBC pushes risk up (infection likely). Low WBC pushes risk down.
heart_rate | Important signal. High heart rate = elevated sepsis risk.
lactate | Moderate influence. Shows non-linear behaviour—some high values impact prediction strongly, others don’t.
temp | Weak impact. Most dots cluster around 0 SHAP value.
age | Minimal contribution overall. Likely not a key predictor in this model for sepsis.
 

Figure 6: Force plot

Force plot: The force plot visualises how different features (input variables) ‘push’ a model’s prediction away from the base value (the average model prediction across the training data).

  • Base value (grey line): The average model output across all patients. In this case, it’s around –3.73.
  • f(x) (black text): The final model prediction for this individual = –4.63, meaning the model predicts a low sepsis risk.
  • Red sections: Features that increase the predicted value (push toward higher sepsis risk). Here, resp_rate = 24.91 contributes a SHAP value of +4.2, strongly pushing toward higher risk.
  • Blue sections: Features that decrease the predicted value (push toward lower risk). Here, wbc = 1.03 contributes –3.3 and heart_rate = 98.15 contributes –1.56.

Clinically, this means:

  • The patient had a high respiration rate, which is typically a red flag for sepsis and pushes the risk up.
  • However, a very low white blood cell (WBC) count and a moderate heart rate provided strong counter-evidence, pushing the risk down.
  • Net effect: despite a major risk signal (resp_rate), the stronger negative signals led the model to predict a low risk of sepsis.

The force plot gives an interactive summary of how individual features push the prediction away from the base value. Here, features like wbc and heart_rate pulled the model away from predicting sepsis, despite resp_rate pushing in the other direction.
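
This push-and-pull reading can also be reproduced numerically. The sketch below, again reusing the shap_values object from the program above, ranks the first patient’s features by their signed contribution, so positive values push toward sepsis and negative values push away:

# Sketch: rank the first patient's features by signed SHAP contribution,
# mirroring the push/pull story told by the force plot.
contributions = sorted(zip(X_test.columns, shap_values[0].values),
                       key=lambda t: t[1], reverse=True)
for name, value in contributions:
    direction = "pushes risk up" if value > 0 else "pushes risk down"
    print(f"{name}: {value:+.3f} ({direction})")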

So the final impression is:

  • Only one feature (resp_rate) increased the predicted risk.
  • Strong negative evidence (a very low WBC) led the model to conclude a low likelihood of sepsis.

The emergence of agentic AI, with the capability to pursue goals, interact with environments, and learn through experience, is a significant revolution in artificial intelligence. Yet the sophistication of such systems raises problems of trust, accountability, safety, and ethics. Explainable AI (XAI) helps overcome these challenges by making the decision-making processes of agents transparent, ensuring correct behaviour, identifying errors, and promoting trust and cooperation.

Table 4: How XAI helped explain the result

XAI criteria | What SHAP output shows
------------ | ----------------------
Transparency | Doctor can see exactly what raised/lowered risk
Actionability | WBC is very low — probably not sepsis
Trust | Explains decision logic — clinicians more likely to trust model
Auditable | Can show regulators why a patient was (or wasn’t) flagged
Bias detection | Beeswarm shows if age, etc, is unfairly weighted

The particular characteristics of agentic systems—dynamic behaviour, goal-orientedness, and continual learning—require XAI techniques tailored to them. Traditional XAI methods are evolving to accommodate these intricacies, for example through enhanced temporal action explanation, goal-oriented behaviour explanation, and visualisation of internal states. The increased adoption of XAI in real-world applications, ranging from autonomous cars to healthcare and industry, underscores its significance. Yet more research is needed to enhance XAI methods, particularly in managing complex behaviours and adapting explanations to different users.

Ultimately, whether agentic AI will succeed or not rests with our capacity to make these systems explainable. By making explainability a priority, we can make these agents trustworthy, ethical, and aligned with human values, setting the stage for their use in society.


Disclaimer: The insights and perspectives shared in this article represent the authors’ independent views and interpretations, and not those of their organisation or employer.
