Explainable AI -- Definition, Motivation and Application

I have been reading papers and articles for a few days, searching for ideas for my Ph.D. Qualification Exam (PQE). Since I am interested in working in the interdisciplinary field of Visualization and Machine Learning, the idea of “explainable AI” (XAI) seems promising to me. After discussing it with my professor, I decided to fix the survey topic as “Visualization for Explainable Machine Learning”. This post summarizes my understanding of the motivation, scope and application of XAI.


Clearly, the term “explainable AI” refers to AI that is “explainable”. With the boom of machine learning and deep learning, today’s AI covers a broad range of models and techniques, so the term AI is hard to define clearly. Since I am not going to discuss how to better define AI, here AI refers to systems or models that use modern machine learning techniques, rather than AI in the general sense coined by John McCarthy.

The next question is: how do we define “explainable”?

The Merriam-Webster Dictionary defines “explain” as “to give the reason for or cause of”. Based on this definition, I roughly describe “explainable AI” as AI that can provide the reason for its prediction or action to a human, in a form the human can understand.


Now that we have a definition (or at least a description) of XAI, why would we want it? Aren’t black-box machine learning models already good enough? Why bother making them explainable? Indeed, we don’t really need a digit recognizer to explain the reasons for its predictions, since we want it to run fully automatically, without human labor. But there are critical situations where fully automatic systems are not an option. In these situations, human users need to understand, appropriately trust, and effectively work with the upcoming AIs (DARPA, 2016). For example, a doctor who uses an AI tool to help determine whether a patient has cervical cancer will need that tool to be explainable. In such life-critical situations, explainability helps the doctor build trust in the system and correct a wrong diagnosis when necessary, so that the doctor can work more efficiently without sacrificing diagnostic quality. More applications that need XAI are discussed below.

The need for XAI, in my view, results from both the recent success of AI technology and its limitations. The new AI techniques are believed to be promising in many fields, including finance, healthcare and medicine, education, entertainment, and transportation. However, these successful techniques, including SVMs, probabilistic graphical models, random forests, deep learning and reinforcement learning, are difficult to interpret. That’s why these models are sometimes criticized as black boxes. In cases where AI is used as a collaborator with a human rather than as a completely automatic solution, a monitoring system that can sufficiently explain the AI is crucial.


Doshi-Velez and Kim (2017) argued that not all ML systems require interpretability (or explainability), but that explanations may highlight “incompleteness” in the problem formulation. This incompleteness refers to biases that we cannot quantify and thus cannot optimize for. This reminds me of the famous saying attributed to George Box: “All models are wrong, but some are useful”. Since all models are built on assumptions that we cannot fully validate in the real world, explainability is needed when such incompleteness is not negligible. Doshi-Velez and Kim (2017) provide a list of scenarios that need explainability, summarized below:

  • Scientific Understanding. This is for researchers and teachers. Researchers need a better understanding of models’ behavior and mechanisms. Teachers can use explainable AI to help students gain an essential understanding of how AI systems behave.

  • Safety. In some real tasks, ML systems are not always fully testable, so explainability is desirable for those who operate or monitor the system. Closely related industries include the military, finance and medical services.

  • Ethics. For example, a “fair” classifier for loan approval. Sometimes, end-to-end classifiers optimized purely for performance may be biased with respect to gender or race, which is unfair (a matter of political correctness).

  • Mismatched objectives. Sometimes an ML system is optimized for one objective (classifying possible engine failures) but used in a broader scenario (building better autonomous cars).

  • Multi-objective trade-offs. In real tasks, accuracy is not the only objective; we also want reliability and unbiasedness. There is still a lack of effective algorithms for optimizing several such objectives at once. Explainable systems can leave this task to humans.
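
To make the ethics and multi-objective points a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy loan records, the group names, the function names, and the choice of the approval-rate gap between two groups as a rough fairness measure. None of it is taken from Doshi-Velez and Kim (2017).

```python
# Hypothetical toy data: each record is (group, true_label, predicted_label),
# where 1 means "loan approved". The groups "A" and "B" are made up.
records = [
    ("A", 1, 1), ("A", 0, 1), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]


def accuracy(records):
    """Fraction of records whose prediction matches the true label."""
    correct = sum(1 for _, y, y_hat in records if y == y_hat)
    return correct / len(records)


def approval_rate(records, group):
    """Fraction of applicants in `group` that the model approves."""
    preds = [y_hat for g, _, y_hat in records if g == group]
    return sum(preds) / len(preds)


def approval_gap(records, group_a, group_b):
    """Absolute difference in approval rate between two groups
    (an assumed, simplistic stand-in for a fairness objective)."""
    return abs(approval_rate(records, group_a) - approval_rate(records, group_b))


acc = accuracy(records)
gap = approval_gap(records, "A", "B")
print(f"accuracy = {acc:.2f}, approval-rate gap = {gap:.2f}")
# prints: accuracy = 0.75, approval-rate gap = 0.50
```

On this toy data the classifier looks reasonable on accuracy alone, yet it approves one group far more often than the other. A single accuracy score hides this, which is the kind of trade-off that an explainable system would surface and leave to a human to judge.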