2.1 Definition
There is no consensus in the ML community on the definition of interpretability [1], and many scholars do not differentiate between interpretability and explainability [2]. [3] draw a clear boundary between interpretable and explainable ML: interpretable ML focuses on designing inherently interpretable models, whereas explainable ML tries to provide post hoc explanations for existing black-box models. While [4] equates interpretability with explainability and defines interpretability as the degree to which a human can understand the cause of a decision, [2] defines interpretability as the degree to which a human can understand the cause of a decision, and explainability as the degree to which a human can both understand the cause of a decision and predict the model’s result. In addition, [2] treats explanation as post hoc interpretability. Combining these definitions, we consider explainability to be post hoc interpretability, and adopt [4]’s definitions of interpretability and explanation. As for interpretable machine learning, we refer to the definition by [5]: “extraction of relevant knowledge from a machine-learning model concerning relationships either contained in data or learned by the model”.
An everyday example