TY - GEN
T1 - Evaluating Attribution Methods in Machine Learning Interpretability
AU - Alahy Ratul, Qudrat E.
AU - Serra, Edoardo
AU - Cuzzocrea, Alfredo
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Interpretability is a key feature to broaden the conscious adoption of machine learning models in domains involving safety, security, and fairness. To achieve the interpretability of complex machine learning models, one approach consists in explaining the outcome of machine learning models through input feature attribution. Attribution consists in scoring the features of an input instance by establishing how important each feature value is, in a fixed instance, for obtaining a specific classification outcome from the machine learning model. In the literature, several attribution methods are defined either for specific machine learning models (e.g., neural networks) or as more general, model-agnostic ones (i.e., able to interpret any machine learning model). Attribution is particularly appreciated because its interpretation, the attribution itself, is easy to understand. In domains involving safety, security, and fairness, properties of the explanation such as precision and generality are crucial to establish human trust in machine learning interpretability and, in turn, in the machine learning model itself. However, even if precision and generality are clearly defined for rule-based interpretation models, they are neither defined nor measured for attribution models. In this work, we propose a general methodology to estimate the degree of precision and generality of attribution methods. In addition, we propose a way to measure the consistency in attribution between two attribution methods. Our experiments focus on the two most popular model-agnostic attribution methods, SHAP and LIME, and we evaluate them on two real applications in the field of attack detection. In these experiments, our proposed methodology shows that both SHAP and LIME lack precision, generality, and consistency, and that further investigation in the attribution research field is still required.
AB - Interpretability is a key feature to broaden the conscious adoption of machine learning models in domains involving safety, security, and fairness. To achieve the interpretability of complex machine learning models, one approach consists in explaining the outcome of machine learning models through input feature attribution. Attribution consists in scoring the features of an input instance by establishing how important each feature value is, in a fixed instance, for obtaining a specific classification outcome from the machine learning model. In the literature, several attribution methods are defined either for specific machine learning models (e.g., neural networks) or as more general, model-agnostic ones (i.e., able to interpret any machine learning model). Attribution is particularly appreciated because its interpretation, the attribution itself, is easy to understand. In domains involving safety, security, and fairness, properties of the explanation such as precision and generality are crucial to establish human trust in machine learning interpretability and, in turn, in the machine learning model itself. However, even if precision and generality are clearly defined for rule-based interpretation models, they are neither defined nor measured for attribution models. In this work, we propose a general methodology to estimate the degree of precision and generality of attribution methods. In addition, we propose a way to measure the consistency in attribution between two attribution methods. Our experiments focus on the two most popular model-agnostic attribution methods, SHAP and LIME, and we evaluate them on two real applications in the field of attack detection. In these experiments, our proposed methodology shows that both SHAP and LIME lack precision, generality, and consistency, and that further investigation in the attribution research field is still required.
KW - Attribution Methods
KW - Evaluation Methodology
KW - Machine Learning Interpretability
UR - http://www.scopus.com/inward/record.url?scp=85125305613&partnerID=8YFLogxK
U2 - 10.1109/BigData52589.2021.9671501
DO - 10.1109/BigData52589.2021.9671501
M3 - Conference contribution
T3 - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
SP - 5239
EP - 5245
BT - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
A2 - Chen, Yixin
A2 - Ludwig, Heiko
A2 - Tu, Yicheng
A2 - Fayyad, Usama
A2 - Zhu, Xingquan
A2 - Hu, Xiaohua Tony
A2 - Byna, Suren
A2 - Liu, Xiong
A2 - Zhang, Jianping
A2 - Pan, Shirui
A2 - Papalexakis, Vagelis
A2 - Wang, Jianwu
A2 - Cuzzocrea, Alfredo
A2 - Ordonez, Carlos
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Big Data, Big Data 2021
Y2 - 15 December 2021 through 18 December 2021
ER -