Evaluating Attribution Methods in Machine Learning Interpretability

Qudrat E. Alahy Ratul, Edoardo Serra, Alfredo Cuzzocrea

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Interpretability is a key feature for broadening the conscious adoption of machine learning models in domains involving safety, security, and fairness. One approach to interpreting complex machine learning models is to explain their outcomes through input feature attribution. Attribution scores the features of an input instance by establishing how important each feature value is, in a fixed instance, for obtaining a specific classification outcome from the machine learning model. In the literature, some attribution methods are defined for specific machine learning models (e.g., neural networks), while more general ones are model agnostic (i.e., they can interpret any machine learning model). Attribution is particularly appreciated because its interpretation, the attribution itself, is easy to understand. In domains involving safety, security, and fairness, properties of the explanation such as precision and generality are crucial to establishing human trust in machine learning interpretability, and thereby in the machine learning model itself. However, while precision and generality are clearly defined for rule-based interpretation models, they are neither defined nor measured for attribution models. In this work, we propose a general methodology to estimate the degree of precision and generality of attribution methods. In addition, we propose a way to measure the consistency between two attribution methods. Our experiments focus on the two most popular model-agnostic attribution methods, SHAP and LIME, which we evaluate on two real applications in the field of attack detection. In these experiments, our proposed methodology shows that both SHAP and LIME lack precision, generality, and consistency, and that further investigation in the attribution research field is still required.
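As a purely illustrative sketch (not the paper's evaluation methodology), the following Python snippet shows how SHAP and LIME attributions can be computed for the same instance of a classifier and compared with a simple rank-correlation consistency check. The toy dataset, the random-forest model, and the Spearman-based measure are assumptions for illustration; the paper defines its own precision, generality, and consistency measures.

import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
import shap
from lime.lime_tabular import LimeTabularExplainer

# Toy stand-in for an attack-detection dataset (assumption for illustration)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# SHAP attributions for one instance (TreeExplainer suits tree ensembles)
shap_out = shap.TreeExplainer(model).shap_values(X[:1])
# Older shap versions return one array per class; keep the positive class
shap_vec = shap_out[1][0] if isinstance(shap_out, list) else shap_out[0, :, 1]

# LIME attributions for the same instance
lime_explainer = LimeTabularExplainer(X, mode="classification")
exp = lime_explainer.explain_instance(X[0], model.predict_proba, num_features=10)
weights = dict(exp.as_map()[1])  # {feature index: weight} for the positive class
lime_vec = np.array([weights.get(i, 0.0) for i in range(10)])

# A crude consistency check: rank correlation between the two attribution vectors
rho, _ = spearmanr(shap_vec, lime_vec)
print(f"Spearman rank correlation between SHAP and LIME attributions: {rho:.3f}")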

Original language: American English
Title of host publication: Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
Editors: Yixin Chen, Heiko Ludwig, Yicheng Tu, Usama Fayyad, Xingquan Zhu, Xiaohua Tony Hu, Suren Byna, Xiong Liu, Jianping Zhang, Shirui Pan, Vagelis Papalexakis, Jianwu Wang, Alfredo Cuzzocrea, Carlos Ordonez
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5239-5245
Number of pages: 7
ISBN (Electronic): 9781665439022
DOIs
State: Published - 2021
Event: 2021 IEEE International Conference on Big Data, Big Data 2021 - Virtual, Online, United States
Duration: 15 Dec 2021 - 18 Dec 2021

Publication series

Name: Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021

Conference

Conference: 2021 IEEE International Conference on Big Data, Big Data 2021
Country/Territory: United States
City: Virtual, Online
Period: 15/12/21 - 18/12/21

Keywords

  • Attribution Methods
  • Evaluation Methodology
  • Machine Learning Interpretability

EGS Disciplines

  • Computer Sciences
