
In recent years, research has been active in various fields on measuring and collecting spectral data on the moisture content of a wide variety of plants and animals, beauty products, concrete, cement, and other materials, and on displaying these data clearly using a visualization known as an aquagram. In light of this trend, in this study we propose a method for the automatic classification of aquagrams using several explainable artificial intelligence (XAI)-based programming techniques. In doing so, we show and explain the classification process and demonstrate that it is possible, to a certain extent, to present indicative values for the validity and rationale of the classification. We selected XAI methods based on SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), the Light Gradient Boosting Machine (LightGBM), Explain Like I'm 5 (ELI5), the Partial Dependence Plot box (PDPbox), and Skater to analyze diverse datasets, in particular aquagram datasets. We thereby intend to present the field with a numerical method for illustrating the seemingly obscure processes and arguments of machine learning classification, particularly deep learning classification, which will be useful for future research. Concretely, after investigating previously obtained matrix-formed aquagram data, we describe the case of explicit classification by machine learning for four different groups of datasets on skin moisture content and moisture transpiration. The programs we use are all coded in Python and import packages such as pandas and pickle.

Introduction

Over the past several years, we have developed methods and systems for analyzing and classifying data related to skin characteristics, with a particular focus on skin moisture data. These efforts aim to address challenges associated with advancing technical capabilities, promoting knowledge sharing, and supporting educational initiatives in the context of skin dataset-based research, particularly within the field of aquaphotomics.

In this study, we begin by obtaining fundamental data on the matrices of aquagrams. Following this, we provide a comprehensive overview of several explainable artificial intelligence (XAI) libraries. XAI represents a specialized approach within data science, aimed at elucidating the reasoning processes underlying AI model predictions. In certain cases, these explanations can assist users in better understanding and adopting AI-based solutions. However, effective application of XAI methods requires a clear understanding of their usage and capabilities.

To this end, we propose utilizing sample data in conjunction with key techniques from various XAI libraries to analyze and interpret the behavior of complex AI models. Specifically, we explore the applications of SHapley Additive exPlanations (SHAP), Local Interpretable Model-Agnostic Explanations (LIME), the Light Gradient Boosting Machine (LightGBM), Explain Like I'm 5 (ELI5), the Partial Dependence Plot box (PDPbox), and Skater [1]–[23].

SHAP provides a mechanism for interpreting individual model predictions by quantifying and explicitly presenting the contribution of each feature to the predicted outcome. Similarly, LIME enables localized interpretations by highlighting the features of a specific data point that contributed to a model’s prediction, offering flexibility across various AI models and input data types. LightGBM, on the other hand, is a high-performance gradient boosting framework that constructs predictive models using a series of decision trees, making it particularly suitable for tabular data.

By integrating these XAI tools, this study aims to enhance the transparency, interpretability, and usability of AI models, particularly in applications involving complex datasets, thereby contributing to advancements in the fields of skin research and aquaphotomics.

SHAP is a method for explaining individual predictions of a model; similar to LIME, the contribution of each feature to the prediction is calculated and explicitly presented in numerical form. It is based on the "Shapley value" in game theory, which quantifies the contribution of individual players. This method has the desirable and natural property that the sum of the contributions of all features corresponds to the predicted value. In general, it is often difficult to calculate Shapley values efficiently; with SHAP, however, the contributions can be calculated by devising algorithms and exploiting the characteristics of the model being explained. In some cases, this yields only approximate or pseudo values.

LIME calculates and displays numerical values indicating which features of a single explanatory data point contributed to an AI model's prediction for that point. This approach is called a "local explanatory technique." There are essentially no restrictions on the AI models that can be explained, and in terms of its principle of operation it can be applied to any input data. In particular, libraries for tabular data, image data, and text data are available to users.

LightGBM is a model in which a large number of decision trees are connected in series. It is fast and highly accurate and has become increasingly powerful in recent years. Owing to these characteristics and its suitability for predictive models on tabular data, we adopt it in this study.

In addition, we propose utilizing the ELI5 framework to quantitatively measure feature importance and to assess, via Permutation Importance, the degree to which specific features are emphasized, followed by a discussion of the results. Partial dependence plots and individual conditional expectation (ICE) plots are generated using the PDPbox library, while Skater-based computations are conducted employing the TreeSurrogate model. To further analyze the internal decision-making processes of the AI model, we incorporate conditional branching within decision trees and subsequently discuss the outcomes.

Proposed Method and Usefulness

Method

In this study, we propose ways to combine our past diverse skin datasets, particularly the aquagrams, with recent XAI-based analysis methodologies.

We obtained subjects' (i) skin moisture and (ii) water evaporation data from measurements at beauty salons of Dr. Recella Co., Ltd. (Higashiyodogawa-ku, Osaka, Japan). In addition, for each subject we captured diverse information, in particular (i) the average skin water content measured with a Corneometer CM825 (Courage + Khazaka electronic GmbH, Köln, Germany), expressed in the manufacturer's own measurement units for this instrument, and (ii) the rate of epidermal water loss (transpiration) [g/m²/h] collected using a Tewameter (Courage + Khazaka electronic GmbH, Köln, Germany), as presented in Figs. 1 and 2.

Fig. 1. Corneometer TM CM825, Courage + Khazaka electronic GmbH, Mathias-Brüggen-Str. 91, 50829 Köln, Germany. ( https://www.courage-khazaka.com/en/scientific-products/corneometer-cm-825).

Fig. 2. Tewameter TM Hex, Courage + Khazaka electronic GmbH, Mathias-Brüggen-Str. 91, 50829 Köln, Germany. ( https://www.courage-khazaka.com/en/downloads-en/item/prospekt-tm300-e).

In this study, we selected 44 subjects. Skin quality, based on the water content and transpiration rate of the epidermis, can be classified into the following four groups (Figs. 3 and 4); a group-labeling code sketch follows the list:

Group 1: The epidermis exhibits high water content and a high transpiration rate (N = 11).

Group 2: The epidermis exhibits high water content and a low transpiration rate. This skin type is generally considered ideal (N = 11).

Group 3: The epidermis exhibits low water content and a low transpiration rate (N = 11).

Group 4: The epidermis exhibits low water content and a high transpiration rate. This skin type is generally considered suboptimal or poor (N = 11).
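The grouping above can be expressed programmatically. The following is a minimal, hypothetical sketch: the function name, the column names, and the threshold values moisture_cut and tewl_cut are placeholders standing in for the actual boundary values shown in Fig. 3.

```python
# Hypothetical sketch: assigning the four skin-quality groups from the two
# measurements. The thresholds below are placeholders, not the study's values.
import pandas as pd

def assign_group(moisture, tewl, moisture_cut, tewl_cut):
    """Return group 1-4 from Corneometer moisture and Tewameter TEWL values."""
    if moisture >= moisture_cut and tewl >= tewl_cut:
        return 1  # high moisture, high transpiration
    if moisture >= moisture_cut and tewl < tewl_cut:
        return 2  # high moisture, low transpiration (ideal)
    if moisture < moisture_cut and tewl < tewl_cut:
        return 3  # low moisture, low transpiration
    return 4      # low moisture, high transpiration (suboptimal)

# Example usage with a small hypothetical subjects table.
subjects = pd.DataFrame({"moisture": [62.1, 48.3], "tewl": [14.5, 9.8]})
subjects["group"] = [assign_group(m, t, moisture_cut=55.0, tewl_cut=12.0)
                     for m, t in zip(subjects["moisture"], subjects["tewl"])]
print(subjects)
```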

Fig. 3. Two-axial scatter plot incorporating boundary values.

Fig. 4. Average values for WAMACs and charts of aquagrams for the four groups presented earlier.

Theory

In this study, we utilize XAI-based methodologies and explore what is happening behind the algorithms. In response to recent trends in XAI, we delineate five fundamental aspects for evaluating the outputs of XAI systems.

These aspects are defined as follows:

1. Description Fidelity: This metric evaluates the extent to which the XAI system can faithfully reproduce the behavior of the original, complex AI model it is designed to explain.

2. Reliability of the Description: This metric assesses the reliability of the explanations provided by the XAI system, particularly from the perspective of the end-user or recipient of the explanation.

3. Satisfaction with Description: This index measures the degree to which the explanations provided by the XAI system meet the user’s expectations and contribute to overall user satisfaction.

4. Mental Model: This aspect examines the psychological impact of the explanations on the recipient, focusing on how the explanations influence the user’s understanding and mental representation of the AI model’s functionality. A dedicated index is used for this evaluation.

5. Affinity for Real Systems and the Real World: This metric evaluates the practical applicability and relevance of the XAI system in real-world contexts, specifically its utility in relation to AI systems operating in practical environments.

These five dimensions collectively provide a comprehensive framework for assessing the quality, reliability, and applicability of XAI outputs, thereby facilitating a deeper understanding of their effectiveness in real-world scenarios.

Phase 1

SHAP, LIME, ELI5, and Permutation Importance

SHAP is a method designed to explain individual predictions by quantifying the contribution of each feature to the predicted outcome. This technique is grounded in the Shapley value from cooperative game theory, which provides a framework for fairly distributing a reward among cooperating players. In the context of SHAP, the “reward” corresponds to the predicted value, and the “players” represent the instance’s feature values.

To compute Shapley values, SHAP simulates scenarios in which subsets of feature values are present while others are considered absent. By evaluating the changes in predictions across these scenarios, SHAP calculates the marginal contributions of each feature. For general matrix datasets, each feature value is treated as an individual player. However, in other cases, a group of feature values may collectively act as a single player. This approach ensures an equitable distribution of the predicted value across features, facilitating a transparent and interpretable explanation of the AI model’s behavior.
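To make this concrete, the following is a minimal sketch of computing SHAP values for a tree-based classifier. The data are a synthetic stand-in for the 44-subject aquagram matrix (all variable and column names are assumptions), and, depending on the installed SHAP version, shap_values is returned either as a list of per-class matrices or as a single three-dimensional array.

```python
import pandas as pd
import shap
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the aquagram matrix: 44 subjects x 12 coordinates.
X, y = make_classification(n_samples=44, n_features=12, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X = pd.DataFrame(X, columns=[f"C{i+1}" for i in range(12)])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# min_child_samples is lowered only because the synthetic dataset is tiny.
model = lgb.LGBMClassifier(objective="multiclass", num_class=4,
                           min_child_samples=2, random_state=0)
model.fit(X_train, y_train)

# TreeExplainer exploits the tree structure, so feature contributions are
# computed exactly rather than approximated by sampling feature subsets.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)   # per-class contribution matrices

# Global summary: mean |SHAP value| per feature across the test instances.
shap.summary_plot(shap_values, X_test)
```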

The Local Surrogate Model is a technique employed within the framework of LIME to interpret individual predictions generated by machine learning models. Using a surrogate model, LIME approximates the underlying prediction mechanism of the black-box model. By adopting a local rather than a global surrogate model, LIME is able to provide explanations specific to individual predictions.

Initially, the process begins by considering a black-box model that outputs a prediction when provided with input data. This black-box model can be queried as many times as necessary to investigate its behavior. The primary goal is to understand why the machine-learning model produces specific predictions. To achieve this, LIME perturbs the input data of the model and analyzes how these perturbations influence the predicted outcomes.

LIME creates a new dataset by systematically replacing feature values in the original sample. This newly generated dataset is then used to simulate and predict the outputs of the black-box model. Subsequently, an interpretable model is trained on the generated dataset, with weights assigned based on the proximity of each instance to the original data point of interest. Several methods can serve as interpretable models, including Lasso regression and decision trees. Locally, this trained surrogate model provides a reliable approximation of the black-box model’s behavior, though it does not serve as a globally accurate representation.

Despite its effectiveness, LIME fundamentally optimizes a loss function during its operation. This optimization involves selecting the maximum number of features utilized by the linear regression model, reflecting its complexity. Thus, it is essential to determine and control the model’s complexity to ensure interpretability and utility.

The process of training a local surrogate model proceeds as follows. First, the instance of interest—where the black-box prediction is to be explained—is selected. The input data is then perturbed to create new data points, for which predictions are obtained from the black-box model.
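A minimal sketch of this procedure with LIME's tabular explainer is shown below, reusing X_train, X_test, and model from the SHAP sketch above; the class names and the choice of num_features are illustrative assumptions.

```python
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=list(X_train.columns),
    class_names=["group1", "group2", "group3", "group4"],
    mode="classification",
)

instance = X_test.values[0]             # the single prediction to be explained
exp = explainer.explain_instance(
    instance,
    model.predict_proba,     # the black box is only queried, never inspected
    num_features=5,          # complexity cap of the local linear surrogate
    top_labels=1,
)

# Feature weights of the weighted sparse linear model fitted locally.
print(exp.as_list(label=exp.available_labels()[0]))
```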

The LightGBM framework supports diverse algorithms (e.g., Gradient Boosting Trees (GBT), Gradient Boosting Decision Trees (GBDT), Gradient Boosting Regression Trees (GBRT), Gradient Boosting Machine (GBM), Multiple Additive Regression Trees (MART)). LightGBM shares several advantages with eXtreme Gradient Boosting (XGBoost), including features such as sparse optimization, parallel training, support for multiple loss functions, regularization techniques, bagging, and early stopping mechanisms. A key distinction between the two algorithms lies in their approach to tree construction. Unlike many other implementations, LightGBM does not grow trees in a level-wise manner. Instead, it employs a leaf-wise growth strategy, wherein the algorithm selects the leaf that is expected to produce the greatest reduction in loss at each step.
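The leaf-wise growth strategy is controlled mainly through the num_leaves parameter. The following is a hedged configuration sketch, not the exact settings used in the study; it assumes a recent LightGBM version that provides the lgb.early_stopping callback and reuses the synthetic X_train/X_test split from the earlier sketch.

```python
import lightgbm as lgb

clf = lgb.LGBMClassifier(
    objective="multiclass",
    num_class=4,
    boosting_type="gbdt",   # gradient boosting decision trees
    num_leaves=15,          # cap on leaves per tree under leaf-wise growth
    learning_rate=0.1,
    n_estimators=200,
    min_child_samples=2,    # lowered only for the tiny synthetic dataset
    random_state=0,
)
clf.fit(X_train, y_train,
        eval_set=[(X_test, y_test)],
        eval_metric="multi_logloss",
        callbacks=[lgb.early_stopping(stopping_rounds=20)])
print("held-out accuracy:", clf.score(X_test, y_test))
```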

PDPbox

Partial dependence plots (PDPs) are a visualization tool used to examine the marginal effect of one or two features on the predictions generated by a machine learning model. PDPs provide insights into the nature of the relationship between input features and model output, indicating whether the relationship is linear, monotonic, or more complex.

In this context, we focus on classification tasks where the machine learning model outputs probability values. For such tasks, PDPs depict the probability of a specific class as a function of varying values for the selected features. It is important to emphasize that PDPs are a global interpretability method, as they take into account all instances in the dataset when constructing the plot.

For categorical features, the computation of partial dependence is straightforward. This involves assigning all instances within a dataset to a single category value at a time. By forcing this alignment, we can compute PDPs effectively for categorical variables.
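The computation just described can be written out by hand, which also clarifies what PDPbox automates. The sketch below reuses model and X_test from the earlier sketches and forces every instance to each grid value of one (here numerical, hypothetical) feature before averaging the predicted probabilities; the same forcing applies to categorical features by iterating over category values instead of a numeric grid.

```python
import numpy as np

feature = "C1"                        # hypothetical aquagram coordinate
grid = np.linspace(X_test[feature].min(), X_test[feature].max(), 10)

pd_curve = []
for value in grid:
    X_forced = X_test.copy()
    X_forced[feature] = value         # every instance takes the same value
    proba = model.predict_proba(X_forced)[:, 1]   # probability of class index 1
    pd_curve.append(proba.mean())     # average over instances = one PDP point

for v, p in zip(grid, pd_curve):
    print(f"{feature}={v:.3f}  mean P(class 1)={p:.3f}")
```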

Skater, Tree Surrogate

Here, we discuss interpretable models with Tree Surrogates using Skater. There are various ways to interpret machine learning models, such as feature importance, dependence plots, and LIME.

However, these approaches alone do not yield a single, more interpretable approximation of a very complex black-box model, such as an XGBoost model with hundreds of decision trees; a tree surrogate addresses this by training a simpler, interpretable model to mimic the black-box model's predictions.
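The tree-surrogate idea is sketched below in plain scikit-learn (a conceptual stand-in, not Skater's own API): a shallow decision tree is fitted to the black-box model's predictions, and its fidelity to the black box is measured on held-out data. It reuses model, X_train, and X_test from the earlier sketches.

```python
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

black_box_pred = model.predict(X_train)       # labels produced by the black box

surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box_pred)        # mimic the black box, not y_train

# Fidelity: how well the interpretable tree reproduces the black-box behavior.
fidelity = accuracy_score(model.predict(X_test), surrogate.predict(X_test))
print(f"surrogate fidelity on held-out data: {fidelity:.2f}")

# The extracted branching rules approximate the AI's judgment process.
print(export_text(surrogate, feature_names=list(X_train.columns)))
```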

Program

Phase 1: The program for this phase consists of four distinct steps, each with specific objectives:

1. Environment Setup and Data Preparation:

• Establish a Python-based environment and install required libraries, including those for XAI (e.g. LightGBM (Fig. 5), numpy, pandas, matplotlib, scikit-learn, seaborn).

• Preprocess target data through formatting, statistical analysis, and visual confirmation.

• Check and address dataset issues such as missing data and calculate correlation coefficients.

2. Model Preprocessing and Training:

• Preprocess tabular datasets and configure variables and functions for XAI model learning.

• Train models, calculate feature quantities, and evaluate accuracy. Repeat preprocessing if accuracy is insufficient.

3. Local Explanation Analysis with LIME:

• Use LIME to analyze model predictions and generate local explanations.

• Prepare visualization outputs (e.g. bar plots, numerical data) and confirm their validity.

• Repeat these steps for other explanatory data.

4. Kernel Width Sensitivity Analysis:

• Analyze the effect of varying the kernel width on LIME's local explanations (a code sketch follows this list).

• Address variability in results due to parameter changes and ensure parameter settings yield satisfactory explanatory data for both analysts and end-users.
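The following compact sketch covers steps 1 and 4 of this phase, reusing X_train, X_test, and model from the earlier sketches (with real data, the same checks would run on the loaded aquagram table); the kernel-width values tried are arbitrary examples.

```python
from lime.lime_tabular import LimeTabularExplainer

# Step 1: basic data checks — missing values and pairwise correlations.
print(X_train.isna().sum())
print(X_train.corr().round(2))

# Step 4: how sensitive is the local explanation to LIME's kernel width?
instance = X_test.values[0]
for kw in (0.5, 1.0, 3.0, 5.0):
    explainer = LimeTabularExplainer(
        training_data=X_train.values,
        feature_names=list(X_train.columns),
        mode="classification",
        kernel_width=kw,
    )
    exp = explainer.explain_instance(instance, model.predict_proba,
                                     num_features=5)
    print(f"kernel_width={kw}: {exp.as_list()}")
```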

Fig. 5. Example of actual predictions utilizing the LightGBM model [3], [5].

Phase 2: ELI5 and Permutation Importance (PI) for Broad and Local Explanations

ELI5 is a versatile tool compatible with various AI algorithms, particularly valuable for models lacking built-in visualization for feature importance. It supports the calculation of Permutation Importance (PI), enabling both global and local explanations for classification and regression models; a code sketch follows the list below.

1. Global Explanation:

• ELI5 enables visualization of the importance of features used by the AI model for classification and regression tasks.

• Internal model parameters are examined to identify the most influential features globally.

2. Local Explanation:

• ELI5 computes PI for individual predictions, providing localized insights into model behavior.

• It outputs explanations specific to individual prediction outcomes, detailing the contribution of features to the model’s decisions.

3. Parameter Tuning:

• ELI5 allows parameter adjustments to refine feature importance calculations.

• The gain parameter is utilized to assess the contribution of feature importance to model accuracy, confirming the discriminatory power of specific features.

This dual capability makes ELI5 a robust tool for comprehensively analyzing model behavior and feature contributions.
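A minimal sketch of this Phase 2 workflow is given below, reusing model, X_test, and y_test from the earlier sketches. It assumes a compatible eli5/scikit-learn installation; the local call relies on eli5's LightGBM integration, and the plain-text formatter is used in place of the notebook display helpers.

```python
import eli5
from eli5.sklearn import PermutationImportance

# Global explanation: permutation importance computed on held-out data.
perm = PermutationImportance(model, random_state=0).fit(X_test, y_test)
print(eli5.format_as_text(
    eli5.explain_weights(perm, feature_names=list(X_test.columns))))

# Local explanation: contribution of each feature to a single prediction.
print(eli5.format_as_text(
    eli5.explain_prediction(model, X_test.iloc[0])))
```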

Phase 3: PDPbox

From Phase 1, we can understand the overlapping feature amounts; however, that is not sufficient for an adequate explanation. Thus, in this phase, we verify what kind of change occurs in the prediction result when a feature amount changes. Here, we use the PDPbox library, which outputs PDPs. PDPbox outputs the changes in a model's prediction caused by any variable, as either a PDP or an Individual Conditional Expectation (ICE) plot (Fig. 6).

Fig. 6. Example of line graph of output data using PDP (ICE) [3], [5].

We generate the output data using a Python program (with scikit-learn), which allows us to apply the relevant operators and produce two-axis graphs; we also address the supervised learning algorithms that are compatible with it.

The PDP and ICE outputs of PDPbox support only single variables and combinations of two variables (Fig. 7). We provide both PDP and ICE outputs.

Fig. 7. Example of block chart formed by a decision-tree generated from surrogate-model-based analyses using Skater. (Accuracy of surrogate-model = 0.85) [3], [5].

Here, we show (i) the graph of the impact of a single feature, and (ii) the graph of the influence of the interaction of two features.

In the program, we can specify a PDP and draw a two-axis graph within the range of one standard deviation. We can also specify ICE and plot the results for each individual data point in a two-axis graph.
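The single-feature and two-feature outputs described above can be sketched as follows. Because PDPbox's call signatures differ between its 0.2.x and 0.3.x releases, this example substitutes scikit-learn's PartialDependenceDisplay (scikit-learn was mentioned earlier in this phase) to produce equivalent PDP and ICE plots; the feature names and the target class index are assumptions tied to the synthetic data from the earlier sketches.

```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

# (i) single-feature PDP with ICE lines for one class of the classifier.
PartialDependenceDisplay.from_estimator(
    model, X_test, features=["C1"], kind="both", target=1)

# (ii) two-feature interaction PDP (pairs only support the averaged view).
PartialDependenceDisplay.from_estimator(
    model, X_test, features=[("C1", "C2")], kind="average", target=1)

plt.show()
```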

Phase 4: Skater, TreeSurrogate

Skater is a comprehensive explanatory framework for model explanation, used to understand AI models' prediction processes by generating surrogate models, particularly decision trees. The framework provides functions for both broad and local interpretations, with key features including:

1. Broad Explanations:

• PartialDependence, FeatureImportance

2. Local Explanations:

• LimeTabularExplainer, DeepInterpreter

3. Both Broad and Local Explanations:

• Bayesian Rule List Classifier (BRLC), TreeSurrogate

Skater can also be integrated with other algorithms, such as LIME, to enhance model interpretability. In this study, we focus specifically on TreeSurrogate: we apply decision trees trained on the target model's predictions to approximate and explain the AI's judgment process. The surrogate model is used for supervised-learning discrimination and is independent of the analysis target.

Discussion

In this study, we demonstrated the utility and advantages of feature importance analysis using tools and methods such as SHAP, LIME, LightGBM, ELI5, PDPbox, and Skater. These techniques are undoubtedly valuable for supporting a wide range of algorithms.

However, as this field is still emerging, there is potential for further improvements in accuracy and stability. By adjusting the parameters for calculating feature importance, we were able to confirm the significance of feature quantities and observe their impact on AI-model behavior.

Through parameter gain analysis, we were able to visually assess feature quantities and their influence on classification accuracy. Using PDPbox, we visualized the predicted probabilities for features and generated ICE plots for individual data points.

Additionally, we could illustrate the effects of feature quantity changes on model predictions using planar graphs, such as Information Plots and PDPs. Moreover, PDPbox enabled us to identify changes in feature characteristics that occur under specific conditions, which were revealed through selective data subsets. For the target data analyzed in this study, PDPbox proved to be an effective tool for detailed exploration of the relationship between AI model predictions and feature quantities.

Conclusion and Future Tasks

With verification and future directions for XAI-based systems and aquaphotomics datasets in mind, in this study we verified the fundamental operations of XAI techniques using the SHAP, LIME, and LightGBM systems.

Additionally, we evaluated the applicability of XAI-based methods, such as ELI5, PDPbox, and Skater, in conjunction with aquaphotomics datasets. We demonstrated the potential of these approaches for providing insights into complex data structures and prediction models.

For future work, we propose expanding the scope of analysis by incorporating a wider range of aquaphotomics-related information and integrating recent advancements in data analysis and methodology. These enhancements are expected to refine the accuracy and applicability of XAI techniques in this domain. Furthermore, broadening the geographical scope of application by launching initiatives in other countries is planned.

While the trials conducted to date have posed significant challenges, they hold considerable promise for contributing to the global cosmetics and beauty industries. The outcomes of this research have the potential to support industry professionals by providing enhanced data-driven insights and advancing the field of aquaphotomics.

References

1. Hariharan S, Rejimol Robinson RR, Prasad RR, Thomas C, Balakrishnan N. XAI for intrusion detection system: comparing explanations based on global and local scope. J Comput Virol Hacking Tech. 2022;19(2):1–23. doi: 10.1007/s11416-022-00441-2.
2. Das A, Rad P. Opportunities and challenges in explainable artificial intelligence (XAI): a survey. ArXiv preprint arXiv:2006.11371. 2020.
3. Kawakura S, Hirafuji M, Ninomiya S, Shibasaki R. Analyses of diverse agricultural worker data with explainable artificial intelligence: XAI based on SHAP, LIME, and LightGBM. Eur J Agr Food Sci. 2022;4(6):11–9. doi: 10.24018/ejfood.2022.4.6.348.
4. Vollert S, Atzmueller M, Theissler A. Interpretable machine learning: a brief survey from the predictive maintenance perspective. Proceedings of the 26th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), pp. 1–8, 2021.
5. Kawakura S, Hirafuji M, Ninomiya S, Shibasaki R. Adaptations of explainable artificial intelligence (XAI) to agricultural data models with ELI5, PDPbox, and Skater using diverse agricultural worker data. Eur J Artif Intell Mach Learn. 2022;1(3):27–34. doi: 10.24018/ejai.2022.1.3.14.
6. Islam SR, Eberle W, Ghafoor SK, Ahmed M. Explainable artificial intelligence approaches: a survey. ArXiv preprint arXiv:2101.09429. 2021.
7. Dindorf C, Konradi J, Wolf C, Taetz B, Bleser G, Huthwelker J, et al. Classification and automated interpretation of spinal posture data using a pathology-independent classifier and explainable artificial intelligence (XAI). Sensors. 2021;21(18):6323. doi: 10.3390/s21186323.
8. Galhotra S, Pradhan R, Salimi B. Explaining black-box algorithms using probabilistic contrastive counterfactuals. Proceedings of the 2021 International Conference on Management of Data, pp. 577–90, 2021.
9. Bücker M, Szepannek G, Gosiewska A, Biecek P. Transparency, auditability, and explainability of machine learning models in credit scoring. J Oper Res Soc. 2022;73(1):70–90. doi: 10.1080/01605682.2021.1922098.
10. Goodwin NL, Nilsson SR, Choong JJ, Golden SA. Toward the explainability, transparency, and universality of machine learning for behavioral classification in neuroscience. Curr Opin Neurobiol. 2022;73:102544. doi: 10.1016/j.conb.2022.102544.
11. Linardatos P, Papastefanopoulos V, Kotsiantis S. Explainable AI: a review of machine learning interpretability methods. Entropy. 2020;23(1):18, 1–45. doi: 10.3390/e23010018.
12. Li XH, Cao CC, Shi Y, Bai W, Gao H, Qiu L, et al. A survey of data-driven and knowledge-aware explainable AI. IEEE Trans Knowl Data Eng. 2020;34(1):29–49. doi: 10.1109/TKDE.2020.2983930.
13. Saeed W, Omlin C. Explainable AI (XAI): a systematic meta-survey of current challenges and future opportunities. ArXiv preprint arXiv:2111.06420. 2021.
14. Duval A. Explainable artificial intelligence (XAI). MA4K9 Scholarly Report, Mathematics Institute, The University of Warwick. 2019. p. 1–53.
15. Lai V, Chen C, Liao QV, Smith-Renner A, Tan C. Towards a science of human-AI decision making: a survey of empirical studies. ArXiv preprint arXiv:2112.11471. 2021.
16. Taylor JET, Taylor GW. Artificial cognition: how experimental psychology can help generate explainable artificial intelligence. Psychon Bull Rev. 2021;28(2):454–75. doi: 10.3758/s13423-020-01825-5.
17. Speith T. A review of taxonomies of explainable artificial intelligence (XAI) methods. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, pp. 2239–50, 2022.
18. Ali S, Abuhmed T, El-Sappagh S, Muhammad K, Alonso-Moral JM, Confalonieri R, et al. Explainable Artificial Intelligence (XAI): what we know and what is left to attain Trustworthy Artificial Intelligence. Inf Fusion. 2023;99(3):101805. doi: 10.1016/j.inffus.2023.101805.
19. Chamola V, Hassija V, Sulthana R, Ghosh D, Dhingra D, Sikdar B. A review of trustworthy and explainable artificial intelligence (XAI). IEEE Access. 2023;11:78994–9015.
20. Gerlings J, Shollo A, Constantiou I. Reviewing the need for explainable artificial intelligence (xAI). ArXiv preprint arXiv:2012.01007. 2020.
21. Longo L, Brcic M, Cabitza F, Choi J, Confalonieri R, Del Ser J, et al. Explainable Artificial Intelligence (XAI) 2.0: a manifesto of open challenges and interdisciplinary research directions. Inf Fusion. 2024;106(3):102301. doi: 10.1016/j.inffus.2024.102301.
22. Černevičienė J, Kabašinskas A. Explainable artificial intelligence (XAI) in finance: a systematic literature review. Artif Intell Rev. 2024;57(8):216. doi: 10.1007/s10462-024-10854-8.
23. Haque AB, Islam AN, Mikalef P. Explainable Artificial Intelligence (XAI) from a user perspective: a synthesis of prior literature and problematizing avenues for future research. Technol Forecast Soc Change. 2023;186(2):122120. doi: 10.1016/j.techfore.2022.122120.

