Exploring Causal Inference Techniques in AI and Machine Learning
Chapter 1: Understanding Causal Relationships
This chapter delves into various methodologies for querying data to derive interpretable causal inferences.
Throughout, research and practical insights into causal inference methodologies are cited to ground the discussion in concrete AI applications.
Section 1.1: Causal Techniques in AI
To uncover causal relationships, two primary families of techniques are often employed: Graphical Methods (including Knowledge Graphs and Bayesian Belief Networks) and Explainable AI. These methodologies establish the foundation for the Association level of the Causality Hierarchy, enabling questions such as: what are the distinct properties of an entity, and how are they interconnected?
If you're keen to explore how causality integrates with Machine Learning, check out my prior article: Causal Reasoning in Machine Learning.
Section 1.2: Knowledge Graphs
Knowledge Graphs serve as a crucial graphical technique for efficiently storing and retrieving pertinent information from extensive datasets. They find applications in search engines, e-commerce platforms, and social networks. Returning to the earlier case study on Recommendation Systems, researchers such as Yikun Xian et al. have recently applied Knowledge Graphs to causality in order to enable causal inference-based recommendations.
For instance, when using a search engine to learn about Leonard Nimoy (the actor who portrayed Spock in Star Trek), the search engine constructs a Knowledge Graph based on the query, expanding to gather related data.
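To make this concrete, below is a minimal sketch of how a Knowledge Graph can be stored as subject-relation-object triples and expanded outward from a query entity. The entities, relations, and the use of networkx as an in-memory store are illustrative choices, not a production search-engine design.

```python
import networkx as nx

# Store the Knowledge Graph as subject -(relation)-> object triples.
# The entities and relations below are illustrative only.
triples = [
    ("Leonard Nimoy", "played", "Spock"),
    ("Spock", "character_in", "Star Trek"),
    ("Leonard Nimoy", "profession", "Actor"),
    ("Star Trek", "genre", "Science Fiction"),
]

kg = nx.MultiDiGraph()
for subject, relation, obj in triples:
    kg.add_edge(subject, obj, relation=relation)

def expand(entity, hops=2):
    """Return the facts reachable from a query entity within `hops` steps."""
    frontier, facts = {entity}, []
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for _, neighbor, data in kg.out_edges(node, data=True):
                facts.append((node, data["relation"], neighbor))
                next_frontier.add(neighbor)
        frontier = next_frontier
    return facts

# A search for "Leonard Nimoy" expands the graph around that entity.
for fact in expand("Leonard Nimoy"):
    print(fact)
```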
A notable application of Knowledge Graphs is in developing Machine Learning models that learn from causal relationships. Knowledge Graph Convolutional Networks (KGCN) exemplify a successful implementation in this area. These networks generate an embedded representation of a Knowledge Graph, which can subsequently be utilized by Machine Learning models to create inference pathways and substantiate predictions.
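The full KGCN architecture is beyond the scope of this article, but the core graph-convolution step can be sketched in a few lines of NumPy: each node's embedding is updated by aggregating its neighbours' features through a normalized adjacency matrix. The matrices below are toy values and do not reproduce the actual KGCN formulation.

```python
import numpy as np

# Toy adjacency matrix for a 4-node graph (1 = edge between nodes i and j).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)

A_hat = A + np.eye(4)                     # add self-loops so each node keeps its own features
D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # inverse degree matrix for normalization
H = np.random.default_rng(0).normal(size=(4, 8))  # initial node features (4 nodes, 8 dims)
W = np.random.default_rng(1).normal(size=(8, 4))  # learnable layer weights (8 -> 4 dims)

# One graph-convolution layer: aggregate neighbour features, project, apply ReLU.
H_next = np.maximum(0, D_inv @ A_hat @ H @ W)
print(H_next.shape)  # (4, 4) -- an embedded representation of the graph's nodes
```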
Section 1.3: Bayesian Belief Networks
Bayesian Belief Networks are probabilistic graphical models that factorize the relationships among a set of variables so that their joint probabilities can be computed efficiently. By examining how these variables interact, one can uncover causal links. In a Bayesian Network, nodes represent variables and directed edges encode the probabilistic dependencies between them.
A basic illustration of a three-variable Bayesian Belief Network is provided below.
These networks can express both dependent and independent connections between variables, adhering to the Markov condition. By applying Bayes' probabilistic approach, one can iteratively update connection probabilities as new evidence emerges.
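As a small illustration (the variables and probabilities below are invented for the example), here is how a three-variable network with Rain and Sprinkler as parents of WetGrass can be queried: the joint distribution is factorized along the graph, and Bayes' rule updates the belief in Rain once WetGrass is observed.

```python
from itertools import product

# Hypothetical conditional probability tables for a three-variable network:
# Rain -> WetGrass <- Sprinkler (Rain and Sprinkler are independent parents).
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.4, False: 0.6}
p_wet_given = {          # P(WetGrass=True | Rain, Sprinkler)
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.05,
}

def joint(rain, sprinkler, wet):
    """Factorized joint probability, following the network structure."""
    p_wet = p_wet_given[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_wet if wet else 1 - p_wet)

# Update the belief in Rain after observing WetGrass=True (Bayes' rule by enumeration).
evidence = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))
posterior_rain = sum(joint(True, s, True) for s in [True, False]) / evidence

print(f"P(Rain=True) prior:           {p_rain[True]:.2f}")
print(f"P(Rain=True | WetGrass=True): {posterior_rain:.2f}")
```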
Great efforts are underway at organizations like DeepMind to leverage Bayesian Belief Networks in developing Causal Bayesian Networks (CBN). CBNs help visually identify and quantitatively measure biases in datasets, revealing elements that may lead Machine Learning models to favor specific subcategories.
The first video, "Causal Inference | Answering causal questions," provides a comprehensive overview of these techniques, shedding light on their significance in AI.
Section 1.4: Explainable AI
A critical consideration in contemporary Machine Learning is the balance between model performance and complexity. Generally, sophisticated Deep Learning architectures outperform traditional linear classifiers and regression techniques across various tasks. This dichotomy was thoroughly examined in the 2016 paper "Why Should I Trust You?" by Ribeiro et al., which sparked a movement towards prioritizing interpretability in AI.
Complex models, often termed Black-boxes, are challenging to interpret and do not readily indicate the importance of individual features or their interrelations. In contrast, simpler models like decision trees and linear regression are categorized as White-boxes and offer greater transparency.
Section 1.5: Surrogate Models
To enhance model explainability, one approach involves creating surrogate models—simplified approximations that can be global or local.
- Global Surrogate Model: This model provides a single linear approximation of a complex, non-linear model that is applied across the entire input space. If the original model is highly non-linear, the global approximation may fit it poorly.
- Local Surrogate Model: Better suited to highly non-linear models, this approach divides the feature space into regions and fits a separate linear approximation within each one.
Local Surrogate Models are best known through Local Interpretable Model-agnostic Explanations (LIME).
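To make the distinction concrete, here is a minimal sketch of a global surrogate built with scikit-learn: a linear model is fitted to the predictions of a random forest, and its fidelity to the black-box is measured. The dataset and models are placeholders chosen for brevity.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Placeholder data and black-box model.
X, y = make_regression(n_samples=500, n_features=5, noise=0.1, random_state=0)
black_box = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
bb_preds = black_box.predict(X)

# Global surrogate: a linear model trained to mimic the black-box's outputs.
surrogate = LinearRegression().fit(X, bb_preds)

# Fidelity: how well the surrogate reproduces the black-box predictions.
fidelity = r2_score(bb_preds, surrogate.predict(X))
print(f"Surrogate fidelity (R^2 vs. black-box predictions): {fidelity:.3f}")
print("Surrogate coefficients:", surrogate.coef_)
```

A low fidelity score would suggest that a single linear approximation is not enough and that a local approach such as LIME is more appropriate.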
In the illustration below, we see how a fitted curve might differ between standard black-box models and surrogate techniques.
Alternative strategies to improve model explainability include Feature Importance metrics, Shapley additive explanations (SHAP), Partial Dependence Plots (PDP), and Gradient/Attention-based methods.
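As one concrete example from this list, permutation-based Feature Importance can be computed with scikit-learn in a few lines; the dataset and model below are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Placeholder data: 5 features, only some of which are informative.
X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                           random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Permutation importance: shuffle each feature in turn and measure the drop
# in score; a large drop means the model relies heavily on that feature.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature {i}: importance {importance:.3f}")
```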
Chapter 2: Addressing Bias in AI Models
The necessity for Explainable and Causal-based Machine Learning models is underscored by the need to identify and mitigate biases, which can result in unfair discrimination against certain classes. Bias may arise from either the training dataset's limitations or the model's architecture itself. Types of biases include Interaction Bias, Latent Bias, and Selection Bias.
The second video, "Why is causality key to making AI robust and trustworthy?" discusses the importance of causality in developing fair and reliable AI systems.
Contacts
For updates on my latest articles and projects, connect with me on Medium and subscribe to my mailing list. Here are my contact details:
- Personal Blog
- Personal Website
- Medium Profile
- GitHub
- Kaggle