3.1. Total Sobol Index
The Total Sobol Index (TSI), introduced by Sobol [1], is a widely used measure of feature importance rooted in sensitivity analysis. Intuitively, it is the drop in a predictive model's performance when a feature is removed, much like an ablation study. In general, the TSI can be expressed as:
\[\psi_j^{TSI} = \mathbb{E} \left[\mathcal{L}\left(y, \mu_{-j}(X^{-j})\right)\right] - \mathbb{E} \left[\mathcal{L}\left(y, \mu(X)\right)\right],\]
where \(X^{-j}\) denotes the feature vector without the \(j^{th}\) feature, and \(\mu_{-j}\) is a model of the same class as \(\mu\), retrained on the reduced feature set \(X^{-j}\). When \(\mathcal{L}\) is the squared loss (regression), the optimal reduced model is the conditional expectation, \(\mu_{-j}(X^{-j}) = \mathbb{E}[y | X^{-j}]\); when \(\mathcal{L}\) is the log-loss (classification), it is the conditional class probability, \(\mu_{-j}(X^{-j}) = P(y | X^{-j})\).
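As a rough illustration of the refitting idea, the sketch below estimates the index for each feature on synthetic data under the squared loss. It is only a minimal example under stated assumptions, not a reference implementation: the helper names `fit_predict_ols` and `tsi_refit` are hypothetical, the model class is assumed to be ordinary least squares, and the score is computed as the reduced model's test loss minus the full model's test loss, so that an important feature receives a positive value.

```python
import numpy as np


def fit_predict_ols(X_train, y_train, X_test):
    """Fit ordinary least squares with an intercept, then predict on X_test."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def tsi_refit(X_train, y_train, X_test, y_test, j):
    """Hypothetical helper: test MSE of the model refit without feature j,
    minus the test MSE of the full model."""
    full_pred = fit_predict_ols(X_train, y_train, X_test)
    # Drop column j and refit the same model class on the reduced features.
    reduced_pred = fit_predict_ols(
        np.delete(X_train, j, axis=1), y_train, np.delete(X_test, j, axis=1)
    )
    loss_full = np.mean((y_test - full_pred) ** 2)
    loss_reduced = np.mean((y_test - reduced_pred) ** 2)
    return loss_reduced - loss_full


# Synthetic data (an assumption for illustration): independent features,
# where feature 0 matters most, feature 1 matters a little, feature 2 is noise.
rng = np.random.default_rng(0)
n, p = 2000, 3
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Hold out half of the samples so the losses are estimated out of sample.
X_train, X_test = X[:1000], X[1000:]
y_train, y_test = y[:1000], y[1000:]

scores = [tsi_refit(X_train, y_train, X_test, y_test, j) for j in range(p)]
print(scores)
```

In this setting the features are independent with unit variance, so the estimates should be close to the squared coefficients (about 4, 0.25, and 0). With correlated features the reduced model can partly recover the removed feature from the remaining ones, and the scores shrink accordingly.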