hidimstat.conditional_sampling.ConditionalSampler
- class hidimstat.conditional_sampling.ConditionalSampler(model_regression=None, model_categorical=None, data_type: str = 'auto', categorical_max_cardinality=10)
Bases: object
- __init__(model_regression=None, model_categorical=None, data_type: str = 'auto', categorical_max_cardinality=10)
Class used to sample from the conditional distribution $p(X^j | X^{-j})$.
- Parameters:
- model_regression : sklearn compatible estimator, optional
The model to use for continuous data.
- model_categorical : sklearn compatible estimator, optional
The model to use for categorical data. Binary data is treated as a special case of categorical data.
- data_type : str, default="auto"
The variable type. Supported types are "auto", "continuous", and "categorical". If "auto", the type is inferred from the number of unique values in the data passed to the fit method.
- categorical_max_cardinality : int, default=10
The maximum cardinality of a variable to be considered as categorical when data_type is "auto".
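The interaction between data_type and categorical_max_cardinality can be illustrated with a small sketch. The helper `infer_data_type` below is hypothetical and not part of hidimstat; it assumes the threshold is inclusive (a variable with exactly categorical_max_cardinality unique values counts as categorical), which the description above suggests but does not state outright.

```python
def infer_data_type(values, data_type="auto", categorical_max_cardinality=10):
    """Hypothetical helper (not a hidimstat function).

    Mimics the documented "auto" rule: count the unique values and
    compare the count against categorical_max_cardinality.
    """
    if data_type != "auto":
        # An explicit type is returned unchanged.
        return data_type
    n_unique = len(set(values))
    # Assumption: the comparison is inclusive.
    if n_unique <= categorical_max_cardinality:
        return "categorical"
    return "continuous"


binary = [0, 1, 1, 0, 1]                     # 2 unique values
ratings = [1, 2, 3, 4, 5, 3, 2]              # 5 unique values
measurements = [0.1 * i for i in range(50)]  # 50 unique values

print(infer_data_type(binary))
print(infer_data_type(ratings))
print(infer_data_type(measurements))
```

Under this rule, an integer-coded variable with many levels would be treated as continuous unless data_type is set explicitly or categorical_max_cardinality is raised.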
- fit(X: ndarray, y: ndarray)
Fit the model that estimates $\mathbb{E}[y | X]$.
- Parameters:
- X : ndarray
The variables used to predict the group of variables $y$.
- y : ndarray
The group of variables to predict.
- sample(X: ndarray, y: ndarray, n_samples: int = 1, random_state=None) → ndarray
Sample from the conditional distribution $p(X^j | X^{-j})$.
- Parameters:
- X : ndarray
The complement of the considered set of variables, $X^{-j}$.
- y : ndarray
The group of variables to sample, $X^j$.
- n_samples : int, default=1
The number of samples to draw.
- random_state : int, default=None
The random state to use for sampling.
- Returns:
- y_conditional : ndarray
The samples from the conditional distribution.
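To make the roles of X ($X^{-j}$) and y ($X^j$) in fit and sample concrete, here is a minimal NumPy sketch of one common way to draw from $p(X^j | X^{-j})$ for a continuous variable: estimate the conditional mean, then add shuffled residuals. This is an illustration under stated assumptions, not necessarily what ConditionalSampler does internally. The variable names, the use of ordinary least squares in place of a scikit-learn estimator, and the (n_samples, n_rows) output layout are all choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the variable of interest depends linearly on two others plus noise.
n = 200
X_minus_j = rng.normal(size=(n, 2))  # plays the role of X
X_j = X_minus_j @ np.array([1.5, -0.5]) + 0.1 * rng.normal(size=n)  # plays the role of y

# "fit" step: estimate E[X^j | X^{-j}] by ordinary least squares with an intercept.
design = np.column_stack([np.ones(n), X_minus_j])
coef, *_ = np.linalg.lstsq(design, X_j, rcond=None)

# "sample" step: keep the conditional mean and shuffle the residuals across rows.
cond_mean = design @ coef
residuals = X_j - cond_mean
n_samples = 3
X_j_conditional = np.stack(
    [cond_mean + rng.permutation(residuals) for _ in range(n_samples)]
)

print(X_j_conditional.shape)  # (3, 200)
```

Each draw preserves the part of $X^j$ explained by $X^{-j}$ while breaking any remaining link between the residual and the other data in that row, which is what distinguishes conditional sampling from a plain permutation of the column.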
Examples using hidimstat.conditional_sampling.ConditionalSampler

Pitfalls of Permutation Feature Importance (PFI) on the California Housing Dataset