Partitioners

Partitioners create groups/subsets of data for conditional permutation.
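As a rough illustration of what "conditional permutation" means here, the sketch below shuffles a feature's values only among rows that share a group label, so values never cross group boundaries. It is a plain-Python stand-in, not the library's implementation; the helper name permute_within_groups and the seed argument are assumptions made for this example.

```python
import random
from collections import defaultdict


def permute_within_groups(values, groups, seed=0):
    """Shuffle `values` only among positions that share a group label."""
    rng = random.Random(seed)

    # Collect the row positions that belong to each group.
    positions = defaultdict(list)
    for idx, g in enumerate(groups):
        positions[g].append(idx)

    permuted = list(values)
    for idxs in positions.values():
        # Shuffle this group's values and write them back to the same positions.
        shuffled = [values[i] for i in idxs]
        rng.shuffle(shuffled)
        for i, v in zip(idxs, shuffled):
            permuted[i] = v
    return permuted


values = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
groups = [0, 0, 0, 1, 1, 1]  # e.g. the array returned by a partitioner
permuted = permute_within_groups(values, groups)
```

Each group keeps its own multiset of values; only the order inside the group changes.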

ManualPartitioner

Use when you have domain knowledge about how series should be grouped.

ManualPartitioner(mapping, series_col='level')

Bases: BasePartitioner

Partitioner using a user-defined mapping dictionary.

This partitioner assigns samples to groups based on a predefined mapping from series identifiers (or other categorical values) to group labels. Useful when domain knowledge suggests natural groupings.

Example

mapping = {'MT_001': 'group_A', 'MT_002': 'group_B', 'MT_003': 'group_A'}
partitioner = ManualPartitioner(mapping, series_col='level')
groups = partitioner.fit_get_groups(X, feature='lag_1')

Initialize the manual partitioner.

Parameters:

mapping (dict[Any, Any], required): Dictionary mapping series identifiers to group labels.
series_col (str, default 'level'): Name of the column or index level containing series IDs.

Source code in src/xeries/partitioners/manual.py
def __init__(
    self,
    mapping: dict[Any, Any],
    series_col: str = "level",
) -> None:
    """Initialize the manual partitioner.

    Args:
        mapping: Dictionary mapping series identifiers to group labels.
        series_col: Name of the column or index level containing series IDs.
    """
    self.mapping = mapping
    self.series_col = series_col
    self._fitted = False
    self._group_encoder: dict[Any, int] = {}

n_groups property

Return the number of unique groups.

fit(X, feature)

Fit the partitioner (encodes group labels to integers).

Parameters:

X (DataFrame, required): Input features DataFrame.
feature (str, required): The feature to condition on (not used for manual partitioner).

Returns:

ManualPartitioner: Self for method chaining.

Source code in src/xeries/partitioners/manual.py
def fit(self, X: pd.DataFrame, feature: str) -> ManualPartitioner:
    """Fit the partitioner (encodes group labels to integers).

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on (not used for manual partitioner).

    Returns:
        Self for method chaining.
    """
    unique_groups = sorted(set(self.mapping.values()))
    self._group_encoder = {g: i for i, g in enumerate(unique_groups)}
    self._fitted = True
    return self

get_groups(X)

Get group labels for each sample based on the mapping.

Parameters:

X (DataFrame, required): Input features DataFrame with series identifiers.

Returns:

NDArray[intp]: Array of integer group labels.

Raises:

ValueError: If partitioner has not been fitted, or if X contains series IDs that are missing from the mapping.
KeyError: If series_col is not found in X.

Source code in src/xeries/partitioners/manual.py
def get_groups(self, X: pd.DataFrame) -> NDArray[np.intp]:
    """Get group labels for each sample based on the mapping.

    Args:
        X: Input features DataFrame with series identifiers.

    Returns:
        Array of integer group labels.

    Raises:
        ValueError: If partitioner has not been fitted, or if X contains
            series IDs that are missing from the mapping.
        KeyError: If series_col is not found in X.
    """
    if not self._fitted:
        raise ValueError("Partitioner must be fitted before calling get_groups")

    series_ids = self._get_series_ids(X)
    group_labels = series_ids.map(self.mapping)

    if group_labels.isna().any():
        missing = series_ids[group_labels.isna()].unique()
        raise ValueError(f"Series IDs not found in mapping: {missing.tolist()}")

    encoded = group_labels.map(self._group_encoder)
    return encoded.to_numpy().astype(np.intp)
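The fit and get_groups logic above can be traced without pandas. The following is a minimal plain-Python sketch, assuming series IDs arrive as a simple list; the function name encode_groups is hypothetical and exists only for this illustration.

```python
def encode_groups(series_ids, mapping):
    """Map series IDs to integer group codes, mirroring fit() + get_groups()."""
    # Same encoding as fit(): sorted unique labels become 0..n_groups-1.
    unique_groups = sorted(set(mapping.values()))
    encoder = {g: i for i, g in enumerate(unique_groups)}

    # Same guard as get_groups(): every series ID must appear in the mapping.
    missing = sorted({s for s in series_ids if s not in mapping})
    if missing:
        raise ValueError(f"Series IDs not found in mapping: {missing}")

    return [encoder[mapping[s]] for s in series_ids]


mapping = {'MT_001': 'group_A', 'MT_002': 'group_B', 'MT_003': 'group_A'}
series_ids = ['MT_001', 'MT_001', 'MT_002', 'MT_003']
codes = encode_groups(series_ids, mapping)  # [0, 0, 1, 0]
```

Because the labels are sorted before encoding, the integer codes depend only on the set of group labels, not on the order of the dictionary entries.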

TreePartitioner

Automatically learns subgroups using a decision tree (cs-PFI algorithm).

TreePartitioner(max_depth=4, min_samples_leaf=0.05, series_col=None, random_state=None)

Bases: BasePartitioner

Partitioner using decision tree leaf nodes for subgroup discovery.

This implements the Conditional Subgroup Permutation Feature Importance (cs-PFI) algorithm. A decision tree is trained to predict the feature of interest using all other features. The leaf nodes of this tree define homogeneous subgroups for conditional permutation.

Example

partitioner = TreePartitioner(max_depth=4, min_samples_leaf=0.05)
groups = partitioner.fit_get_groups(X, feature='lag_1')

Initialize the tree partitioner.

Parameters:

max_depth (int | None, default 4): Maximum depth of the decision tree.
min_samples_leaf (int | float, default 0.05): Minimum samples required in a leaf node. Can be int (absolute) or float (fraction of total samples).
series_col (str | None, default None): Column with series identifiers to one-hot encode. None (default) auto-detects _level_skforecast (skforecast 0.21+) or level (MultiIndex / legacy). With None, auto-detection always runs, so no series column is encoded only when neither of these columns exists.
random_state (int | None, default None): Random seed for reproducibility.

Source code in src/xeries/partitioners/tree.py
def __init__(
    self,
    max_depth: int | None = 4,
    min_samples_leaf: int | float = 0.05,
    series_col: str | None = None,
    random_state: int | None = None,
) -> None:
    """Initialize the tree partitioner.

    Args:
        max_depth: Maximum depth of the decision tree.
        min_samples_leaf: Minimum samples required in a leaf node.
            Can be int (absolute) or float (fraction of total samples).
        series_col: Column with series identifiers to one-hot encode.
            ``None`` (default) auto-detects ``_level_skforecast`` (skforecast 0.21+)
            or ``level`` (MultiIndex / legacy). With ``None``, auto-detection always
            runs, so no series column is encoded only when neither column exists.
        random_state: Random seed for reproducibility.
    """
    self.max_depth = max_depth
    self.min_samples_leaf = min_samples_leaf
    self.series_col = series_col
    self.random_state = random_state

    self._tree: DecisionTreeRegressor | None = None
    self._encoder: OneHotEncoder | None = None
    self._feature: str | None = None
    self._fitted = False
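Since min_samples_leaf is passed through to scikit-learn's DecisionTreeRegressor when the partitioner is fitted, the int/float distinction presumably follows scikit-learn's documented rule: an int is used as is, while a float is a fraction resolved as ceil(min_samples_leaf * n_samples). The helper below is a hypothetical illustration of that rule, not part of xeries or scikit-learn.

```python
import math


def effective_min_samples_leaf(min_samples_leaf, n_samples):
    """Resolve min_samples_leaf to an absolute sample count."""
    if isinstance(min_samples_leaf, int):
        # Integer: already an absolute count.
        return min_samples_leaf
    # Float: a fraction of the training set, rounded up.
    return math.ceil(min_samples_leaf * n_samples)


absolute = effective_min_samples_leaf(20, 1000)
fractional = effective_min_samples_leaf(0.25, 1000)
```

With the default of 0.05, the smallest permitted leaf grows with the dataset, which keeps each subgroup large enough to permute within.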

n_groups property

Return the number of leaf nodes (groups).

tree property

Return the fitted decision tree.

fit(X, feature)

Fit the decision tree to predict the feature of interest.

Parameters:

X (DataFrame, required): Input features DataFrame.
feature (str, required): The feature to condition on (will be predicted by tree).

Returns:

TreePartitioner: Self for method chaining.

Source code in src/xeries/partitioners/tree.py
def fit(self, X: pd.DataFrame, feature: str) -> TreePartitioner:
    """Fit the decision tree to predict the feature of interest.

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on (will be predicted by tree).

    Returns:
        Self for method chaining.
    """
    self._feature = feature

    y_tree = X[feature].values
    X_tree = self._prepare_tree_features(X, feature)

    self._tree = DecisionTreeRegressor(
        max_depth=self.max_depth,
        min_samples_leaf=self.min_samples_leaf,
        random_state=self.random_state,
    )
    self._tree.fit(X_tree, y_tree)
    self._fitted = True

    return self

get_groups(X)

Get leaf node indices as group labels.

Parameters:

X (DataFrame, required): Input features DataFrame.

Returns:

NDArray[intp]: Array of leaf node indices (group labels).

Raises:

ValueError: If partitioner has not been fitted.

Source code in src/xeries/partitioners/tree.py
def get_groups(self, X: pd.DataFrame) -> NDArray[np.intp]:
    """Get leaf node indices as group labels.

    Args:
        X: Input features DataFrame.

    Returns:
        Array of leaf node indices (group labels).

    Raises:
        ValueError: If partitioner has not been fitted.
    """
    if not self._fitted or self._tree is None or self._feature is None:
        raise ValueError("Partitioner must be fitted before calling get_groups")

    X_tree = self._prepare_tree_features(X, self._feature)
    return self._tree.apply(X_tree).astype(np.intp)
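The core of the fit / get_groups pair above can be reproduced with scikit-learn directly. This is a minimal sketch on synthetic data, assuming NumPy and scikit-learn are installed; the arrays other and lag_1 are made up for the example, and the series one-hot encoding done by _prepare_tree_features is left out.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-ins: two "other" features and a feature of interest
# that is correlated with the first of them.
other = rng.normal(size=(n, 2))
lag_1 = other[:, 0] + 0.1 * rng.normal(size=n)

# Train a tree to predict the feature of interest from the other features.
tree = DecisionTreeRegressor(max_depth=4, min_samples_leaf=0.05, random_state=0)
tree.fit(other, lag_1)

# Each row's leaf index serves as its group label.
groups = tree.apply(other).astype(np.intp)
n_groups = tree.get_n_leaves()
```

Rows that land in the same leaf have similar predicted values of lag_1 given the other features, which is what makes permuting within a leaf approximately conditional. Note that leaf indices are node IDs in the tree, so they are not necessarily contiguous integers starting at 0.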

Base Class

BasePartitioner

Bases: ABC

Abstract base class for data partitioners.

Partitioners create groups/subsets of data for conditional permutation.

fit(X, feature) abstractmethod

Fit the partitioner to the data.

Parameters:

X (DataFrame, required): Input features DataFrame.
feature (str, required): The feature to condition on.

Returns:

BasePartitioner: Self for method chaining.

Source code in src/xeries/core/base.py
@abstractmethod
def fit(self, X: pd.DataFrame, feature: str) -> BasePartitioner:
    """Fit the partitioner to the data.

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on.

    Returns:
        Self for method chaining.
    """
    ...

fit_get_groups(X, feature)

Fit and return groups in one step.

Parameters:

X (DataFrame, required): Input features DataFrame.
feature (str, required): The feature to condition on.

Returns:

NDArray[intp]: Array of group labels.

Source code in src/xeries/core/base.py
def fit_get_groups(self, X: pd.DataFrame, feature: str) -> NDArray[np.intp]:
    """Fit and return groups in one step.

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on.

    Returns:
        Array of group labels.
    """
    self.fit(X, feature)
    return self.get_groups(X)

get_groups(X) abstractmethod

Get group labels for each sample in X.

Parameters:

X (DataFrame, required): Input features DataFrame.

Returns:

NDArray[intp]: Array of group labels with same length as X.

Source code in src/xeries/core/base.py
@abstractmethod
def get_groups(self, X: pd.DataFrame) -> NDArray[np.intp]:
    """Get group labels for each sample in X.

    Args:
        X: Input features DataFrame.

    Returns:
        Array of group labels with same length as X.
    """
    ...
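A custom partitioner only needs to implement fit and get_groups; fit_get_groups then comes for free from the base class. The sketch below uses a stand-in BasePartitioner that mirrors the interface documented above (it is not the xeries class) and a hypothetical SignPartitioner. For simplicity X is a dict of lists rather than a DataFrame and the result is a plain list rather than an NDArray.

```python
from abc import ABC, abstractmethod


class BasePartitioner(ABC):
    """Stand-in mirroring the documented interface (not the xeries class)."""

    @abstractmethod
    def fit(self, X, feature):
        ...

    @abstractmethod
    def get_groups(self, X):
        ...

    def fit_get_groups(self, X, feature):
        # Fit, then return the groups for the same data.
        self.fit(X, feature)
        return self.get_groups(X)


class SignPartitioner(BasePartitioner):
    """Hypothetical partitioner: groups rows by the sign of one column."""

    def __init__(self, column):
        self.column = column
        self._fitted = False

    def fit(self, X, feature):
        # Nothing to learn; `feature` is accepted only to match the interface.
        self._fitted = True
        return self

    def get_groups(self, X):
        if not self._fitted:
            raise ValueError("Partitioner must be fitted before calling get_groups")
        # Group 0 for negative values, group 1 otherwise.
        return [0 if v < 0 else 1 for v in X[self.column]]


X = {"temp": [-2.0, 3.5, 0.0, -0.1], "lag_1": [1.0, 2.0, 3.0, 4.0]}
groups = SignPartitioner("temp").fit_get_groups(X, feature="lag_1")  # [0, 1, 1, 0]
```

Returning self from fit keeps the method-chaining behaviour described for the built-in partitioners.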