Partitioners

Partitioners create groups/subsets of data for conditional permutation.
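This reference does not show how the group labels are consumed downstream. As a rough illustration of conditional permutation, the sketch below shuffles a feature's values only among rows that share a group label. `permute_within_groups` is a hypothetical helper written for this example, not part of the library, and it uses plain lists rather than a DataFrame.

```python
import random
from collections import defaultdict


def permute_within_groups(values, groups, seed=0):
    """Shuffle `values` only among positions that share a group label."""
    rng = random.Random(seed)
    # Collect the row positions belonging to each group.
    positions = defaultdict(list)
    for idx, g in enumerate(groups):
        positions[g].append(idx)

    permuted = list(values)
    for idxs in positions.values():
        shuffled = [values[i] for i in idxs]
        rng.shuffle(shuffled)
        # Write the shuffled values back into the same group's positions.
        for i, v in zip(idxs, shuffled):
            permuted[i] = v
    return permuted


values = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
groups = [0, 0, 0, 1, 1, 1]  # e.g. integer labels such as get_groups returns
permuted = permute_within_groups(values, groups)
```

Because values never cross a group boundary, each group keeps its own distribution of the permuted feature.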

ManualPartitioner

Use when you have domain knowledge about how series should be grouped.

`ManualPartitioner(mapping, series_col='level')`

Bases: BasePartitioner

Partitioner using a user-defined mapping dictionary.

This partitioner assigns samples to groups based on a predefined mapping from series identifiers (or other categorical values) to group labels. Useful when domain knowledge suggests natural groupings.

Example

mapping = {'MT_001': 'group_A', 'MT_002': 'group_B', 'MT_003': 'group_A'}
partitioner = ManualPartitioner(mapping, series_col='level')
groups = partitioner.fit_get_groups(X, feature='lag_1')

Initialize the manual partitioner.

Parameters:

Name	Type	Description	Default
`mapping`	`dict[Any, Any]`	Dictionary mapping series identifiers to group labels.	required
`series_col`	`str`	Name of the column or index level containing series IDs.	`'level'`

Source code in src/xeries/partitioners/manual.py

def __init__(
    self,
    mapping: dict[Any, Any],
    series_col: str = "level",
) -> None:
    """Initialize the manual partitioner.

    Args:
        mapping: Dictionary mapping series identifiers to group labels.
        series_col: Name of the column or index level containing series IDs.
    """
    self.mapping = mapping
    self.series_col = series_col
    self._fitted = False
    self._group_encoder: dict[Any, int] = {}

`n_groups` `property`

Return the number of unique groups.

`fit(X, feature)`

Fit the partitioner (encodes group labels to integers).

Parameters:

Name	Type	Description	Default
`X`	`DataFrame`	Input features DataFrame.	required
`feature`	`str`	The feature to condition on (not used for manual partitioner).	required

Returns:

Type	Description
`ManualPartitioner`	Self for method chaining.

Source code in src/xeries/partitioners/manual.py

def fit(self, X: pd.DataFrame, feature: str) -> ManualPartitioner:
    """Fit the partitioner (encodes group labels to integers).

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on (not used for manual partitioner).

    Returns:
        Self for method chaining.
    """
    unique_groups = sorted(set(self.mapping.values()))
    self._group_encoder = {g: i for i, g in enumerate(unique_groups)}
    self._fitted = True
    return self
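The encoding step in `fit` can be reproduced outside the class to see which integer each label receives. The snippet below repeats the same two expressions on the mapping from the class example. The variable names are local to this sketch.

```python
mapping = {"MT_001": "group_A", "MT_002": "group_B", "MT_003": "group_A"}

# Same two steps as fit(): deduplicate the labels, sort them, then enumerate.
unique_groups = sorted(set(mapping.values()))
group_encoder = {g: i for i, g in enumerate(unique_groups)}

print(group_encoder)  # {'group_A': 0, 'group_B': 1}
```

Since the labels are sorted before enumeration, they must be mutually comparable. A mapping that mixes, say, string and integer labels would raise a `TypeError` in `sorted`.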

`get_groups(X)`

Get group labels for each sample based on the mapping.

Parameters:

Name	Type	Description	Default
`X`	`DataFrame`	Input features DataFrame with series identifiers.	required

Returns:

Type	Description
`NDArray[intp]`	Array of integer group labels.

Raises:

Type	Description
`ValueError`	If the partitioner has not been fitted, or if any series ID in X is missing from the mapping.
`KeyError`	If series_col is not found in X.

Source code in src/xeries/partitioners/manual.py

def get_groups(self, X: pd.DataFrame) -> NDArray[np.intp]:
    """Get group labels for each sample based on the mapping.

    Args:
        X: Input features DataFrame with series identifiers.

    Returns:
        Array of integer group labels.

    Raises:
        ValueError: If partitioner has not been fitted, or if any series ID
            in X is missing from the mapping.
        KeyError: If series_col is not found in X.
    """
    if not self._fitted:
        raise ValueError("Partitioner must be fitted before calling get_groups")

    series_ids = self._get_series_ids(X)
    group_labels = series_ids.map(self.mapping)

    if group_labels.isna().any():
        missing = series_ids[group_labels.isna()].unique()
        raise ValueError(f"Series IDs not found in mapping: {missing.tolist()}")

    encoded = group_labels.map(self._group_encoder)
    return encoded.to_numpy().astype(np.intp)
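The missing-ID check relies on `Series.map` returning `NaN` for keys that are absent from the dictionary. The sketch below walks through that behaviour with a hand-built `series_ids` Series, which stands in for the output of the private `_get_series_ids` helper (not shown in this reference). Unlike `get_groups`, it collects the unmapped IDs instead of raising.

```python
import numpy as np
import pandas as pd

mapping = {"MT_001": "group_A", "MT_002": "group_B"}
group_encoder = {"group_A": 0, "group_B": 1}

# Stand-in for the series IDs that the private _get_series_ids helper returns.
series_ids = pd.Series(["MT_001", "MT_002", "MT_001", "MT_999"])

# Series.map with a dict yields NaN for keys absent from the dict.
group_labels = series_ids.map(mapping)
missing = series_ids[group_labels.isna()].unique().tolist()

# Encode only the rows that were mapped; get_groups raises ValueError instead.
known = group_labels.dropna()
encoded = known.map(group_encoder).to_numpy().astype(np.intp)
```

Here `missing` holds `MT_999`, which is the ID that would appear in the `ValueError` message.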

TreePartitioner

Automatically learns subgroups using a decision tree (cs-PFI algorithm).

`TreePartitioner(max_depth=4, min_samples_leaf=0.05, series_col=None, random_state=None)`

Bases: BasePartitioner

Partitioner using decision tree leaf nodes for subgroup discovery.

This implements the Conditional Subgroup Permutation Feature Importance (cs-PFI) algorithm. A decision tree is trained to predict the feature of interest using all other features. The leaf nodes of this tree define homogeneous subgroups for conditional permutation.

Example

partitioner = TreePartitioner(max_depth=4, min_samples_leaf=0.05)
groups = partitioner.fit_get_groups(X, feature='lag_1')

Initialize the tree partitioner.

Parameters:

Name	Type	Description	Default
`max_depth`	`int \| None`	Maximum depth of the decision tree.	`4`
`min_samples_leaf`	`int \| float`	Minimum samples required in a leaf node. Can be int (absolute) or float (fraction of total samples).	`0.05`
`series_col`	`str \| None`	Column with series identifiers to one-hot encode. `None` (default) auto-detects `_level_skforecast` (skforecast 0.21+) or `level` (MultiIndex / legacy). Passing `None` is equivalent to skipping detection only when neither column exists.	`None`
`random_state`	`int \| None`	Random seed for reproducibility.	`None`
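Since `min_samples_leaf` is passed straight to scikit-learn's `DecisionTreeRegressor`, a float value follows scikit-learn's documented rule of `ceil(fraction * n_samples)`. The arithmetic below shows how the defaults bound the number of groups for an assumed dataset of 730 rows. The row count and variable names are illustrative only.

```python
import math

n_samples = 730          # assumed dataset size for this example
max_depth = 4            # TreePartitioner default
min_samples_leaf = 0.05  # TreePartitioner default

# scikit-learn's rule for float values: ceil(fraction * n_samples).
if isinstance(min_samples_leaf, float):
    effective_min = math.ceil(min_samples_leaf * n_samples)
else:
    effective_min = min_samples_leaf

# A binary tree of depth d has at most 2**d leaves, and every leaf must
# hold at least effective_min rows, so the smaller bound wins.
max_groups = min(2 ** max_depth, n_samples // effective_min)

print(effective_min, max_groups)  # 37 16
```

With these defaults the depth limit is the binding constraint. On smaller datasets or with larger fractions, the leaf-size limit takes over.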

Source code in src/xeries/partitioners/tree.py

def __init__(
    self,
    max_depth: int | None = 4,
    min_samples_leaf: int | float = 0.05,
    series_col: str | None = None,
    random_state: int | None = None,
) -> None:
    """Initialize the tree partitioner.

    Args:
        max_depth: Maximum depth of the decision tree.
        min_samples_leaf: Minimum samples required in a leaf node.
            Can be int (absolute) or float (fraction of total samples).
        series_col: Column with series identifiers to one-hot encode.
            ``None`` (default) auto-detects ``_level_skforecast`` (skforecast 0.21+)
            or ``level`` (MultiIndex / legacy). Passing ``None`` is equivalent to
            skipping detection only when neither column exists.
        random_state: Random seed for reproducibility.
    """
    self.max_depth = max_depth
    self.min_samples_leaf = min_samples_leaf
    self.series_col = series_col
    self.random_state = random_state

    self._tree: DecisionTreeRegressor | None = None
    self._encoder: OneHotEncoder | None = None
    self._feature: str | None = None
    self._fitted = False

`n_groups` `property`

Return the number of leaf nodes (groups).

`tree` `property`

Return the fitted decision tree.

`fit(X, feature)`

Fit the decision tree to predict the feature of interest.

Parameters:

Name	Type	Description	Default
`X`	`DataFrame`	Input features DataFrame.	required
`feature`	`str`	The feature to condition on (will be predicted by tree).	required

Returns:

Type	Description
`TreePartitioner`	Self for method chaining.

Source code in src/xeries/partitioners/tree.py

def fit(self, X: pd.DataFrame, feature: str) -> TreePartitioner:
    """Fit the decision tree to predict the feature of interest.

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on (will be predicted by tree).

    Returns:
        Self for method chaining.
    """
    self._feature = feature

    y_tree = X[feature].values
    X_tree = self._prepare_tree_features(X, feature)

    self._tree = DecisionTreeRegressor(
        max_depth=self.max_depth,
        min_samples_leaf=self.min_samples_leaf,
        random_state=self.random_state,
    )
    self._tree.fit(X_tree, y_tree)
    self._fitted = True

    return self

`get_groups(X)`

Get leaf node indices as group labels.

Parameters:

Name	Type	Description	Default
`X`	`DataFrame`	Input features DataFrame.	required

Returns:

Type	Description
`NDArray[intp]`	Array of leaf node indices (group labels).

Raises:

Type	Description
`ValueError`	If partitioner has not been fitted.

Source code in src/xeries/partitioners/tree.py

def get_groups(self, X: pd.DataFrame) -> NDArray[np.intp]:
    """Get leaf node indices as group labels.

    Args:
        X: Input features DataFrame.

    Returns:
        Array of leaf node indices (group labels).

    Raises:
        ValueError: If partitioner has not been fitted.
    """
    if not self._fitted or self._tree is None or self._feature is None:
        raise ValueError("Partitioner must be fitted before calling get_groups")

    X_tree = self._prepare_tree_features(X, self._feature)
    return self._tree.apply(X_tree).astype(np.intp)
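The `fit` and `get_groups` pair above can be approximated directly with scikit-learn. This is a simplified sketch, not the library's implementation: the private `_prepare_tree_features` helper is not shown in this reference, so the example assumes it at least drops the target feature, and it skips the one-hot encoding of series identifiers. The synthetic data and column names are invented for the example.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = pd.DataFrame({"lag_1": rng.normal(size=200)})
X["lag_2"] = 0.8 * X["lag_1"] + rng.normal(scale=0.1, size=200)
X["noise"] = rng.normal(size=200)

feature = "lag_1"
# Simplification: drop the target column and skip the series one-hot step.
X_tree = X.drop(columns=[feature])

# The tree predicts the feature of interest from the remaining features.
tree = DecisionTreeRegressor(max_depth=4, min_samples_leaf=0.05, random_state=0)
tree.fit(X_tree, X[feature].to_numpy())

# apply() returns the index of the leaf each row falls into.
groups = tree.apply(X_tree).astype(np.intp)
n_groups = tree.get_n_leaves()
```

The values in `groups` are node indices of the fitted tree, so they identify leaves but are not necessarily contiguous integers starting at zero.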

Base Class

`BasePartitioner`

Bases: ABC

Abstract base class for data partitioners.

Partitioners create groups/subsets of data for conditional permutation.

`fit(X, feature)` `abstractmethod`

Fit the partitioner to the data.

Parameters:

Name	Type	Description	Default
`X`	`DataFrame`	Input features DataFrame.	required
`feature`	`str`	The feature to condition on.	required

Returns:

Type	Description
`BasePartitioner`	Self for method chaining.

Source code in src/xeries/core/base.py

@abstractmethod
def fit(self, X: pd.DataFrame, feature: str) -> BasePartitioner:
    """Fit the partitioner to the data.

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on.

    Returns:
        Self for method chaining.
    """
    ...

`fit_get_groups(X, feature)`

Fit and return groups in one step.

Parameters:

Name	Type	Description	Default
`X`	`DataFrame`	Input features DataFrame.	required
`feature`	`str`	The feature to condition on.	required

Returns:

Type	Description
`NDArray[intp]`	Array of group labels.

Source code in src/xeries/core/base.py

def fit_get_groups(self, X: pd.DataFrame, feature: str) -> NDArray[np.intp]:
    """Fit and return groups in one step.

    Args:
        X: Input features DataFrame.
        feature: The feature to condition on.

    Returns:
        Array of group labels.
    """
    self.fit(X, feature)
    return self.get_groups(X)

`get_groups(X)` `abstractmethod`

Get group labels for each sample in X.

Parameters:

Name	Type	Description	Default
`X`	`DataFrame`	Input features DataFrame.	required

Returns:

Type	Description
`NDArray[intp]`	Array of group labels with same length as X.

Source code in src/xeries/core/base.py

@abstractmethod
def get_groups(self, X: pd.DataFrame) -> NDArray[np.intp]:
    """Get group labels for each sample in X.

    Args:
        X: Input features DataFrame.

    Returns:
        Array of group labels with same length as X.
    """
    ...
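The base class follows a template-method layout: subclasses supply `fit` and `get_groups`, and `fit_get_groups` chains them. To keep the example runnable without the library, the sketch below defines a local stand-in `SketchPartitioner` with the same three methods, plus a hypothetical `SignPartitioner`. Neither class exists in the library, and a dict of lists stands in for the DataFrame.

```python
from abc import ABC, abstractmethod


class SketchPartitioner(ABC):
    """Local stand-in mirroring the three BasePartitioner methods."""

    @abstractmethod
    def fit(self, X, feature): ...

    @abstractmethod
    def get_groups(self, X): ...

    def fit_get_groups(self, X, feature):
        # Template method: subclasses only supply fit and get_groups.
        self.fit(X, feature)
        return self.get_groups(X)


class SignPartitioner(SketchPartitioner):
    """Hypothetical partitioner: group 1 if a chosen column is >= 0, else 0."""

    def __init__(self, column):
        self.column = column

    def fit(self, X, feature):
        return self  # nothing to learn for a fixed rule

    def get_groups(self, X):
        return [int(v >= 0) for v in X[self.column]]


X = {"temp": [-2.0, 3.5, 0.0, -0.1], "lag_1": [1.0, 2.0, 3.0, 4.0]}
groups = SignPartitioner("temp").fit_get_groups(X, feature="lag_1")
print(groups)  # [0, 1, 1, 0]
```

Instantiating `SketchPartitioner` directly raises `TypeError`, because both abstract methods must be overridden first.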
