Phil Package API

Phil package.

class phil.CovariateDistributionImputer(n_neighbors: int = 5, missing_values=nan, random_state=None, threshold: float = 1.0, covariance_matrix=None)

Bases: BaseEstimator

Imputer that samples from the conditional distribution P(x_j | x_{-j}) approximated via k-nearest neighbors in the observed covariate space.

fit(X, y) -> CovariateDistributionImputer
predict(X) -> ndarray
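
The idea of sampling from P(x_j | x_{-j}) through nearest neighbors can be sketched without the library. The following is a minimal standard-library illustration, not the implementation in phil: the function name knn_conditional_sample, the Euclidean distance, and the uniform draw among the k nearest donors are all assumptions made for the example.

```python
import math
import random


def knn_conditional_sample(rows, target_row, j, n_neighbors=5, rng=None):
    """Draw a value for column j of target_row from its nearest donor rows.

    Hypothetical helper (not part of phil). rows is a list of equal-length
    lists of floats, with NaN marking a missing entry.
    """
    rng = rng or random.Random()

    def is_missing(v):
        return isinstance(v, float) and math.isnan(v)

    # Columns other than j that are observed in the target row.
    obs_cols = [c for c, v in enumerate(target_row)
                if c != j and not is_missing(v)]

    donors = []
    for r in rows:
        # A donor must have column j and every comparison column observed.
        if is_missing(r[j]) or any(is_missing(r[c]) for c in obs_cols):
            continue
        dist = math.sqrt(sum((r[c] - target_row[c]) ** 2 for c in obs_cols))
        donors.append((dist, r[j]))
    if not donors:
        raise ValueError("no usable donor rows for column %d" % j)

    # Keep the k closest donors and sample one of their x_j values uniformly.
    donors.sort(key=lambda d: d[0])
    nearest = [value for _, value in donors[:n_neighbors]]
    return rng.choice(nearest)


data = [
    [1.0, 10.0],
    [1.1, 11.0],
    [0.9, 9.0],
    [5.0, 50.0],
    [5.2, 52.0],
]
incomplete = [1.05, float("nan")]
value = knn_conditional_sample(data, incomplete, j=1, n_neighbors=3,
                               rng=random.Random(0))
```

Because the draw is restricted to the three rows closest to the incomplete row in the observed column, the sampled value comes from the low cluster (9.0, 10.0, or 11.0) rather than from the whole column.
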
class phil.DistributionImputer(missing_values=nan, random_state=None, threshold=1.0)

Bases: BaseEstimator

Imputer that samples from empirical observed values.

fit(X, y)
predict(X)
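
Sampling from empirical observed values can be illustrated as follows. This is a minimal standard-library sketch under the assumption that each missing entry is replaced by a uniform draw from the observed values of the same column; the function name impute_from_observed is hypothetical and the library's actual behavior (for example, how threshold is used) is not described here.

```python
import math
import random


def impute_from_observed(rows, rng=None):
    """Fill NaN entries with random draws from each column's observed values.

    Hypothetical helper (not part of phil). rows is a list of equal-length
    lists of floats, with NaN marking a missing entry.
    """
    rng = rng or random.Random()

    def is_missing(v):
        return isinstance(v, float) and math.isnan(v)

    n_cols = len(rows[0])
    # One pool of observed values per column.
    pools = [[r[c] for r in rows if not is_missing(r[c])]
             for c in range(n_cols)]

    filled = []
    for r in rows:
        new_row = []
        for c, v in enumerate(r):
            if is_missing(v):
                if not pools[c]:
                    raise ValueError("column %d has no observed values" % c)
                new_row.append(rng.choice(pools[c]))
            else:
                new_row.append(v)
        filled.append(new_row)
    return filled


nan = float("nan")
table = [[1.0, nan], [2.0, 5.0], [nan, 7.0]]
completed = impute_from_observed(table, rng=random.Random(0))
```

Observed entries pass through unchanged, and every imputed entry is a value that actually occurs in its column, so the marginal distribution of each column is preserved in expectation.
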
class phil.ECT(config: ECTConfig)

Bases: Magic

configure(**kwargs)
generate(X: List[ndarray]) -> List[ndarray]
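
This reference does not say what generate computes. Assuming ECT stands for the Euler Characteristic Transform, the sketch below shows the transform for a 2D point cloud, where the Euler characteristic of a sublevel set is simply the number of points at or below the threshold. The function name ect_point_cloud_2d is hypothetical; the parameter names num_thetas, radius, and resolution are borrowed from ECTConfig, but the roles given to them here (evenly spaced directions, thresholds spanning [-radius, radius]) are assumptions, and scale, normalize, and seed are not modeled.

```python
import math


def ect_point_cloud_2d(points, num_thetas, radius, resolution):
    """Return a num_thetas x resolution matrix of Euler characteristic curves.

    Hypothetical helper (not part of phil). points is a list of (x, y)
    tuples; resolution must be at least 2.
    """
    # Evenly spaced thresholds from -radius to +radius (assumed convention).
    thresholds = [-radius + 2 * radius * k / (resolution - 1)
                  for k in range(resolution)]
    ect = []
    for i in range(num_thetas):
        # Evenly spaced directions on the unit circle (assumed convention).
        theta = 2 * math.pi * i / num_thetas
        dx, dy = math.cos(theta), math.sin(theta)
        heights = [x * dx + y * dy for x, y in points]
        # With vertices only, the Euler characteristic of the sublevel set
        # {height <= t} is the count of points in it.
        ect.append([sum(1 for h in heights if h <= t) for t in thresholds])
    return ect


cloud = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0)]
descriptor = ect_point_cloud_2d(cloud, num_thetas=4, radius=1.0, resolution=5)
```

Each row is a non-decreasing curve that ends at the total number of points whenever the cloud fits inside the chosen radius, so two datasets can be compared by comparing their descriptor matrices.
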
class phil.ECTConfig(*, num_thetas: int, radius: float, resolution: int, scale: int, normalize: bool = True, seed: int = 0)

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).

normalize: bool
num_thetas: int
radius: float
resolution: int
scale: int
seed: int
class phil.GridGallery

Bases: object

Collection of imputation grids optimized for specific domains.

Citations:
- Sampling/Multiverse: Wayland et al. (2025), https://www.nature.com/articles/s41560-025-01871-0
- Finance: Gu, Kelly, & Xiu (2020) on ML for asset pricing and robust ML portfolios.
- Healthcare: Stekhoven & Bühlmann (2011) on MissForest and Chen et al. (2023) on clinical imputation.
- Marketing: Anand & Mamidi (2020) / Zhang et al. (2025) on ML for consumer analytics.
- Engineering: Thomas & Rajabi (2021) and Idri et al. (2016) on systematic reviews of engineering data.

classmethod get(name: str) -> ImputationConfig
class phil.ImputationConfig(*, methods: List[str], modules: List[str], grids: List[ParameterGrid], domain_knowledge: DomainKnowledge | None = None)

Bases: BaseModel

Configuration for imputation methods and parameter grids.

domain_knowledge: DomainKnowledge | None
grids: List[ParameterGrid]
methods: List[str]
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).

modules: List[str]
class phil.Phil(samples: int = 30, param_grid: str = 'default', magic: str = 'ECT', config=None, random_state=None)

Bases: object

fit(df: DataFrame, max_iter: int = 5) -> DataFrame
generate_descriptors() -> List[ndarray]
impute(df: DataFrame, max_iter: int = 10) -> List[ndarray]
plot_mds(**kwargs)

Visualize the ECT descriptor space via MDS after fit().

transform(df: DataFrame, max_iter: int = 5) -> DataFrame
class phil.PhilTransformer(samples: int = 30, param_grid: str | ImputationConfig = 'default', magic: str = 'ECT', config: dict | None = None, random_state: int | None = None, max_iter: int = 5)

Bases: BaseEstimator, TransformerMixin

fit(X: DataFrame, y: Any = None) -> PhilTransformer
transform(X: DataFrame) -> DataFrame
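
The fit and transform signatures above follow the usual scikit-learn transformer contract: fit learns state from the training data and returns the estimator itself, and transform applies that state to new data. The stand-in below makes the contract concrete with plain lists and column means. The class MeanFillTransformer is hypothetical and says nothing about how PhilTransformer imputes; it only mirrors the method shapes.

```python
import math


class MeanFillTransformer:
    """Hypothetical stand-in (not part of phil) mirroring fit/transform."""

    def fit(self, X, y=None):
        # Learn one mean per column from the observed (non-NaN) entries.
        # Assumes every column has at least one observed value.
        n_cols = len(X[0])
        self.means_ = []
        for c in range(n_cols):
            observed = [row[c] for row in X if not math.isnan(row[c])]
            self.means_.append(sum(observed) / len(observed))
        return self  # returning self allows fit(...).transform(...) chaining

    def transform(self, X):
        # Replace NaN entries with the column means learned in fit.
        return [[self.means_[c] if math.isnan(v) else v
                 for c, v in enumerate(row)]
                for row in X]


nan = float("nan")
train = [[1.0, 4.0], [3.0, nan], [nan, 8.0]]
transformer = MeanFillTransformer().fit(train)
filled = transformer.transform(train)
```

Keeping the learned state on the fitted object is what lets the same transformer be applied to held-out data without refitting.
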
class phil.PreprocessingConfig(*, method: str, module: str = 'sklearn.preprocessing', params: Dict[str, Any] = <factory>)

Bases: BaseModel

Configuration for data preprocessing steps.

method: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).

module: str
params: Dict[str, Any]
phil.plot_mds(descriptors: list[np.ndarray], selected_index: int, ax=None, figsize: tuple[int, int] = (8, 6), random_state: int | None = None) -> tuple['Figure', np.ndarray]

Visualize the ECT descriptor space via Multi-Dimensional Scaling (MDS).

class phil.phil.Phil(samples: int = 30, param_grid: str = 'default', magic: str = 'ECT', config=None, random_state=None)

Bases: object

fit(df: DataFrame, max_iter: int = 5) -> DataFrame
generate_descriptors() -> List[ndarray]
impute(df: DataFrame, max_iter: int = 10) -> List[ndarray]
plot_mds(**kwargs)

Visualize the ECT descriptor space via MDS after fit().

transform(df: DataFrame, max_iter: int = 5) -> DataFrame

Scikit-learn compatible transformers for Phil.

class phil.transformers.PhilTransformer(samples: int = 30, param_grid: str | ImputationConfig = 'default', magic: str = 'ECT', config: dict | None = None, random_state: int | None = None, max_iter: int = 5)

Bases: BaseEstimator, TransformerMixin

fit(X: DataFrame, y: Any = None) -> PhilTransformer
transform(X: DataFrame) -> DataFrame