Pipeline and GridSearchCV

Topic: Scikit-Learn

Introduction

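A scikit-learn Pipeline chains preprocessing steps and a final estimator into a single object, so the transformations fitted on the training data are reapplied unchanged at prediction time. Wrapping a pipeline in GridSearchCV tunes every step under cross-validation, with preprocessing refitted inside each fold so that no information leaks from the validation data.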

Pipeline Basics

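from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data (an assumption; substitute your own X and y)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each step is a (name, estimator) pair; every step except the last must be a transformer
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("classifier", LogisticRegression())
])

# fit() fits each transformer on the training data in order, then the classifier
pipeline.fit(X_train, y_train)
# predict() reapplies the fitted transformers before classifying
predictions = pipeline.predict(X_test)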
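
Once a pipeline is fitted, its individual steps remain accessible, which helps when inspecting what each stage learned. The sketch below rebuilds the same three-step pipeline and looks up steps by name, by position, and by slice. The iris dataset and the train/test split are stand-ins chosen for illustration, not part of the lesson.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (an assumption for illustration only).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("classifier", LogisticRegression()),
])
pipeline.fit(X_train, y_train)

# Look up a fitted step by name or by position.
pca = pipeline.named_steps["pca"]
classifier = pipeline[-1]

# Slicing returns a sub-pipeline; here, every step except the classifier.
X_test_reduced = pipeline[:-1].transform(X_test)

print(pca.explained_variance_ratio_)    # share of variance kept by each component
print(X_test_reduced.shape)             # two columns, one per PCA component
print(pipeline.score(X_test, y_test))   # accuracy on the held-out split
```

Slicing with `pipeline[:-1]` is a convenient way to see the exact features the classifier receives without refitting anything.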

Pipeline with GridSearch

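from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data (an assumption; substitute your own X and y)
X, y = load_iris(return_X_y=True)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression(max_iter=1000))
])

# "<step>__<parameter>" tunes a parameter of a step; a bare step name swaps the whole step
param_grid = {
    "classifier__C": [0.1, 1, 10],
    "scaler": [StandardScaler(), "passthrough"]  # "passthrough" skips scaling
}

# 5-fold cross-validation over all 6 combinations; the pipeline is refitted in each fold
grid = GridSearchCV(pipeline, param_grid, cv=5)
grid.fit(X, y)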
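
After fitting, the search object exposes the winning parameter combination and the cross-validated scores. The following is a minimal sketch of reading those results, assuming the iris dataset as stand-in data and a held-out test split that the lesson itself does not use. It writes the "skip scaling" option as `"passthrough"`, the string scikit-learn documents for disabling a pipeline step.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data and split (assumptions for illustration only).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression(max_iter=1000)),
])
param_grid = {
    "classifier__C": [0.1, 1, 10],
    "scaler": [StandardScaler(), "passthrough"],
}

# Search on the training split only, so the test split stays unseen.
grid = GridSearchCV(pipeline, param_grid, cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)                 # best combination found
print(grid.best_score_)                  # its mean cross-validated accuracy
print(len(grid.cv_results_["params"]))   # number of combinations tried
print(grid.score(X_test, y_test))        # accuracy of the refitted best pipeline
```

Because `refit=True` is the default, `grid` behaves like the best pipeline refitted on all of the training data, so `grid.predict` and `grid.score` can be called directly.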

Column Transformer

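from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Hypothetical column names; replace them with the columns of your own DataFrame
numerical_features = ["age", "income"]
categorical_features = ["city"]

# Each entry is (name, transformer, columns); columns not listed are dropped by default
preprocessor = ColumnTransformer([
    ("num", StandardScaler(), numerical_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features)
])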
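
A ColumnTransformer is usually placed as the first step of a pipeline so that mixed numerical and categorical columns are preprocessed and modelled together. The sketch below shows one way to do that. The DataFrame, its column names (`age`, `income`, `city`), and the labels are invented toy data, and pandas is assumed to be installed.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Invented toy data (hypothetical columns and labels, for illustration only).
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [40000, 52000, 88000, 93000, 61000, 45000],
    "city": ["Paris", "Lyon", "Paris", "Nice", "Lyon", "Nice"],
})
y = [0, 0, 1, 1, 1, 0]

numerical_features = ["age", "income"]
categorical_features = ["city"]

preprocessor = ColumnTransformer([
    ("num", StandardScaler(), numerical_features),
    # handle_unknown="ignore" encodes unseen categories as all zeros
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

model = Pipeline([
    ("preprocessor", preprocessor),
    ("classifier", LogisticRegression()),
])
model.fit(df, y)

# 2 scaled numeric columns plus 3 one-hot columns (one per city seen in training).
X_encoded = model.named_steps["preprocessor"].transform(df)
print(X_encoded.shape)

# A city not seen during fit does not raise an error.
new_row = pd.DataFrame({"age": [41], "income": [70000], "city": ["Marseille"]})
print(model.predict(new_row))
```

Parameters of the nested transformers can be tuned with the same double-underscore syntax, for example `preprocessor__num__with_mean` in a GridSearchCV parameter grid.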

Practice Problems

  1. Build a preprocessing pipeline
  2. Combine a pipeline with GridSearchCV
  3. Use ColumnTransformer for mixed numerical and categorical columns
  4. Access individual pipeline steps
  5. Create custom transformers
