Scikit-Learn Preprocessing

Introduction

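Scikit-Learn's sklearn.preprocessing module provides transformers for feature scaling, categorical encoding, and feature transformation. Most of them share one interface: fit learns parameters from the data, transform applies them, and fit_transform does both in a single call.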

Standard Scaling

from sklearn.preprocessing import StandardScaler, MinMaxScaler
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])

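# Z-score standardization: each column gets zero mean and unit variance
scaler = StandardScaler()
scaled = scaler.fit_transform(data)
print(scaled.mean(axis=0))  # approximately [0. 0.]
print(scaled.std(axis=0))   # approximately [1. 1.]

# Min-Max scaling: each column is rescaled to the range [0, 1]
minmax = MinMaxScaler()
normalized = minmax.fit_transform(data)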
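
To make the two formulas concrete, the sketch below recomputes both scalings by hand with NumPy and checks them against the scikit-learn results. It reuses the same small array as above; the names manual_z and manual_minmax are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

data = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)

# Z-score: subtract the column mean, then divide by the population
# standard deviation (ddof=0), which is what StandardScaler uses.
manual_z = (data - data.mean(axis=0)) / data.std(axis=0)

# Min-max: map each column's minimum to 0 and its maximum to 1.
col_min, col_max = data.min(axis=0), data.max(axis=0)
manual_minmax = (data - col_min) / (col_max - col_min)

sk_z = StandardScaler().fit_transform(data)
sk_minmax = MinMaxScaler().fit_transform(data)

print(np.allclose(manual_z, sk_z))            # True
print(np.allclose(manual_minmax, sk_minmax))  # True
print(manual_minmax[:, 0])                    # [0.  0.5 1. ]
```

In practice a scaler is usually fitted on the training data only and then applied to new data with transform, so that test samples are scaled with the training statistics.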

Label Encoding

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
import numpy as np

# Simple label encoding
labels = ['cat', 'dog', 'bird', 'cat', 'dog']
encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)
print(encoded)  # [1 2 0 1 2] (classes are sorted: bird=0, cat=1, dog=2)

# Inverse transform: map integer codes back to the original labels
decoded = encoder.inverse_transform([0, 1, 2])
print(decoded)  # ['bird' 'cat' 'dog']

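# One-hot encoding (the sparse_output parameter requires scikit-learn 1.2+)
onehot = OneHotEncoder(sparse_output=False)
reshaped = np.array(labels).reshape(-1, 1)  # the encoder expects a 2D array
onehot_encoded = onehot.fit_transform(reshaped)  # shape (5, 3)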
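
One-hot encoders often meet categories at prediction time that were absent during fitting. The sketch below shows one way to handle that with handle_unknown='ignore', and how to read the column order from categories_. The 'fish' label is a made-up example, and .toarray() is used instead of sparse_output so the snippet does not depend on the scikit-learn version.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array(['cat', 'dog', 'bird', 'cat', 'dog']).reshape(-1, 1)

# handle_unknown='ignore' encodes unseen categories as an all-zero row
# instead of raising an error at transform time.
encoder = OneHotEncoder(handle_unknown='ignore')
encoded = encoder.fit_transform(labels).toarray()  # default output is sparse

print(encoder.categories_[0])  # ['bird' 'cat' 'dog']
print(encoded[0])              # [0. 1. 0.]  -> 'cat'

# 'fish' was not seen during fit, so its row is all zeros.
unseen = encoder.transform(np.array([['fish']])).toarray()
print(unseen)                  # [[0. 0. 0.]]
```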

Polynomial Features

from sklearn.preprocessing import PolynomialFeatures
import numpy as np

X = np.array([[1, 2], [3, 4]])

# Create degree 2 polynomial features
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)
print(X_poly.shape)  # (2, 5): x0, x1, x0^2, x0*x1, x1^2
print(poly.get_feature_names_out())  # ['x0' 'x1' 'x0^2' 'x0 x1' 'x1^2']

Custom Transformers

from sklearn.preprocessing import FunctionTransformer
import numpy as np

# Log transformation: log1p computes log(1 + x), so zeros stay finite
log_transformer = FunctionTransformer(np.log1p, inverse_func=np.expm1)
data = np.array([[0, 1], [2, 3]])
transformed = log_transformer.fit_transform(data)

# Square transformation (np.sqrt inverts it only for non-negative inputs)
square_transformer = FunctionTransformer(lambda x: x**2, inverse_func=np.sqrt)
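
The role of inverse_func is easiest to see in a round trip. The sketch below, using illustrative variable names, restores the original data from the log transform and then shows that the square/sqrt pair is not a true inverse for negative values. It uses np.square in place of the lambda and passes check_inverse=False to skip the round-trip check performed during fit.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

data = np.array([[0.0, 1.0], [2.0, 3.0]])

log_transformer = FunctionTransformer(np.log1p, inverse_func=np.expm1)
transformed = log_transformer.fit_transform(data)

# inverse_transform applies inverse_func, undoing the log1p step.
restored = log_transformer.inverse_transform(transformed)
print(np.allclose(restored, data))  # True

# Squaring discards the sign, so sqrt only recovers the magnitude.
# check_inverse=False skips the round-trip check (and its warning) during fit.
square_transformer = FunctionTransformer(
    np.square, inverse_func=np.sqrt, check_inverse=False
)
signed = np.array([[-2.0, 3.0]])
squared = square_transformer.fit_transform(signed)
recovered = square_transformer.inverse_transform(squared)
print(recovered)  # [[2. 3.]]
```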

Practice Problems

Scale features using StandardScaler
Encode categorical labels
Create polynomial features
Build a pipeline that includes a custom transformer
Handle missing values with imputation
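
As a starting point for the last two problems, here is a minimal sketch of a pipeline that chains imputation, a custom log step, and scaling. The data, the step names, and the choice of mean imputation are assumptions made for illustration, not a prescribed solution.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical data with one missing value (np.nan) in the first column.
X = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, 30.0]])

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),  # fill NaN with the column mean
    ("log", FunctionTransformer(np.log1p)),      # custom transformation step
    ("scale", StandardScaler()),                 # zero mean, unit variance
])

X_ready = pipeline.fit_transform(X)
print(X_ready.shape)                         # (3, 2)
print(np.isnan(X_ready).any())               # False
print(np.allclose(X_ready.mean(axis=0), 0))  # True
```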
