Scikit-Learn Preprocessing

Topic: Scikit-Learn

Introduction

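Scikit-Learn's sklearn.preprocessing module provides transformers for scaling numeric features, encoding categorical values, and deriving new features. Most of them share one interface: fit learns parameters from the data, transform applies them, and fit_transform does both in a single call.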

Standard Scaling

from sklearn.preprocessing import StandardScaler, MinMaxScaler
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])

# Z-score standardization
scaler = StandardScaler()
scaled = scaler.fit_transform(data)
print(scaled.mean(axis=0))  # approximately [0. 0.] (up to floating-point error)
print(scaled.std(axis=0))   # approximately [1. 1.]

# Min-Max scaling
minmax = MinMaxScaler()
normalized = minmax.fit_transform(data)
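
The snippet above fits and transforms the same array. In practice a scaler is usually fitted on training data only and then reused on new data. The sketch below illustrates that pattern; X_train and X_test are hypothetical arrays made up for this example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical train/test data, invented for illustration
X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
X_test = np.array([[2.0, 8.0]])

# Learn the per-column mean and standard deviation from the training data only
scaler = StandardScaler().fit(X_train)
X_test_std = scaler.transform(X_test)
print(scaler.mean_)  # per-column training means: 3 and 4

# Learn the per-column min and max from the training data only
minmax = MinMaxScaler().fit(X_train)
X_test_mm = minmax.transform(X_test)
print(X_test_mm)  # 0.25 and 1.5: test values can fall outside [0, 1]
```

Because the second test value (8) is larger than the training maximum (6), its min-max scaled value exceeds 1. This is expected behavior, not an error.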

Label Encoding

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
import numpy as np

# Simple label encoding
labels = ['cat', 'dog', 'bird', 'cat', 'dog']
encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)
print(encoded)  # [1 2 0 1 2] (classes are sorted: bird=0, cat=1, dog=2)

# Inverse transform
decoded = encoder.inverse_transform([0, 1, 2])  # ['bird' 'cat' 'dog']

# One-hot encoding
onehot = OneHotEncoder(sparse_output=False)
reshaped = np.array(labels).reshape(-1, 1)  # OneHotEncoder expects a 2D array
onehot_encoded = onehot.fit_transform(reshaped)

Polynomial Features

from sklearn.preprocessing import PolynomialFeatures
import numpy as np

X = np.array([[1, 2], [3, 4]])

# Create degree 2 polynomial features
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)
print(X_poly.shape)  # (2, 5): x0, x1, x0^2, x0*x1, x1^2
print(poly.get_feature_names_out())

Custom Transformers

from sklearn.preprocessing import FunctionTransformer
import numpy as np

# Log transformation
log_transformer = FunctionTransformer(np.log1p, inverse_func=np.expm1)
data = np.array([[0, 1], [2, 3]])
transformed = log_transformer.fit_transform(data)

# Square transformation (np.sqrt only inverts it for non-negative inputs)
square_transformer = FunctionTransformer(lambda x: x**2, inverse_func=np.sqrt)
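
A FunctionTransformer behaves like any other transformer, so it can be inverted and chained with other steps. The following is a minimal sketch, assuming make_pipeline from sklearn.pipeline (not used elsewhere on this page) as the way to chain steps.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

data = np.array([[0.0, 1.0], [2.0, 3.0]])

# log1p and expm1 are exact inverses of each other
log_transformer = FunctionTransformer(np.log1p, inverse_func=np.expm1)
transformed = log_transformer.fit_transform(data)
restored = log_transformer.inverse_transform(transformed)
print(np.allclose(restored, data))  # True

# Chain the log transform with standardization
pipeline = make_pipeline(log_transformer, StandardScaler())
scaled = pipeline.fit_transform(data)
print(scaled.mean(axis=0))  # approximately zero for each column
```

Passing named functions such as np.log1p rather than a lambda also keeps the transformer picklable, which matters when a fitted pipeline is saved to disk.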

Practice Problems

  1. Scale features using StandardScaler
  2. Encode categorical labels
  3. Create polynomial features
  4. Build custom transformer pipeline
  5. Handle missing values with imputation
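
As a possible starting point for problems 4 and 5, the sketch below fills missing values and then scales the result in one pipeline. It assumes SimpleImputer from sklearn.impute with the mean strategy, which this page does not cover; the array X is invented for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data with missing entries marked as NaN
X = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, np.nan]])

# Replace each NaN with the mean of the observed values in its column
imputer = SimpleImputer(strategy='mean')
X_filled = imputer.fit_transform(X)
print(X_filled)  # both NaNs become 3: mean of (1, 5) and mean of (2, 4)

# Impute first, then standardize, as one reusable pipeline
pipeline = make_pipeline(SimpleImputer(strategy='mean'), StandardScaler())
X_ready = pipeline.fit_transform(X)
print(np.isnan(X_ready).any())  # False
```

Other strategies such as 'median' or 'most_frequent' may suit skewed or categorical columns better; the mean is used here only to keep the arithmetic easy to verify.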
