What is Machine Learning?
Machine Learning (ML) is a subset of artificial intelligence in which computers learn patterns from data and use them to make predictions or decisions, rather than following rules that were explicitly programmed.
Types of Machine Learning
- Supervised Learning - Learning from labeled data
- Unsupervised Learning - Finding patterns in unlabeled data
- Reinforcement Learning - Learning through rewards/penalties
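To make the first two types concrete, here is a minimal from-scratch sketch in plain Python. The data, the function names (`predict_nearest`, `two_means`) and the choice of algorithms (1-nearest neighbor and a two-cluster k-means) are illustrative assumptions, not part of the workflow below. Reinforcement learning is left out to keep the sketch short.

```python
# Supervised: each training example pairs a feature value with a known label.
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]

def predict_nearest(x, examples):
    """Return the label of the labeled example closest to x (1-nearest neighbor)."""
    _, label = min(examples, key=lambda pair: abs(pair[0] - x))
    return label

# Unsupervised: no labels, so the algorithm has to discover the groups itself.
unlabeled = [1.0, 1.5, 8.0, 9.0, 2.0]

def two_means(values, iterations=10):
    """Split 1-D values into two clusters (k-means with k=2).

    Assumes at least two distinct values, so neither cluster ends up empty.
    """
    low, high = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high

print(predict_nearest(2.0, labeled))  # small
print(two_means(unlabeled))           # ([1.0, 1.5, 2.0], [8.0, 9.0])
```

The supervised function needs the labels to answer at all, while the clustering function only ever sees raw values.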
Machine Learning Workflow
# 1. Import libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
# 2. Load data
df = pd.read_csv('data.csv')
# 3. Separate the features (X) from the target column (y)
X = df.drop('target', axis=1)
y = df['target']
# 4. Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# 5. Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# 6. Train model
model = LinearRegression()
model.fit(X_train_scaled, y_train)
# 7. Evaluate (for regressors, score() returns R^2, the coefficient of determination)
score = model.score(X_test_scaled, y_test)
print(f"Test R^2: {score:.3f}")
# 8. Predict
predictions = model.predict(X_test_scaled)
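Two steps in the workflow are easy to misread: the scaler is fitted on the training data only, and the score reported for a regressor is R^2. The sketch below spells both out in plain Python for a single feature. The helper names (`fit_scaler`, `apply_scaler`, `r2_score`) and the numbers are assumptions made for illustration; it mirrors the idea behind StandardScaler (which divides by the population standard deviation), not its implementation.

```python
from statistics import fmean, pstdev

def fit_scaler(column):
    """Learn the mean and population standard deviation from training values only."""
    mean = fmean(column)
    std = pstdev(column) or 1.0  # fall back to 1.0 for a constant column
    return mean, std

def apply_scaler(column, mean, std):
    """Standardize values with statistics that were learned elsewhere."""
    return [(v - mean) / std for v in column]

def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (assumes y_true is not constant)."""
    mean_y = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical feature values
test = [5.0, 9.0]

mean, std = fit_scaler(train)                   # statistics come from train only
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)     # reuse them; never refit on test

print(mean, std, test_scaled)
print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect predictions
print(r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # always predicting the mean
```

Refitting the scaler on the test set would leak information about unseen data into preprocessing. An R^2 of 1 means perfect predictions, and 0 means the model does no better than always predicting the mean of the target.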
Key Concepts
- Features: Input variables (X)
- Target: Output variable (y)
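- Training: Fitting the model's parameters to the training data
- Testing: Evaluating the fitted model on data it has not seen
- Overfitting: Model fits noise in the training data, so it scores well on training data but poorly on unseen data
- Underfitting: Model is too simple to capture the underlying pattern, so it scores poorly on both training and unseen data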
Bias-Variance Tradeoff
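# High bias (underfitting): model is too simple and misses the pattern in the same way on any sample
# High variance (overfitting): model is too complex and changes a lot with the training sample
# Goal: choose the complexity that minimizes error on unseen data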
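As a rough illustration of the tradeoff, the sketch below uses k-nearest-neighbor regression, where `k` acts as the complexity knob: a small `k` gives a flexible, high-variance model and a large `k` gives a rigid, high-bias one. k-NN is swapped in here only because it fits in a few lines of plain Python; the toy data and the helper names (`knn_predict`, `mse`) are assumptions, not part of the workflow above.

```python
def knn_predict(x, train, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(data, train, k):
    """Mean squared error of k-NN predictions on data."""
    return sum((y - knn_predict(x, train, k)) ** 2 for x, y in data) / len(data)

# Toy data (hypothetical): y = x with alternating +1 / -1 "noise" on the training set.
train = [(float(x), x + (1.0 if x % 2 == 0 else -1.0)) for x in range(10)]
# Test points sit between the training points and carry no noise.
test = [(x + 0.5, x + 0.5) for x in range(9)]

for k in (1, 2, len(train)):
    train_err = mse(train, train, k)
    test_err = mse(test, train, k)
    print(f"k={k:2d}  train MSE={train_err:.2f}  test MSE={test_err:.2f}")
```

On this data, k=1 memorizes the training set (zero training error) yet does worse on the test points than k=2, which is the overfitting signature. Using k=len(train) predicts the overall mean everywhere and does poorly on both sets, which is underfitting.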
Regularization
from sklearn.linear_model import Ridge, Lasso, ElasticNet
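# L2 regularization: shrinks all coefficients toward zero
ridge = Ridge(alpha=1.0)
# L1 regularization: can set some coefficients exactly to zero
lasso = Lasso(alpha=1.0)
# L1 + L2: l1_ratio sets the mix (1.0 is pure L1, 0.0 is pure L2)
elastic = ElasticNet(alpha=1.0, l1_ratio=0.5)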
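To see what the penalties do to a coefficient, here is a minimal sketch for the simplest possible case: one feature, no intercept, solved in closed form. The objectives follow the form scikit-learn documents for Ridge and Lasso, but the function names (`ridge_1d`, `lasso_1d`, `soft_threshold`) and the data are assumptions for illustration, not the library's solver.

```python
def ridge_1d(xs, ys, alpha):
    """Minimize sum((y - w*x)^2) + alpha * w^2 over a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

def soft_threshold(value, threshold):
    """Move value toward zero by threshold, stopping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_1d(xs, ys, alpha):
    """Minimize (1/(2n)) * sum((y - w*x)^2) + alpha * |w| over a single weight w."""
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n
    return soft_threshold(rho, alpha) / (sum(x * x for x in xs) / n)

# Hypothetical data generated by y = 2 * x, so the unpenalized weight is 2.0.
xs = [-1.0, -1.0, 1.0, 1.0]
ys = [-2.0, -2.0, 2.0, 2.0]

for alpha in (0.0, 1.0, 2.0):
    print(alpha, ridge_1d(xs, ys, alpha), lasso_1d(xs, ys, alpha))
```

Ridge shrinks the weight smoothly as alpha grows but never reaches zero, while Lasso subtracts a fixed amount and returns exactly 0.0 once alpha is large enough. That difference is why L1 regularization is often used for feature selection.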
Key Takeaways
- ML learns patterns from data automatically
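- Choose the learning type based on your data and goal (labeled data suits supervised learning, unlabeled data suits unsupervised learning)
- Always hold out a test set, and fit preprocessing such as scaling on the training data only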
- Watch for overfitting and underfitting