Introduction to Machine Learning

What is Machine Learning?

Machine Learning (ML) is a subset of artificial intelligence in which computers learn patterns from data and use them to make predictions or decisions, rather than following rules that were explicitly programmed.

Types of Machine Learning

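  1. Supervised Learning - Learning from labeled data, where each input is paired with a known output (classification, regression)
  2. Unsupervised Learning - Finding patterns in unlabeled data (clustering, dimensionality reduction)
  3. Reinforcement Learning - Learning by trial and error from rewards and penalties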
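
The difference between the first two types can be shown with a minimal sketch. It assumes synthetic data from scikit-learn's make_blobs, and the cluster centers, LogisticRegression, and KMeans are illustrative choices rather than anything prescribed here. The classifier is given the labels; the clustering model sees only the inputs.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption for illustration): 300 points in 3 well-separated groups
X, y = make_blobs(
    n_samples=300,
    centers=[(-5, -5), (0, 5), (5, -5)],
    cluster_std=1.0,
    random_state=0,
)

# Supervised: the labels y are given to the model during training
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)
supervised_accuracy = clf.score(X, y)

# Unsupervised: only X is given; the model groups the points on its own
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)
n_clusters_found = len(set(cluster_ids.tolist()))

print(f"Supervised training accuracy: {supervised_accuracy:.2f}")
print(f"Clusters found without labels: {n_clusters_found}")
```

The cluster IDs from KMeans are arbitrary integers, so they need not match the numbering of the original labels even when the grouping is the same.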

Machine Learning Workflow

# 1. Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

# 2. Load data ('data.csv' is a placeholder for your own file)
df = pd.read_csv('data.csv')

# 3. Separate the features (X) from the target column (y)
X = df.drop('target', axis=1)
y = df['target']

# 4. Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 5. Scale features (fit the scaler on the training data only, then reuse it on the test data)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 6. Train model
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# 7. Evaluate (for LinearRegression, score() returns R^2 on the test set)
score = model.score(X_test_scaled, y_test)
print(f"Test R^2: {score:.3f}")

# 8. Predict
predictions = model.predict(X_test_scaled)
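
The workflow above depends on a data.csv file with a 'target' column. The sketch below is a self-contained variant that runs without that file. Two things in it are assumptions, not part of the original steps: the data comes from make_regression as a stand-in for the CSV, and scaling and training are bundled with make_pipeline, a convenience that keeps the scaler fitted on the training data only.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for data.csv (assumption): 200 rows, 5 features, noisy linear target
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline fits the scaler on the training data, then trains the model on the scaled values
pipeline = make_pipeline(StandardScaler(), LinearRegression())
pipeline.fit(X_train, y_train)

# score() reports R^2 on the held-out test set; predict() applies the same scaling first
r2 = pipeline.score(X_test, y_test)
predictions = pipeline.predict(X_test)

print(f"Test R^2: {r2:.3f}")
print(f"Number of predictions: {len(predictions)}")
```

Because the pipeline owns the scaler, there is no separate X_test_scaled variable to keep in sync with the model.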

Key Concepts

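  • Features: Input variables (X) the model uses to make predictions
  • Target: Output variable (y) the model learns to predict
  • Training: Fitting the model's parameters to the training data
  • Testing: Evaluating the fitted model on data it has not seen
  • Overfitting: Model fits noise in the training data and performs worse on new data
  • Underfitting: Model is too simple to capture the underlying pattern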
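
Overfitting and underfitting are easiest to see by comparing training and test error as model complexity grows. The sketch below is one way to do that. It assumes a noisy sine curve as the data and polynomial degrees 1, 4, and 15 as the "too simple", "moderate", and "too complex" models, all chosen for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumption): a sine curve plus Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Fit polynomials of increasing degree and record (train MSE, test MSE) for each
results = {}
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[degree] = (train_mse, test_mse)
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")
```

Training error shrinks as the degree grows. A large gap between training and test error at the highest degree is the usual sign of overfitting, while high error on both at degree 1 indicates underfitting. The exact numbers depend on the random seed.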

Bias-Variance Tradeoff

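  • High bias (underfitting): the model is too simple, so error is high on both training and test data
  • High variance (overfitting): the model is too complex, so error is low on training data but high on test data
  • Goal: choose a model complexity that balances bias and variance and minimizes error on unseen data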
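
For squared-error loss, the tradeoff is commonly written as a decomposition of the expected prediction error at a point x. The notation is introduced here for illustration and assumes y = f(x) + ε with zero-mean noise of variance σ²: f is the true function, \hat{f} is the fitted model, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Making the model more flexible usually lowers the bias term and raises the variance term. The noise term cannot be reduced by any choice of model.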

Regularization

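from sklearn.linear_model import Ridge, Lasso, ElasticNet

# L2 regularization: penalizes squared coefficients, shrinking them toward zero
# (alpha sets the penalty strength; larger alpha means more shrinkage)
ridge = Ridge(alpha=1.0)

# L1 regularization: penalizes absolute coefficients and can set some exactly to zero
lasso = Lasso(alpha=1.0)

# L1 + L2: l1_ratio sets the mix between the two penalties (1.0 is pure L1, 0.0 is pure L2)
elastic = ElasticNet(alpha=1.0, l1_ratio=0.5)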
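
The models above are only constructed, not fitted. To see what the penalties do, the sketch below fits Ridge and Lasso next to an unregularized LinearRegression. It assumes synthetic data from make_regression in which only 5 of 20 features affect the target, a setup chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data (assumption): 20 features, only 5 of which influence the target
X, y = make_regression(
    n_samples=100, n_features=20, n_informative=5, noise=5.0, random_state=0
)

ols = LinearRegression().fit(X, y)   # no penalty, for comparison
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty
lasso = Lasso(alpha=1.0).fit(X, y)   # L1 penalty

# Overall size of the coefficient vector, with and without the L2 penalty
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)

# Number of coefficients the L1 penalty set exactly to zero
lasso_zeros = int(np.sum(lasso.coef_ == 0))

print(f"Coefficient norm, no penalty: {ols_norm:.2f}")
print(f"Coefficient norm, Ridge:      {ridge_norm:.2f}")
print(f"Lasso coefficients at zero:   {lasso_zeros} of {lasso.coef_.size}")
```

Ridge shrinks every coefficient but keeps all of them, while Lasso tends to drop features that contribute little. How many it drops depends on alpha and on the data.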

Key Takeaways

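  1. ML learns patterns from data instead of relying on hand-written rules
  2. Choose the learning type (supervised, unsupervised, or reinforcement) based on your data and goal
  3. Always hold out a test set and evaluate on data the model has not seen
  4. Watch for overfitting and underfitting, and use regularization to control model complexity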
