Optimizers

Topic: Keras

Introduction

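Optimizers adjust a model's trainable parameters to minimize the loss function. After each batch, the optimizer uses the gradient of the loss with respect to each parameter to decide how far, and in which direction, to move it.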
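To make the idea of a gradient-based update concrete, here is a minimal sketch of plain gradient descent on a one-parameter loss, written without Keras. The loss function, the starting weight, and the helper name `gradient_descent` are assumptions made for illustration; real optimizers apply the same idea to every trainable weight in the model.

```python
# Illustrative sketch only: plain gradient descent on loss(w) = (w - 3)**2.
# The loss, starting point, and function name are assumptions for this example.

def gradient_descent(w, learning_rate=0.1, steps=50):
    """Repeatedly step against the gradient of loss(w) = (w - 3)**2."""
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)        # d(loss)/dw at the current weight
        w = w - learning_rate * grad  # move opposite to the gradient
    return w

w_final = gradient_descent(w=0.0)
print(w_final)  # approaches the minimum at w = 3
```

A larger `learning_rate` takes bigger steps and converges faster on this loss, but past a certain size the updates overshoot the minimum and diverge.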

Built-in Optimizers

from tensorflow.keras import optimizers

# `model` below is assumed to be a Keras model you have already built

# Adam (most common)
model.compile(optimizer='adam', loss='mse')

# SGD (pass an optimizer instance; the string 'SGD()' is not a valid optimizer name)
model.compile(optimizer=optimizers.SGD(), loss='mse')

# AdamW (decoupled weight decay)
model.compile(optimizer=optimizers.AdamW(weight_decay=0.01), loss='mse')
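The term "decoupled weight decay" can be illustrated with a small sketch that does not use Keras. It contrasts folding an L2 penalty into the gradient with shrinking the weight in a separate step. The function names and the `scale` argument (standing in for the per-weight rescaling an adaptive optimizer such as Adam applies) are assumptions for illustration, not Keras internals.

```python
# Illustrative sketch only: L2 regularization vs. decoupled weight decay.
# `scale` stands in for the adaptive rescaling that Adam-style optimizers
# apply to the gradient; all names and values here are assumptions.

def step_with_l2(w, grad, lr, decay, scale):
    """L2 penalty: the decay term is added to the gradient, so it is rescaled too."""
    return w - lr * scale * (grad + decay * w)

def step_decoupled(w, grad, lr, decay, scale):
    """Decoupled decay: the weight shrinks independently of the rescaled gradient."""
    return w - lr * scale * grad - lr * decay * w

# With a zero gradient, only the decay term moves the weight.
w_l2 = step_with_l2(w=1.0, grad=0.0, lr=0.1, decay=0.01, scale=10.0)
w_decoupled = step_decoupled(w=1.0, grad=0.0, lr=0.1, decay=0.01, scale=10.0)
print(w_l2, w_decoupled)
```

In the L2 version the amount of shrinkage depends on `scale`; in the decoupled version it depends only on `lr` and `decay`, which is the behaviour `weight_decay` in AdamW is meant to provide.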

Optimizer Parameters

# Adam with custom settings
adam = optimizers.Adam(
    learning_rate=0.001,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07
)

# SGD with momentum
sgd = optimizers.SGD(
    learning_rate=0.01,
    momentum=0.9,
    nesterov=True
)

# RMSprop
rmsprop = optimizers.RMSprop(
    learning_rate=0.001,
    rho=0.9,
    momentum=0.0
)
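To show where `learning_rate`, `beta_1`, `beta_2`, and `epsilon` enter the computation, here is a sketch of one textbook Adam update for a single scalar weight. It is written in plain Python for readability; the function name `adam_step` is an assumption, and the Keras implementation may arrange the bias correction and `epsilon` slightly differently.

```python
import math

# Illustrative sketch only: one textbook Adam update for a scalar weight.
# `adam_step` is a hypothetical helper, not part of Keras.

def adam_step(w, grad, m, v, t, learning_rate=0.001,
              beta_1=0.9, beta_2=0.999, epsilon=1e-07):
    """Return the updated weight and moment estimates (t starts at 1)."""
    m = beta_1 * m + (1.0 - beta_1) * grad         # running mean of gradients
    v = beta_2 * v + (1.0 - beta_2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1.0 - beta_1 ** t)                # bias correction for early steps
    v_hat = v / (1.0 - beta_2 ** t)
    w = w - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return w, m, v

w, m, v = adam_step(w=1.0, grad=4.0, m=0.0, v=0.0, t=1)
print(w)  # the first step moves w by roughly learning_rate, whatever the gradient's size
```

Because the update divides by the square root of the squared-gradient average, the step size is set mostly by `learning_rate` rather than by the raw gradient magnitude; `epsilon` only guards against division by zero.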

Learning Rate Scheduling

# Step decay: ExponentialDecay with staircase=True halves the rate every 10,000 steps
lr_schedule = optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=10000,
    decay_rate=0.5,
    staircase=True
)

# Exponential decay
lr_schedule = optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=10000,
    decay_rate=0.95
)

model.compile(optimizer=optimizers.Adam(lr_schedule), loss='mse')
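The formula Keras documents for `ExponentialDecay` is `initial_learning_rate * decay_rate ** (step / decay_steps)`, and its `staircase` flag floors the exponent so the rate drops in discrete steps. The plain-Python sketch below reproduces that formula so you can see what learning rate a given training step would receive; the helper name `exponential_decay` is an assumption for illustration.

```python
import math

# Illustrative sketch only: the exponential decay formula in plain Python.
# `exponential_decay` is a hypothetical helper, not a Keras function.

def exponential_decay(step, initial_learning_rate=0.1, decay_steps=10000,
                      decay_rate=0.95, staircase=False):
    """Learning rate at a given training step."""
    exponent = step / decay_steps
    if staircase:
        exponent = math.floor(exponent)  # hold the rate constant within each interval
    return initial_learning_rate * decay_rate ** exponent

for step in (0, 5000, 10000, 20000):
    print(step, exponential_decay(step))
```

Note that `step` counts optimizer updates (batches), not epochs, so `decay_steps` should be chosen relative to the number of batches per epoch.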

Gradient Clipping

# Clip by norm
optimizer = optimizers.Adam(clipnorm=1.0)

# Clip by value
optimizer = optimizers.Adam(clipvalue=0.5)
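The two clipping modes differ in what they preserve. The sketch below shows both on a gradient stored as a plain Python list: clipping by norm rescales the whole vector and keeps its direction, while clipping by value clamps each component independently. The helper names are assumptions for illustration; in Keras, `clipnorm` is applied to each weight's gradient separately.

```python
import math

# Illustrative sketch only: the two clipping modes on a plain list of floats.
# `clip_by_norm` and `clip_by_value` here are hypothetical helpers.

def clip_by_norm(grad, clipnorm):
    """Rescale grad so its L2 norm is at most clipnorm (direction unchanged)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clipnorm:
        return list(grad)
    scale = clipnorm / norm
    return [g * scale for g in grad]

def clip_by_value(grad, clipvalue):
    """Clamp each component to the range [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in grad]

print(clip_by_norm([3.0, 4.0], clipnorm=1.0))           # norm 5 is scaled down to norm 1
print(clip_by_value([3.0, -0.2, -4.0], clipvalue=0.5))  # large components are clamped
```

Clipping by value can change the direction of the gradient, because large components are cut more than small ones; clipping by norm does not.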

Practice Problems

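  1. Train the same model with Adam and with SGD, then compare the loss curves
  2. Tune the learning rate and note how it affects convergence
  3. Implement a learning rate schedule
  4. Apply gradient clipping with clipnorm or clipvalue
  5. Add weight decay using AdamW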
