Introduction
The mutate() function from dplyr adds new columns computed from existing ones while keeping every original column. This makes it a core tool for feature engineering.
Basic Mutate
library(dplyr)
df <- tibble(
  x = 1:5,
  y = c(10, 20, 30, 40, 50)
)
# Create new column
mutate(df, sum = x + y)
# Multiple new columns
mutate(df,
  sum = x + y,
  product = x * y,
  ratio = y / x)
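The calls above print their result but do not change df. A minimal sketch of keeping the result, assuming the same x/y data (the name df_sum is introduced here only for illustration):

```r
library(dplyr)

df <- tibble(
  x = 1:5,
  y = c(10, 20, 30, 40, 50)
)

# mutate() returns a modified copy; df itself stays unchanged
# unless the result is assigned to a name.
df_sum <- df %>% mutate(sum = x + y)

names(df)      # original columns only
names(df_sum)  # original columns plus sum
```

Assigning back to df (df <- mutate(df, ...)) overwrites the original instead of creating a second object.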
Using ifelse() in mutate()
df <- tibble(
  name = c("Alice", "Bob", "Charlie"),
  score1 = c(85, 90, 78),
  score2 = c(80, 95, 88)
)
# Columns created earlier in the same call (average) can be used by later expressions
mutate(df,
  average = (score1 + score2) / 2,
  difference = score1 - score2,
  status = ifelse(average >= 85, "Pass", "Fail"))
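ifelse() covers a two-way split. When there are more than two outcomes, dplyr's case_when() is one option. The sketch below reuses the score data. The grade column and its thresholds are illustrative assumptions, not part of the original example.

```r
library(dplyr)

scores <- tibble(
  name = c("Alice", "Bob", "Charlie"),
  score1 = c(85, 90, 78),
  score2 = c(80, 95, 88)
)

graded <- scores %>%
  mutate(
    average = (score1 + score2) / 2,
    # Hypothetical thresholds, chosen only for illustration.
    # Conditions are checked in order; the first match wins.
    grade = case_when(
      average >= 90 ~ "A",
      average >= 80 ~ "B",
      TRUE ~ "C"
    )
  )

graded
```

The final TRUE ~ "C" branch acts as a catch-all for rows that match none of the earlier conditions.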
Mutate with Window Functions
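df <- tibble(
  value = c(10, 20, 30, 40, 50)
)
mutate(df,
  lag_value = lag(value, 1),    # previous row's value; NA in the first row
  lead_value = lead(value, 1),  # next row's value; NA in the last row
  cumulative = cumsum(value),   # running total
  rank = rank(value))           # rank in ascending order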
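Window functions are often applied within groups. A minimal sketch, assuming a hypothetical group column that is not in the original data: after group_by(), mutate() evaluates each expression separately per group.

```r
library(dplyr)

# Hypothetical data: the group column is added for illustration only
sales <- tibble(
  group = c("a", "a", "a", "b", "b"),
  value = c(10, 20, 30, 40, 50)
)

by_group <- sales %>%
  group_by(group) %>%
  mutate(
    prev_value = lag(value),       # NA at the first row of each group
    running_total = cumsum(value)  # restarts at the start of each group
  ) %>%
  ungroup()

by_group
```

Calling ungroup() at the end returns an ordinary tibble, so later steps are not grouped by accident.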
Transmute
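# df was reassigned above, so recreate the x/y data first
df <- tibble(x = 1:5, y = c(10, 20, 30, 40, 50))
# transmute() keeps only the new columns and drops x and y
transmute(df,
  sum = x + y,
  product = x * y)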
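Recent dplyr documentation marks transmute() as superseded in favor of mutate() with the .keep argument. The sketch below shows that form, assuming a dplyr version that supports .keep (1.0.0 or later, as far as I know) and the same x/y data:

```r
library(dplyr)

df <- tibble(x = 1:5, y = c(10, 20, 30, 40, 50))

# .keep = "none" drops the input columns and retains only the new ones
only_new <- df %>%
  mutate(
    sum = x + y,
    product = x * y,
    .keep = "none"
  )

only_new
```

Other values of .keep ("all", "used", "unused") control which of the original columns are kept alongside the new ones.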
Summary
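mutate() creates new variables from existing data, and transmute() does the same while dropping the original columns. Window functions such as lag(), lead(), cumsum(), and rank() handle transformations that depend on other rows.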