Introduction
Decision trees make predictions by recursively splitting the data on feature values until each branch ends in a leaf that holds a prediction. They are easy to interpret and work for both classification and regression.
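To make the idea of a split concrete, here is a minimal sketch that scores one candidate split by Gini impurity, the default splitting criterion rpart uses for classification. The helper names `gini` and `split_impurity`, the built-in `iris` data, and the 2.45 threshold are assumptions chosen for illustration.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2)
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Size-weighted impurity of the two children produced by a logical split
split_impurity <- function(labels, left) {
  n <- length(labels)
  (sum(left) / n) * gini(labels[left]) +
    (sum(!left) / n) * gini(labels[!left])
}

data(iris)
parent <- gini(iris$Species)
child <- split_impurity(iris$Species, iris$Petal.Length < 2.45)

# A good split lowers impurity relative to the parent node
parent - child
```

A tree-growing algorithm evaluates many such candidate splits at each node and keeps the one with the largest impurity reduction.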
Building Trees
library(rpart)
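# Classification tree: `.` uses every other column as a predictor;
# method = "class" makes the classification fit explicit
tree <- rpart(target ~ ., data = train, method = "class")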
# Print tree
print(tree)
# Plot tree
plot(tree)
text(tree)
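The snippet above assumes that `train` and `target` already exist. As a runnable sketch, the following builds them from the built-in `iris` data, with `Species` standing in for `target`. The 70/30 split and the seed are arbitrary choices, not recommendations.

```r
library(rpart)

set.seed(42)  # make the random split reproducible
data(iris)

# Hold out 30% of the rows for testing
idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train <- iris[idx, ]
test <- iris[-idx, ]

# Species plays the role of `target`; `.` means all remaining columns
tree <- rpart(Species ~ ., data = train, method = "class")

# Text summary of the splits, one node per line
print(tree)
```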
Predictions
# Class predictions
predict(tree, test, type = "class")
# Probability predictions
predict(tree, test, type = "prob")
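Predictions are usually compared against the known labels of a held-out set. The sketch below does this with a confusion table and an accuracy figure; it again assumes the `iris` data with `Species` as the target, and the object names (`pred_class`, `conf_mat`, `accuracy`) are illustrative.

```r
library(rpart)

set.seed(42)
idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train <- iris[idx, ]
test <- iris[-idx, ]
tree <- rpart(Species ~ ., data = train, method = "class")

# Predicted class labels (a factor) and class probabilities (a matrix)
pred_class <- predict(tree, newdata = test, type = "class")
pred_prob <- predict(tree, newdata = test, type = "prob")

# Confusion table: rows are predictions, columns are true labels
conf_mat <- table(predicted = pred_class, actual = test$Species)
print(conf_mat)

# Share of test rows classified correctly
accuracy <- mean(pred_class == test$Species)
print(accuracy)
```

Each row of `pred_prob` sums to 1, and the class prediction is the column with the highest probability.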
Pruning
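# Plot cross-validated error against the complexity parameter (cp)
plotcp(tree)
# Prune at the chosen cp (0.05 is a placeholder; read a value off the plot)
pruned_tree <- prune(tree, cp = 0.05)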
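Instead of reading cp off the plot, the value can be taken from the fitted object's `cptable`. This is a sketch under a few assumptions: it uses `iris`, grows a deliberately large tree with a small starting cp, and picks the row with the lowest cross-validated error (`xerror`). Picking the simplest tree within one standard error of that minimum is a common alternative.

```r
library(rpart)

set.seed(42)  # the cross-validation inside rpart is random
tree <- rpart(Species ~ ., data = iris, method = "class",
              control = rpart.control(cp = 0.001))

# One row per candidate subtree: CP, nsplit, rel error, xerror, xstd
cp_table <- tree$cptable
print(cp_table)

# Use the cp value with the lowest cross-validated error
best_row <- which.min(cp_table[, "xerror"])
best_cp <- cp_table[best_row, "CP"]

pruned_tree <- prune(tree, cp = best_cp)
print(pruned_tree)
```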
Using Caret
library(caret)
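# Fit and tune an rpart tree; caret resamples over candidate cp values.
# `train()` is the caret function, while `data = train` is the training data frame.
fit <- train(target ~ ., data = train, method = "rpart")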
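A slightly fuller sketch of the caret workflow, assuming the caret package is installed and using `iris` as stand-in data. The 5-fold cross-validation and `tuneLength = 5` are arbitrary illustrative settings.

```r
library(caret)

set.seed(42)  # resampling folds are random

# 5-fold cross-validation instead of caret's default bootstrap
ctrl <- trainControl(method = "cv", number = 5)

# tuneLength sets how many candidate cp values caret evaluates
fit <- train(Species ~ ., data = iris, method = "rpart",
             trControl = ctrl, tuneLength = 5)

# Resampled accuracy for each cp, and the cp that was selected
print(fit$results)
print(fit$bestTune)

# The underlying rpart object, refit on all rows with the chosen cp
final_tree <- fit$finalModel
```

Because caret selects cp by resampling, this replaces the manual `plotcp()` and `prune()` steps.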
Summary
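Decision trees are easy to interpret, but a fully grown tree tends to overfit the training data. Prune it back, choosing the complexity parameter by cross-validated error.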