ggplot2 for Data Visualization

Introduction

ggplot2 is one of the most widely used data visualization packages for R. It implements the Grammar of Graphics, a layered approach in which a plot is built by combining data, aesthetic mappings, and geometric layers.

Grammar of Graphics

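  • Data: The dataset to visualize, usually a data frame
  • Aesthetics: Mappings from variables to visual properties (x, y, color, size)
  • Geometries: How the data is drawn (points, lines, bars)
  • Statistics: Transformations computed before drawing (counts, smoothing, densities)
  • Scales: How data values translate to visual values (colors, axis ranges)
  • Facets: Splitting the plot into subplots by one or more variables
  • Themes: Non-data styling (fonts, backgrounds, grid lines)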
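
The components above can be seen together in a single plot. The sketch below is only an illustration: it assumes the mpg sample dataset that ships with ggplot2, and the choice of variables (displ, hwy, class, drv) is arbitrary.

```r
library(ggplot2)

# mpg is a sample dataset bundled with ggplot2 (used here for illustration only)
p <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +  # data + aesthetics
  geom_point() +                                            # geometry
  facet_wrap(~drv) +                                        # facets
  theme_minimal()                                           # theme

print(p)
```

Each `+` adds one component, so any line after the first can be removed or swapped without touching the rest.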

Basic ggplot2

library(ggplot2)

# Basic scatter plot
ggplot(data, aes(x = variable1, y = variable2)) +
  geom_point()

# Map color and size to variables as well
ggplot(data, aes(x = variable1, y = variable2,
                 color = category, size = value)) +
  geom_point()

Geometries

# Scatter plot
ggplot(data, aes(x = x, y = y)) + geom_point()

# Line plot
ggplot(data, aes(x = x, y = y)) + geom_line()

# Bar chart of values already in the data
# (geom_col() is shorthand for geom_bar(stat = "identity"))
ggplot(data, aes(x = category, y = value)) +
  geom_col()

# Histogram (set bins explicitly; the default of 30 triggers a message)
ggplot(data, aes(x = numeric_var)) + geom_histogram(bins = 30)

# Box plot
ggplot(data, aes(x = category, y = value)) + 
  geom_boxplot()

# Violin plot
ggplot(data, aes(x = category, y = value)) + 
  geom_violin()
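
The bar chart above plots values that are already present in the data. When the data has one row per observation, geom_bar() with its default statistic counts the rows in each category instead. A minimal sketch of the difference, assuming the mpg sample dataset bundled with ggplot2 (the names class_means, p_counts, and p_means are made up for this example):

```r
library(ggplot2)

# geom_bar() counts rows per category (stat = "count" is its default)
p_counts <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# geom_col() draws values as given, so summarise first
class_means <- aggregate(hwy ~ class, data = mpg, FUN = mean)
p_means <- ggplot(class_means, aes(x = class, y = hwy)) +
  geom_col()

print(p_counts)
print(p_means)
```

As a rule of thumb, use geom_bar() for raw observations and geom_col() for pre-aggregated values.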

Customization

# Labels and titles
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  labs(title = "Title",
       subtitle = "Subtitle",
       x = "X Label",
       y = "Y Label",
       color = "Legend")

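# Themes: apply a complete theme first, then override individual elements
# (a complete theme added after theme() would undo the override)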
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  theme_minimal() +
  theme(axis.text = element_text(size = 12))

Facets

# Split by one variable
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~category)

# Split by two variables
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(row ~ col)

Statistical Transformations

# Add a linear trend line (with a confidence band by default)
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Add density
ggplot(data, aes(x = value, fill = category)) +
  geom_density(alpha = 0.5)
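
Statistical layers compute new values from the data before drawing. One way to inspect what a layer computed is layer_data(), which returns the data frame a layer actually draws. The sketch below assumes the mpg sample dataset bundled with ggplot2 and compares the smoothed line with an ordinary lm() fit; the names smooth_df and fit are illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Layer 1 is the points, layer 2 is the smooth
smooth_df <- layer_data(p, 2)
head(smooth_df[, c("x", "y")])

# The same linear model fitted directly, predicted at the same x positions
fit <- lm(hwy ~ displ, data = mpg)
manual_y <- predict(fit, newdata = data.frame(displ = smooth_df$x))

# Should report TRUE if the layer drew the plain linear fit
all.equal(smooth_df$y, unname(manual_y))
```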

Color Scales

# Manual colors (supply one color per level of category)
ggplot(data, aes(x = x, y = y, color = category)) +
  geom_point() +
  scale_color_manual(values = c("red", "blue", "green"))

# Gradient for continuous
ggplot(data, aes(x = x, y = y, color = value)) +
  geom_point() +
  scale_color_gradient(low = "blue", high = "red")
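
With an unnamed vector, scale_color_manual() assigns colors in the order of the factor levels, which can change if the data changes. A named vector ties each color to a specific level. A small sketch, assuming the mpg sample dataset bundled with ggplot2, whose drv column takes the values "4", "f", and "r" (the name drv_colors is made up for this example):

```r
library(ggplot2)

# Names are the levels of drv; values are the colors to use
drv_colors <- c("4" = "red", "f" = "blue", "r" = "green")

p <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  scale_color_manual(values = drv_colors)

print(p)
```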

Save Plots

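# ggsave() saves the last plot displayed; width and height are in inches by default
ggsave("plot.png", width = 10, height = 8, dpi = 300)
ggsave("plot.pdf")  # format is inferred from the file extension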
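
Because ggsave() falls back to the most recently displayed plot, scripts that build several plots are usually safer passing the plot explicitly. A minimal sketch, assuming the mpg sample dataset bundled with ggplot2 and writing to a temporary directory so no existing file is overwritten (out_file is an illustrative name):

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Pass the plot object explicitly instead of relying on the last plot shown
out_file <- file.path(tempdir(), "plot.png")
ggsave(out_file, plot = p, width = 10, height = 8, units = "in", dpi = 300)

file.exists(out_file)
```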

Key Takeaways

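  1. ggplot2 follows the Grammar of Graphics: data, aesthetics, and geometries combine into a plot
  2. Build plots layer by layer with +
  3. Use facet_wrap() and facet_grid() for multi-panel visualizations
  4. Customize labels, themes, and scales with labs(), theme(), and the scale_*() functions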
