dplyr Joins

Topic: dplyr

Introduction

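dplyr provides join functions that combine two data frames by matching rows on one or more key columns. They correspond to SQL joins: inner_join(), left_join(), right_join(), and full_join() behave like INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.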

Mutating Joins

library(dplyr)

df1 <- tibble(id = 1:3, name = c("Alice", "Bob", "Charlie"))
df2 <- tibble(id = c(1, 2, 4), score = c(85, 90, 95))

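# Inner join: keep only rows whose id appears in both df1 and df2
inner_join(df1, df2, by = "id")

# Left join: keep every row of df1; score is NA where df2 has no match
left_join(df1, df2, by = "id")

# Right join: keep every row of df2; name is NA where df1 has no match
right_join(df1, df2, by = "id")

# Full join: keep every row of both tables, filling gaps with NA
full_join(df1, df2, by = "id")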
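As a rough check on the four joins above, the following sketch reuses the same df1 and df2 and inspects how many rows each join returns and where NA appears. The variable names inner, left, right, and full are illustrative only, not part of dplyr.

```r
library(dplyr)

df1 <- tibble(id = 1:3, name = c("Alice", "Bob", "Charlie"))
df2 <- tibble(id = c(1, 2, 4), score = c(85, 90, 95))

# Store each result so it can be inspected (names are illustrative)
inner <- inner_join(df1, df2, by = "id")
left  <- left_join(df1, df2, by = "id")
right <- right_join(df1, df2, by = "id")
full  <- full_join(df1, df2, by = "id")

# Row counts: ids 1 and 2 match; id 3 is only in df1, id 4 only in df2
nrow(inner)  # 2
nrow(left)   # 3
nrow(right)  # 3
nrow(full)   # 4

# Unmatched rows get NA in the columns that come from the other table
left$score   # 85 90 NA
right$name   # "Alice" "Bob" NA
```

Comparing nrow() before and after a join is a cheap way to notice rows that were dropped or left unmatched.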

Filtering Joins

# Semi join: keep rows of df1 that have a match in df2 (no df2 columns are added)
semi_join(df1, df2, by = "id")

# Anti join: keep rows of df1 that have no match in df2
anti_join(df1, df2, by = "id")
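The sketch below illustrates how filtering joins differ from inner_join(): they only select rows of df1 and never bring in columns from df2. The variable names semi, anti, and inner are illustrative.

```r
library(dplyr)

df1 <- tibble(id = 1:3, name = c("Alice", "Bob", "Charlie"))
df2 <- tibble(id = c(1, 2, 4), score = c(85, 90, 95))

semi  <- semi_join(df1, df2, by = "id")
anti  <- anti_join(df1, df2, by = "id")
inner <- inner_join(df1, df2, by = "id")

# Filtering joins return only the columns of df1
names(semi)   # "id" "name"
names(anti)   # "id" "name"

# inner_join() keeps the same rows as semi_join() here, but adds score
names(inner)  # "id" "name" "score"

# Together, semi and anti split df1 into matched and unmatched rows
semi$name     # "Alice" "Bob"
anti$name     # "Charlie"
```

anti_join() is often handy before a mutating join, to list the keys that would otherwise end up as NA.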

Multiple Keys

df1 <- tibble(id1 = 1:3, id2 = c("a", "b", "c"), value = 1:3)
df2 <- tibble(id1 = c(1, 2), id2 = c("a", "b"), score = c(85, 90))

left_join(df1, df2, by = c("id1", "id2"))
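With several keys, a row matches only when every key column agrees. The sketch below uses hypothetical tables x and y (not from the example above) in which the second row agrees on id1 but not on id2, and compares joining on both keys with joining on id1 alone.

```r
library(dplyr)

# Hypothetical data: row 2 agrees on id1 (2) but not on id2 ("b" vs "x")
x <- tibble(id1 = 1:3, id2 = c("a", "b", "c"), value = 1:3)
y <- tibble(id1 = c(1, 2), id2 = c("a", "x"), score = c(85, 90))

# Both keys must match, so only the first row finds a score
both_keys <- left_join(x, y, by = c("id1", "id2"))
both_keys$score  # 85 NA NA

# Joining on id1 alone matches rows 1 and 2; the non-key id2 columns
# from each table are kept and disambiguated with .x / .y suffixes
one_key <- left_join(x, y, by = "id1")
one_key$score    # 85 90 NA
names(one_key)   # "id1" "id2.x" "value" "id2.y" "score"
```

Unexpected .x and .y columns in a result are usually a sign that a key column was left out of by.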

Summary

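Use inner_join(), left_join(), right_join(), or full_join() when you need columns from both tables, and semi_join() or anti_join() when you only need to filter one table by whether a match exists in the other.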
