Pandas Data Manipulation

Topic: Pandas

Introduction

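Pandas provides tools for data cleaning, transformation, and analysis. This topic covers four everyday DataFrame operations: adding and modifying columns, filtering rows, sorting, and handling duplicates.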

Adding and Modifying Columns

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Add new column
df["c"] = df["a"] + df["b"]

# Conditional column
df["large"] = df["a"] > 1

# Apply a function to each value (for simple arithmetic, df["a"] * 2 is the faster vectorized form)
df["double"] = df["a"].apply(lambda x: x * 2)
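The column assignments above modify df in place. As a minimal sketch of an alternative, assign builds the same columns on a copy, and apply with axis=1 runs a custom function on each row. The helper label_row and its threshold of 6 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# assign returns a new DataFrame and leaves df unchanged
out = df.assign(
    c=df["a"] + df["b"],
    large=df["a"] > 1,
    double=df["a"] * 2,  # vectorized equivalent of apply(lambda x: x * 2)
)

# Hypothetical helper: label each row by the sum of a and b
def label_row(row):
    return "big" if row["a"] + row["b"] > 6 else "small"

# axis=1 passes one row at a time to the function
out["size"] = out.apply(label_row, axis=1)
print(out)
```

Row-wise apply is convenient for logic that is awkward to vectorize, but it is usually slower than column arithmetic on large frames.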

Filtering

# Boolean mask
df[df["a"] > 1]

# Multiple conditions (wrap each in parentheses; combine with & and |, not and/or)
df[(df["a"] > 1) & (df["b"] < 6)]

# Query method
df.query("a > 1 and b < 6")
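Each filtering expression above returns a new DataFrame and leaves df untouched. The sketch below, using the same small frame, checks that the boolean mask and query forms select the same rows, and also shows "or" (|), negation (~), and loc for picking rows and columns together. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Build the mask once so it can be reused
mask = (df["a"] > 1) & (df["b"] < 6)

by_mask = df[mask]                       # boolean indexing
by_query = df.query("a > 1 and b < 6")   # same filter as a string expression
print(by_mask.equals(by_query))

either = df[(df["a"] == 1) | (df["b"] == 6)]  # "or" uses |
negated = df[~mask]                           # ~ inverts the mask
only_a = df.loc[mask, ["a"]]                  # filter rows and select columns together
print(either, negated, only_a, sep="\n\n")
```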

Sorting

df.sort_values("a")                                    # Ascending by a
df.sort_values("a", ascending=False)                   # Descending by a
df.sort_values(["a", "b"], ascending=[True, False])    # a ascending; ties broken by b descending
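With the three-row frame used so far, column a has no ties, so the second sort key never comes into play. Here is a small sketch with made-up data that does contain ties. It also shows that sort_values returns a new DataFrame that keeps the original index labels unless you reset them.

```python
import pandas as pd

# Hypothetical data with repeated values in "a" so the second key matters
df = pd.DataFrame({"a": [2, 1, 2, 1], "b": [10, 20, 30, 40]})

# a ascending; within equal a, b descending
ordered = df.sort_values(["a", "b"], ascending=[True, False])
print(ordered)

# The original index labels travel with the rows; reset them for a clean 0..n-1 index
renumbered = ordered.reset_index(drop=True)
print(renumbered)
```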

Handling Duplicates

df.drop_duplicates()                    # Remove rows that repeat an earlier row in full
df.drop_duplicates(subset=["a"])        # Compare only column a; keep the first row per value
df.duplicated().sum()                   # Count rows that repeat an earlier row
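The frame used earlier has no repeated rows, so here is a sketch with made-up data that does. It illustrates the keep parameter of drop_duplicates, plus one common pattern for choosing which duplicate survives: sort first, then drop. Treat it as one possible approach, not the only one.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are identical; rows 2 and 3 share "a" only
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [4, 4, 5, 6]})

n_dupes = df.duplicated().sum()                          # full-row repeats
full = df.drop_duplicates()                              # keeps the first of each repeated row
first = df.drop_duplicates(subset=["a"])                 # first row per value of a
last = df.drop_duplicates(subset=["a"], keep="last")     # last row per value of a
unique_only = df.drop_duplicates(keep=False)             # drop every row that has a full-row twin

# Keep the row with the largest b for each a: sort by b descending, then drop
best = (
    df.sort_values("b", ascending=False)
      .drop_duplicates(subset=["a"])
      .sort_values("a")
)
print(n_dupes, full, first, last, unique_only, best, sep="\n\n")
```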

Practice Problems

  1. Filter a DataFrame by multiple criteria
  2. Add computed columns
  3. Sort by multiple columns
  4. Remove duplicates while controlling which row is kept
  5. Use apply with custom functions
