Regular Expressions

Topic: Text Processing

Introduction

Regular expressions describe patterns in text. Python's built-in re module uses them to search, extract, split, and replace strings.

Basic Patterns

import re

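# Match only at the beginning of the string
re.match(r"hello", "hello world")  # match object for "hello"

# Search anywhere in the string (first occurrence)
re.search(r"world", "hello world")  # match object for "world"

# Find all non-overlapping matches
re.findall(r"\d+", "123 abc 456 def 789")  # ['123', '456', '789']

# Split by pattern
re.split(r"\s+", "hello world    python")  # ['hello', 'world', 'python']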
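The difference between re.match and re.search shows up when the pattern does not sit at the start of the string. The following is a minimal sketch using the same sample string; re.fullmatch does not appear in the examples above but is part of the standard re module, and the variable names are illustrative only.

```python
import re

text = "hello world"

# re.match only tries position 0, so "world" is not found.
at_start = re.match(r"world", text)

# re.search scans the whole string and finds "world" at index 6.
anywhere = re.search(r"world", text)

# re.fullmatch requires the pattern to cover the entire string.
whole = re.fullmatch(r"hello", text)

# A failed match returns None, so check before calling .group().
for name, result in [("match", at_start), ("search", anywhere), ("fullmatch", whole)]:
    if result is None:
        print(f"{name}: no match")
    else:
        print(f"{name}: {result.group()!r} at {result.span()}")
```

Because all three functions return None on failure, calling .group() on the result without a check raises AttributeError.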

Character Classes

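# Digit: [0-9] (str patterns also match other Unicode digits unless re.ASCII is set)
re.findall(r"\d+", "abc123def456")  # ['123', '456']

# Word character: [a-zA-Z0-9_] (plus Unicode letters and digits for str patterns)
re.findall(r"\w+", "hello_world!123")  # ['hello_world', '123']

# Whitespace
re.findall(r"\s+", "hello world")  # [' ']

# Negation: any character that is not a lowercase vowel
re.findall(r"[^aeiou]", "hello")  # ['h', 'l', 'l']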
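In Python 3, \d and \w are Unicode-aware for str patterns, so they match more than the ASCII ranges shown in the comments. The sketch below compares the default behaviour with the re.ASCII flag; the sample string is made up for illustration.

```python
import re

# "café", the Arabic-Indic digits 3 and 4, then ASCII "42"
text = "caf\u00e9 \u0663\u0664 42"

# Default: \w and \d follow Unicode character categories.
unicode_words = re.findall(r"\w+", text)
unicode_digits = re.findall(r"\d+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_] and \d to [0-9].
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)
ascii_digits = re.findall(r"\d+", text, flags=re.ASCII)

print(unicode_words, unicode_digits)
print(ascii_words, ascii_digits)
```

When the input must be plain ASCII digits, for example an ID field, writing [0-9] explicitly or passing re.ASCII avoids surprises.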

Quantifiers

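# * - zero or more
re.findall(r"ab*c", "ac abc abbc")  # ['ac', 'abc', 'abbc']

# + - one or more
re.findall(r"ab+c", "ac abc abbc")  # ['abc', 'abbc']

# ? - zero or one
re.findall(r"colou?r", "color colour")  # ['color', 'colour']

# {n} - exactly n; {n,m} - between n and m
# Both results are '123-4567'; the second is found inside "123-45678"
re.findall(r"\d{3}-\d{4}", "123-4567 123-45678")  # ['123-4567', '123-4567']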
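The \d{3}-\d{4} pattern uses exact counts, and it also matches inside the longer number 123-45678. The sketch below shows one way to avoid that with word boundaries (\b), and adds a real {n,m} range; the second sample string is invented for illustration.

```python
import re

text = "123-4567 123-45678"

# Without anchoring, the pattern also matches the first 8 characters of "123-45678".
loose = re.findall(r"\d{3}-\d{4}", text)

# \b requires a word boundary on each side, so the 5-digit tail is rejected.
strict = re.findall(r"\b\d{3}-\d{4}\b", text)

# {2,4} matches runs of two to four digits; quantifiers are greedy,
# so "55555" yields "5555" and the leftover "5" is too short to match.
ranged = re.findall(r"\d{2,4}", "1 22 333 4444 55555")

print(loose)   # ['123-4567', '123-4567']
print(strict)  # ['123-4567']
print(ranged)  # ['22', '333', '4444', '5555']
```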

Groups and Substitution

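# Capturing groups: each (...) is numbered from 1; group(0) is the whole match
pattern = r"(\w+)@(\w+)\.(\w+)"
match = re.match(pattern, "john@google.com")
print(match.group(1))  # john

# Named groups: (?P<name>...) lets you look up a group by name
# This pattern stops after the domain, so it matches only "john@gmail"
pattern = r"(?P<user>\w+)@(?P<domain>\w+)"
match = re.match(pattern, "john@gmail.com")
print(match.group("user"))  # john

# Substitution: replace every match with the given string
re.sub(r"\d+", "#", "item1 price2 total3")  # 'item# price# total#'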
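Groups can also be read all at once and reused inside a replacement. The following sketch reuses the sample strings from above; groups(), groupdict(), backreferences such as \1, and function replacements are standard re features, while the variable names and the doubling rule are only illustrative.

```python
import re

# groups() returns every captured group as a tuple.
numbered = re.match(r"(\w+)@(\w+)\.(\w+)", "john@google.com")
all_groups = numbered.groups()

# groupdict() returns the named groups as a dictionary.
named = re.match(r"(?P<user>\w+)@(?P<domain>\w+)", "john@gmail.com")
as_dict = named.groupdict()

# \1 and \2 in the replacement refer back to the captured groups.
rewritten = re.sub(r"(\w+)@(\w+)", r"\1 at \2", "john@gmail.com")

# The replacement can be a function that receives each match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "item1 price2 total3")

print(all_groups)  # ('john', 'google', 'com')
print(as_dict)     # {'user': 'john', 'domain': 'gmail'}
print(rewritten)   # john at gmail.com
print(doubled)     # item2 price4 total6
```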

Practice Problems

  1. Validate email addresses
  2. Extract phone numbers from text
  3. Replace all URLs with "[LINK]"
  4. Parse log file entries
  5. Build a simple tokenizer with regex
