Web Scraping

Introduction

Web scraping extracts data from websites by downloading pages and parsing their HTML. Always respect a site's robots.txt and terms of service.
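One way to check robots.txt rules before fetching a page is the standard library's `urllib.robotparser`. The sketch below parses an inlined, hypothetical robots.txt so it runs offline; the `ROBOTS_TXT` content, the `build_parser` helper, and the `my-scraper` user agent are assumptions for illustration. Against a real site you would typically call `set_url()` and `read()` instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

USER_AGENT = "my-scraper"  # hypothetical user agent name


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# can_fetch() reports whether the rules allow this user agent to fetch the URL.
allowed_article = parser.can_fetch(USER_AGENT, "https://example.com/articles/1")
allowed_private = parser.can_fetch(USER_AGENT, "https://example.com/private/data")

# crawl_delay() returns the requested pause between requests, or None if unset.
delay = parser.crawl_delay(USER_AGENT)

print(allowed_article, allowed_private, delay)
```

A scraper can skip any URL for which `can_fetch()` returns `False` and sleep for `delay` seconds between requests when a delay is set.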

Basic Scraping

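import requests
from bs4 import BeautifulSoup

# A timeout keeps the request from hanging indefinitely
response = requests.get("https://example.com", timeout=10)
response.raise_for_status()  # raise an exception on 4xx/5xx responses
soup = BeautifulSoup(response.text, "html.parser")

# Find elements
heading = soup.find("h1")  # None if the page has no <h1>
title = heading.get_text(strip=True) if heading is not None else None
links = soup.find_all("a")
paragraphs = soup.find_all("p", class_="content")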
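To see what these calls return without touching the network, the same lookups can be run against an inlined HTML string. This is a minimal sketch: the `HTML` content is hypothetical, and the `href=True` filter and `get_text(strip=True)` calls are one reasonable way to pull out clean values, not the only one.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, inlined so the example runs without a network call.
HTML = """
<html><body>
  <h1> Example Title </h1>
  <p class="content">First paragraph.</p>
  <p>Unrelated paragraph.</p>
  <p class="content">Second paragraph.</p>
  <a href="/about">About</a>
  <a>No href here</a>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# find() returns None when nothing matches, so check before reading the text.
heading = soup.find("h1")
title = heading.get_text(strip=True) if heading is not None else None

# href=True keeps only anchors that actually carry an href attribute.
hrefs = [a["href"] for a in soup.find_all("a", href=True)]

# class_ filters by CSS class; get_text(strip=True) trims surrounding whitespace.
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p", class_="content")]

# There is no <h2> in this page, so the lookup yields None.
missing = soup.find("h2")

print(title, hrefs, paragraphs, missing)
```

Checking for `None` matters in practice: calling `.text` directly on a failed `find()` raises `AttributeError`.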

CSS Selectors

# By CSS selector
elements = soup.select("div.container > p")
first_item = soup.select_one(".item")

# With attributes
images = soup.select('img[alt*="profile"]')
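The selectors above can be tried on a small inlined document to see exactly what each one matches. This is a sketch under assumptions: the `HTML` string is made up for illustration, and only the three selector forms shown above are exercised.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, inlined so the example runs without a network call.
HTML = """
<div class="container">
  <p>Direct child</p>
  <section><p>Nested deeper</p></section>
</div>
<ul>
  <li class="item">One</li>
  <li class="item">Two</li>
</ul>
<img src="/a.png" alt="user profile photo">
<img src="/b.png" alt="site logo">
"""

soup = BeautifulSoup(HTML, "html.parser")

# ">" matches direct children only, so the <p> inside <section> is skipped.
direct = [p.get_text() for p in soup.select("div.container > p")]

# select_one() returns the first match, or None if there is none.
first_item = soup.select_one(".item")
no_match = soup.select_one(".missing")

# [alt*="profile"] matches when the alt attribute contains that substring.
profile_srcs = [img["src"] for img in soup.select('img[alt*="profile"]')]

print(direct, first_item.get_text(), no_match, profile_srcs)
```

`select()` always returns a list (possibly empty), while `select_one()` returns a single element or `None`, so only the latter needs a `None` check.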

Handling Dynamic Content

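# For JavaScript-heavy sites, use Selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
try:
    driver.get("https://example.com")

    # Wait up to 10 seconds for the content to appear
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, ".dynamic-content"))
    )
    content = element.text
finally:
    driver.quit()  # always close the browser, even if the wait times out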

Practice Problems

  1. Extract article titles from a news site
  2. Parse an HTML table into a pandas DataFrame
  3. Follow pagination links to scrape multiple pages
  4. Download the images from a gallery page
  5. Log in through a form, then scrape pages that require the session
