Python File I/O

Topic: Input/Output

Introduction

Python's built-in open() function, together with standard-library modules such as csv and json, covers most of the file reading and writing that data science workflows require. File I/O lets you load datasets, save processed results, and work with common file formats. Files can be opened in text mode, where bytes are decoded into strings using a character encoding, or in binary mode, where raw bytes are read and written unchanged.

Key Concepts

  • File modes: Read, write, append (text and binary)
  • Context managers: Automatic resource cleanup with 'with' statement
  • Text vs binary: Text mode decodes bytes into strings using an encoding; binary mode ("rb", "wb") reads and writes raw bytes
  • Encodings: UTF-8, ASCII, and other character encodings
  • Line-by-line processing: Efficient handling of large files
  • CSV and JSON: Common data file formats
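
The difference between text and binary mode is easiest to see by writing a string and reading it back both ways. The following is a minimal sketch, assuming a temporary directory and a hypothetical file name (sample.txt) chosen only for illustration:

```python
import os
import tempfile

text = "café naïve\n"  # contains two non-ASCII characters

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name

    # Text mode: the string is encoded to UTF-8 bytes on write.
    # newline="" disables newline translation so the bytes match exactly.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

    # Binary mode: read the raw bytes, no decoding applied.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode: the bytes are decoded back into a str.
    with open(path, "r", encoding="utf-8", newline="") as f:
        decoded = f.read()

print(type(raw).__name__, len(raw))
print(type(decoded).__name__, len(decoded))
```

The byte count is larger than the character count because each accented character takes more than one byte in UTF-8.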

Python Implementation

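# Basic file reading: load the whole file into one string
with open("data.txt", "r", encoding="utf-8") as file:
    content = file.read()

# Reading all lines into a list (each line keeps its trailing newline)
with open("data.txt", "r", encoding="utf-8") as file:
    lines = file.readlines()

# Iterating line by line without loading the whole file.
# This needs its own open(): after readlines() the file is already
# at its end, so a loop in the same block would print nothing.
with open("data.txt", "r", encoding="utf-8") as file:
    for line in file:
        print(line.strip())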

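# Writing to files ("w" creates the file or overwrites existing content)
with open("output.txt", "w", encoding="utf-8") as file:
    file.write("Hello, World!\n")
    file.writelines(["Line 1\n", "Line 2\n"])  # writelines adds no newlines itself

# Appending to files ("a" keeps existing content and writes at the end)
with open("log.txt", "a", encoding="utf-8") as file:
    file.write("New entry\n")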

# CSV handling
import csv
with open("data.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Name", "Age"])
    writer.writerows([["Alice", 25], ["Bob", 30]])

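# Reading CSV (newline="" lets the csv module handle line endings itself)
with open("data.csv", "r", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # each row is a list of strings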

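# JSON handling
import json
data = {"name": "John", "age": 30}

# Serialize the dictionary to a JSON file
with open("data.json", "w", encoding="utf-8") as file:
    json.dump(data, file)

# Parse the JSON file back into a Python dictionary
with open("data.json", "r", encoding="utf-8") as file:
    loaded = json.load(file)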

When to Use

  • Loading datasets from disk for analysis
  • Saving processed data and results
  • Reading configuration files
  • Processing log files
  • Working with CSV and JSON data exports
  • Handling large files with streaming
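
For the log-processing and streaming cases above, iterating over the file object keeps only one line in memory at a time. Here is a minimal sketch; the helper name count_lines_containing, the file name app.log, and the sample log lines are all hypothetical and exist only to make the example runnable:

```python
import os
import tempfile


def count_lines_containing(path, keyword, encoding="utf-8"):
    """Count lines that contain keyword, reading one line at a time."""
    count = 0
    with open(path, "r", encoding=encoding) as f:
        for line in f:  # the file object yields lines lazily
            if keyword in line:
                count += 1
    return count


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")  # hypothetical log file

    # Create a small sample log so the sketch is self-contained.
    with open(log_path, "w", encoding="utf-8") as f:
        f.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")

    error_count = count_lines_containing(log_path, "ERROR")

print(error_count)
```

The same loop works unchanged on a multi-gigabyte file, because memory use depends on the longest line rather than on the file size.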

Key Takeaways

  1. Always use context managers (with statement) for file operations to ensure proper cleanup
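  2. Specify the encoding explicitly (for example, encoding="utf-8") when working with text files, since the default can depend on the platform and locale
  3. Open CSV files with newline="" so the csv module controls line endings; without it, writing on Windows can produce extra blank rows
  4. For large files, process line by line to avoid loading the entire file into memory
  5. JSON and CSV are among the most common data interchange formats in data science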
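
Several of these takeaways can be combined in one round trip. The sketch below is illustrative rather than definitive: it uses csv.DictWriter and csv.DictReader (standard-library classes not used earlier on this page), and the file names and sample rows are made up for the example:

```python
import csv
import json
import os
import tempfile

rows = [{"Name": "Alice", "Age": 25}, {"Name": "Bob", "Age": 30}]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "people.csv")    # hypothetical file names
    json_path = os.path.join(tmp, "people.json")

    # Context manager + explicit encoding + newline="" for CSV.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["Name", "Age"])
        writer.writeheader()
        writer.writerows(rows)

    with open(csv_path, "r", newline="", encoding="utf-8") as f:
        from_csv = list(csv.DictReader(f))

    # JSON round trip of the same records.
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f)

    with open(json_path, "r", encoding="utf-8") as f:
        from_json = json.load(f)

print(from_csv)
print(from_json)
```

One design point to notice: CSV stores everything as text, so Age comes back as a string and needs converting, while JSON preserves the integer type.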
