Introduction
Audio processing analyzes and transforms audio signals. The examples below use librosa to load audio, extract features, and apply effects, and NumPy with soundfile to synthesize a tone and write it to disk.
Loading Audio
import librosa
import numpy as np
# Load audio file (by default librosa resamples to 22050 Hz mono; pass sr=None to keep the native rate)
y, sr = librosa.load('audio.wav')
print(f"Duration: {len(y)/sr:.2f}s, Sample rate: {sr}")
# Load only the first 10 seconds, resampled to 22050 Hz
y, sr = librosa.load('audio.wav', sr=22050, offset=0.0, duration=10)
# Generate a 1-second 440 Hz tone
sr = 22050
tone = librosa.tone(440, sr=sr, duration=1)
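The duration printed above is the sample count divided by the sample rate, and the same arithmetic predicts roughly how many samples a `duration=` request returns. A minimal sketch in plain Python; the helper names `samples_to_seconds` and `seconds_to_samples` are hypothetical and not part of librosa, and the count librosa actually returns may differ slightly after resampling or if the file is shorter than requested:

```python
def samples_to_seconds(n_samples: int, sr: int) -> float:
    """Length in seconds of a signal with n_samples samples at rate sr."""
    return n_samples / sr


def seconds_to_samples(seconds: float, sr: int) -> int:
    """Nearest whole number of samples covering the given time span."""
    return int(round(seconds * sr))


n = seconds_to_samples(10, 22050)
print(n)                             # 220500
print(samples_to_seconds(n, 22050))  # 10.0
```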
Audio Features
import librosa
import numpy as np
# Mel spectrogram (power), converted to decibels relative to its peak
S = librosa.feature.melspectrogram(y=y, sr=sr)
S_dB = librosa.power_to_db(S, ref=np.max)
# MFCC
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
# Chroma
chroma = librosa.feature.chroma_stft(y=y, sr=sr)
# Spectral features
spectral_centroid = librosa.feature.spectral_centroid(y=y, sr=sr)
spectral_rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)
# Zero crossing rate
zcr = librosa.feature.zero_crossing_rate(y)
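To make the zero crossing rate concrete, it can be computed by hand for a whole signal: count the places where consecutive samples change sign and divide by the number of sample pairs. The sketch below uses only NumPy and a synthetic 440 Hz sine (an assumption for illustration, not the `y` loaded earlier). It is a simplified whole-signal version; librosa computes the rate per frame, so its output is an array rather than a single number.

```python
import numpy as np

sr = 22050
t = np.arange(sr) / sr
# Small phase offset so that no sample lands exactly on zero
x = np.sin(2 * np.pi * 440 * t + 0.1)

# A zero crossing occurs wherever consecutive samples differ in sign
signs = np.signbit(x)
crossings = np.count_nonzero(signs[1:] != signs[:-1])

# Fraction of adjacent sample pairs that cross zero
zcr = crossings / (len(x) - 1)
print(f"{zcr:.3f}")  # 0.040
```

A sine crosses zero twice per cycle, so the result sits close to 2 * 440 / 22050.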
Audio Effects
import librosa.effects
# Trim silence
y_trimmed, idx = librosa.effects.trim(y, top_db=20)
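# Time stretch (rate > 1 speeds playback up, so the output is shorter)
y_stretched = librosa.effects.time_stretch(y, rate=1.5)
# Pitch shift up by 2 semitones (sr must be passed by keyword in recent librosa versions)
y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)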
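The two parameters above translate into simple numbers. Assuming the usual 12 semitones per octave, `n_steps=2` multiplies every frequency by 2^(2/12), and `rate=1.5` divides the signal length by 1.5. A small arithmetic sketch in plain Python; the 10-second input length is an assumed example value:

```python
n_steps = 2          # semitones, as in the pitch shift above
rate = 1.5           # speed factor, as in the time stretch above
sr = 22050
n_samples = 10 * sr  # assumed 10-second input

# Frequency ratio for a shift of n_steps semitones (12 per octave)
pitch_ratio = 2 ** (n_steps / 12)
# Where a 440 Hz component would end up after the shift
shifted_hz = 440 * pitch_ratio
# Approximate output length after stretching
stretched_len = int(round(n_samples / rate))

print(f"{pitch_ratio:.4f}")  # 1.1225
print(f"{shifted_hz:.2f}")   # 493.88
print(stretched_len)         # 147000
```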
Audio Synthesis
import numpy as np
# Generate a 1-second 440 Hz sine tone
sr = 22050
t = np.arange(sr) / sr  # sample times; t = 1 s is excluded so the spacing is exactly 1/sr
tone = np.sin(2 * np.pi * 440 * t)
# Save audio
import soundfile as sf
sf.write('tone.wav', tone, sr)
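One way to sanity-check a synthesized tone is to look for the peak of its magnitude spectrum, which should sit at the intended frequency. A minimal NumPy-only sketch; it regenerates the tone itself so it runs on its own, and `peak_hz` is a name introduced here for illustration:

```python
import numpy as np

sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

# Magnitude spectrum of the real-valued signal and the frequency of each bin
spectrum = np.abs(np.fft.rfft(tone))
freqs = np.fft.rfftfreq(len(tone), d=1 / sr)

# The strongest bin should correspond to the synthesized frequency
peak_hz = freqs[np.argmax(spectrum)]
print(f"{peak_hz:.1f}")  # 440.0
```

With exactly one second of audio the bins are 1 Hz apart, so 440 Hz falls on a bin center and the peak is unambiguous.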
Practice Problems
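- Load an audio file and print its duration and sample rate
- Extract 13 MFCC features from the loaded signal
- Compute a mel spectrogram and convert it to decibels
- Apply time stretching at several rates and compare the output lengths
- Synthesize a tone and save it to a WAV file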