Advanced Cross-Validation
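The right validation strategy depends on the structure of the data: class imbalance, grouped samples, and temporal ordering each call for a different splitter.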
Stratified K-Fold
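StratifiedKFold(n_splits=5) keeps the class proportions in every fold close to those of the full dataset. This is essential for imbalanced classification, where a plain K-fold split can leave a fold with few or no minority-class samples.
RepeatedStratifiedKFold repeats the stratified split several times with different randomization, which gives a more stable performance estimate.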
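As a minimal sketch of the stratification described above, the following uses made-up labels (90 negatives, 10 positives) and counts how many positives land in each validation fold. The data, the variable names, and the choice of n_repeats=3 are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, StratifiedKFold

# Toy imbalanced labels (illustrative data): 90 negatives, 10 positives.
y = np.array([0] * 90 + [1] * 10)
X = np.arange(len(y)).reshape(-1, 1)

# Each validation fold should receive the same share of the minority class.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
positives_per_fold = [int(y[test_idx].sum()) for _, test_idx in skf.split(X, y)]
print(positives_per_fold)

# Repeating the procedure yields n_splits * n_repeats train/validation splits.
rskf = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
n_total_splits = rskf.get_n_splits(X, y)
print(n_total_splits)
```

Either splitter can be passed as the cv argument of cross_val_score; averaging over the repeated splits is what smooths out the estimate.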
Group K-Fold
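GroupKFold(n_splits) keeps all samples from the same group in the same fold, so no group appears in both the training and the validation set. Use it when samples are grouped, for example several measurements per patient or per user.
This prevents data leakage across groups: the model is never validated on a group it has already seen during training.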
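The group constraint can be checked directly. The sketch below assumes a hypothetical dataset of 12 samples from 4 patients (the group labels and targets are invented) and counts, for each split, how many groups appear on both sides.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical setup: 12 samples from 4 patients, 3 samples each.
groups = np.repeat(["p1", "p2", "p3", "p4"], 3)
X = np.arange(12).reshape(-1, 1)
y = np.tile([0, 1, 0], 4)

gkf = GroupKFold(n_splits=4)
overlaps = []
for train_idx, test_idx in gkf.split(X, y, groups=groups):
    # Groups present in both the training and the validation indices.
    shared = set(groups[train_idx]) & set(groups[test_idx])
    overlaps.append(len(shared))

print(overlaps)  # every entry should be 0: no group is on both sides
```

Note that n_splits cannot exceed the number of distinct groups, since each fold needs at least one whole group.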
Time Series Split
TimeSeriesSplit is designed for temporal data. In each split the model trains on earlier observations and is validated on later ones, so it never sees the future during training.
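To illustrate the ordering, here is a small sketch on 12 time-ordered samples (the data and the choice of n_splits=3 are assumptions). It prints each split and checks that every training index precedes every validation index.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Twelve samples assumed to be sorted by time, oldest first.
X = np.arange(12).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
splits = [
    (train_idx.tolist(), test_idx.tolist())
    for train_idx, test_idx in tscv.split(X)
]
for train, test in splits:
    print("train:", train, "validate:", test)

# Every training index must come before every validation index.
ordered = all(max(train) < min(test) for train, test in splits)
print(ordered)
```

With the default settings the training window grows from one split to the next while the validation window slides forward, so the rows must already be sorted chronologically before splitting.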
Key Takeaways
- Stratified K-Fold maintains class balance
- Group K-Fold prevents leakage across groups
- Time series split respects temporal order