Classification with Scikit-learn
Scikit-learn provides a range of classification algorithms behind a unified interface: every classifier is trained with fit(X, y) and queried with predict(X).
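As a minimal sketch of what the shared interface buys you, the loop below trains three different classifiers with exactly the same calls. The iris dataset, the 25% test split, and the `models` and `scores` names are illustrative assumptions, not part of the original text.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Illustrative data: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Three different algorithms, all exposing the same estimator API.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm": SVC(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                   # same training call for every model
    scores[name] = model.score(X_test, y_test)    # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.3f}")
```

Because the calls are identical, swapping one algorithm for another usually means changing only the constructor line.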
Logistic Regression
LogisticRegression implements logistic regression. fit(X, y) trains the model, predict(X) returns the predicted class labels, and predict_proba(X) returns the estimated probability of each class.
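The C parameter is the inverse of regularization strength: smaller values regularize more strongly and give a simpler model. For multi-class problems, the multi_class option selects 'ovr' (one-vs-rest) or 'multinomial'; note that recent scikit-learn releases deprecate this parameter.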
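The following sketch shows predict, predict_proba, and the effect of C on a small multi-class problem. The dataset and the specific C values (0.01 and 10.0) are assumptions chosen only to make the contrast visible; max_iter is raised so the solver has room to converge.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # three classes, four features

# Smaller C = stronger regularization; larger C = weaker regularization.
strong_reg = LogisticRegression(C=0.01, max_iter=5000).fit(X, y)
weak_reg = LogisticRegression(C=10.0, max_iter=5000).fit(X, y)

labels = weak_reg.predict(X[:3])        # one class label per sample
proba = weak_reg.predict_proba(X[:3])   # one row per sample, one column per class

# Stronger regularization shrinks the learned coefficients.
strong_norm = np.abs(strong_reg.coef_).sum()
weak_norm = np.abs(weak_reg.coef_).sum()
print(f"sum |coef| with C=0.01: {strong_norm:.2f}")
print(f"sum |coef| with C=10:   {weak_norm:.2f}")
```

Each row of `proba` sums to 1, and its columns follow the order of `weak_reg.classes_`.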
Decision Trees
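DecisionTreeClassifier builds a tree-based classifier. max_depth caps the depth of the tree, and min_samples_split sets the minimum number of samples a node must contain before it can be split.
Trees are interpretable: after fitting, feature_importances_ reports the normalized, impurity-based importance of each feature.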
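A short sketch of these parameters in use, assuming the iris dataset and arbitrary illustrative values for max_depth and min_samples_split. export_text is included as one way to inspect the learned rules; it is an addition not mentioned in the text above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Illustrative limits: at most 3 levels, and no split on nodes with fewer than 10 samples.
tree = DecisionTreeClassifier(max_depth=3, min_samples_split=10, random_state=0)
tree.fit(iris.data, iris.target)

# Importances are normalized, so they sum to 1 across features.
for name, importance in zip(iris.feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")

depth = tree.get_depth()  # actual depth of the fitted tree
rules = export_text(tree, feature_names=iris.feature_names)
print(rules)              # human-readable if/else view of the splits
```

Lowering max_depth or raising min_samples_split gives a smaller tree that is easier to read and less prone to overfitting.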
Support Vector Machines
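SVC implements a support vector machine classifier. kernel='rbf' (the default) produces non-linear decision boundaries. C is the inverse of regularization strength, so larger values penalize misclassified training points more heavily. gamma sets the width of the RBF kernel: larger values make each training point's influence more local and the boundary more complex.
SVMs are powerful but can be slow on large datasets, because SVC training time grows at least quadratically with the number of samples.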
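The sketch below fits an RBF-kernel SVC for a few gamma values on a synthetic non-linear dataset. The make_moons data, the gamma grid, and the StandardScaler step are assumptions added for illustration; scaling is included because kernel SVMs are sensitive to feature scale.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two interleaving half-circles: not separable by a straight line.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

results = {}
for gamma in (0.1, 1.0, 10.0):
    # Scale features first, then fit the RBF-kernel SVM.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma=gamma))
    model.fit(X_train, y_train)
    results[gamma] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"gamma={gamma}: train={results[gamma][0]:.3f}, test={results[gamma][1]:.3f}")
```

Comparing train and test accuracy across gamma values is a quick way to spot a boundary that is too smooth or too tightly fitted to the training points.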
Key Takeaways
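- Scikit-learn provides many classification algorithms behind the same fit/predict interface
- Logistic regression supports multi-class problems and returns class probabilities
- Decision trees are interpretable; SVMs are powerful but can be slow on large datasets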