Clustering Algorithms

Topic: Unsupervised Learning

Clustering Methods

Clustering finds natural groupings in data without labels.

K-Means Clustering

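KMeans is available in sklearn.cluster. The n_clusters parameter sets the number of clusters, and init='k-means++' (the default) spreads the initial centroids apart, which usually speeds up convergence and helps avoid poor local optima.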

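kmeans.fit_predict(X) fits the model and returns a cluster label for each sample. After fitting, the inertia_ attribute holds the within-cluster sum of squared distances from each sample to its nearest centroid.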
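As a minimal sketch of these calls, the example below fits KMeans on synthetic data. The make_blobs dataset, the choice of 3 clusters, and the random_state values are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data (illustrative assumption): 300 points around 3 centers
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# n_init=10 reruns the algorithm from 10 initializations and keeps the best
kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(X)  # one integer label per sample

print("First ten labels:", labels[:10])
print("Centroids:\n", kmeans.cluster_centers_)
print("Inertia (within-cluster sum of squares):", kmeans.inertia_)
```

Setting random_state only makes the run repeatable; on real data it is usually worth scaling the features first, since K-means relies on Euclidean distance.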

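The elbow method plots inertia against k. Inertia tends to fall as k grows, so look for the "elbow" where adding another cluster stops giving a large reduction, and take that k as a reasonable choice.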
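The inertia values behind an elbow plot could be collected roughly as follows. This is a sketch under assumptions: the synthetic make_blobs data and the range of k from 1 to 8 are illustrative choices, and the values are printed rather than plotted so the example needs no plotting library.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data (illustrative assumption): 4 blobs
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

ks = list(range(1, 9))  # candidate cluster counts (arbitrary range)
inertias = []
for k in ks:
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    model.fit(X)
    inertias.append(model.inertia_)

# Print the curve; the elbow is where the drop between rows flattens out
for k, inertia in zip(ks, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

Passing ks and inertias to a line plot (for example matplotlib's plt.plot(ks, inertias, marker="o")) gives the usual elbow chart. The elbow is a visual heuristic and can be ambiguous on real data.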

Hierarchical Clustering

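AgglomerativeClustering from sklearn.cluster builds clusters bottom-up: each sample starts as its own cluster, and the closest pair of clusters is merged repeatedly. The linkage parameter sets how the distance between clusters is measured; options include 'ward' (the default, which minimizes within-cluster variance), 'complete' (maximum pairwise distance), and 'average' (mean pairwise distance).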

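A dendrogram visualizes the merge hierarchy as a tree. scipy.cluster.hierarchy.dendrogram draws one from a linkage matrix, such as the one returned by scipy.cluster.hierarchy.linkage.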
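The two pieces can be combined as in the sketch below: scikit-learn produces flat cluster labels, while SciPy builds the merge tree for the dendrogram. The small synthetic dataset and the choice of 3 clusters are assumptions for illustration. The example passes no_plot=True so that it runs without a plotting backend.

```python
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Small synthetic dataset (illustrative assumption) so the tree stays readable
X, _ = make_blobs(n_samples=30, centers=3, cluster_std=0.6, random_state=0)

# Flat labels from scikit-learn using Ward linkage
agg = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = agg.fit_predict(X)

# SciPy linkage matrix: each of the n-1 rows records one merge
Z = linkage(X, method="ward")

# no_plot=True returns the tree layout as a dict without drawing it
tree = dendrogram(Z, no_plot=True)

print("Labels:", labels)
print("Leaf order in the dendrogram:", tree["leaves"])
```

With matplotlib installed, calling dendrogram(Z) without no_plot and then plt.show() renders the tree. Cutting it at different heights corresponds to choosing different numbers of clusters.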

DBSCAN

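DBSCAN, also in sklearn.cluster, groups points that lie in dense regions, so it can find clusters of arbitrary shape. eps is the neighborhood radius, and min_samples is the number of points required within that radius for a point to count as a core point; together they define the density threshold.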

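It does not require the number of clusters in advance. Points that do not belong to any dense region are labeled as noise, which scikit-learn marks with the label -1.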
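A minimal sketch of DBSCAN on non-convex data follows. The make_moons dataset and the values eps=0.2 and min_samples=5 are illustrative assumptions; suitable values depend heavily on the scale and density of the data at hand.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaved half-circles (illustrative assumption): a shape K-means handles poorly
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# eps and min_samples are example values, not recommendations
db = DBSCAN(eps=0.2, min_samples=5)
labels = db.fit_predict(X)

# The label -1 marks noise, so exclude it when counting clusters
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))

print("Clusters found:", n_clusters)
print("Noise points:", n_noise)
```

If nearly every point is labeled -1, eps is probably too small for the data's scale. If everything falls into one cluster, it is probably too large.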

Key Takeaways

  1. K-means is simple and widely used
  2. Hierarchical clustering reveals structure at multiple scales
  3. DBSCAN handles arbitrary shapes and detects outliers
