# K-means (Clustering)

ML CH2

**Definition:** K-means is a clustering algorithm that partitions data into k clusters by assigning each element to the cluster whose centroid (the mean of that cluster's members) is nearest.
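
Basic idea:

1. Select k starting cluster centroids
2. Go through the elements, finding the nearest centroid for each
3. Add each element to its nearest centroid's cluster and update that centroid to the mean position of its members
4. Repeat from step 2 until the assignments stop changing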
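
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the names `kmeans`, `nearest`, `mean`, `init`, `max_iter`, and `seed` are made up for this note, and it uses the batch variant (centroids are recomputed once per pass over the data) rather than updating the mean as each item is added.

```python
import math
import random


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Batch k-means sketch. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: select starting centroids (caller-supplied, else k random data points).
    centroids = [tuple(c) for c in init] if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: find the nearest centroid for every element.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # Step 4: stop once assignments are stable.
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its assigned elements.
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = mean(members)
    return centroids, labels


# Usage: two obvious groups, with hand-picked starting centroids.
pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, labels = kmeans(pts, k=2, init=[(0, 0), (10, 10)])
# labels    -> [0, 0, 1, 1]
# centroids -> [(0.0, 0.5), (10.0, 10.5)]
```

Keeping an empty cluster's old centroid is just one simple choice; re-seeding it from a random data point is another common option.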

K-means can converge to a local optimum instead of the global optimum, depending on where the centroids start. One way to help with this is to pass in a list of starting positions for the centroids.

Another solution is to run the algorithm multiple times with different random starting positions and keep the run with the lowest [Inertia](Inertia.md).
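
A small sketch of both ideas, under some assumptions: inertia is taken to be the sum of squared distances from each point to its assigned centroid, and `kmeans`, `inertia`, `best_of_restarts`, `n_restarts`, and `seed` are hypothetical names for this note. The k-means routine is a compact batch version included only so the example runs on its own.

```python
import math
import random


def kmeans(points, init, max_iter=100):
    """Compact batch k-means from explicit starting centroids."""
    centroids = [tuple(c) for c in init]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments did not change
            break
        labels = new_labels
        # Move each non-empty cluster's centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(a) / len(members) for a in zip(*members))
    return centroids, labels


def inertia(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def best_of_restarts(points, k, n_restarts=10, seed=0):
    """Run k-means from several random starts; keep the lowest-inertia result."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        centroids, labels = kmeans(points, init=rng.sample(points, k))
        score = inertia(points, centroids, labels)
        if best is None or score < best[0]:
            best = (score, centroids, labels)
    return best


# Four corners of a wide rectangle: the left/right split is the global optimum.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# A poor hand-picked start gets stuck in the top/bottom split (a local optimum).
bad_centroids, bad_labels = kmeans(points, init=[(2, 0), (2, 1)])
bad_score = inertia(points, bad_centroids, bad_labels)

# A better hand-picked start finds the left/right split.
good_centroids, good_labels = kmeans(points, init=[(0, 0), (4, 0)])
good_score = inertia(points, good_centroids, good_labels)

# Random restarts keep whichever run scored lowest.
best_score, best_centroids, best_labels = best_of_restarts(points, k=2)
```

Here the poor start converges with an inertia of 16, while the left/right split has an inertia of 1. Random restarts cannot guarantee finding the global optimum, but each extra run makes it less likely that every start lands in the worse split.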