Jump to content Jump to search

FLS++

An implementation of the FLS++ algorithm for k-means Clustering, as presented in [1]. This is an improvement of the LS++ algorithm presented by Lattanzi and Sohler in [2].

One can think of the algorithm as working in 3 phases:

  1. Initializing centers with k-means++ (as presented in [3]).
  2. Improving the center set by performing local search swaps (for details, see [1]).
  3. Converging the solution with LLoyd's algorithm (also known as "the" k-means algorithm).

In [2] it is shown that O(k log log k) local search iterations yield a constant factor approximation. However, in both [1] and [2] it is shown that, in practice, a very small number of iterations (e.g. 20) already yields very good results, at very reasonable runtime.

The interface is built in the same way as scikit-learn's KMeans for better compatibility.

Getting FLS++

Please get the newest version from our GitHub repository.

References

[1] Theo Conrads, Lukas Drexler, Joshua Könen, Daniel R. Schmidt and Melanie Schmidt. "Local Search k-means++ with Foresight" (2024)

[2] Silvio Lattanzi and Christian Sohler. A better k-means++ algorithm via local search. In Proc.444 of the 36th ICML, volume 97 of Proceedings of Machine Learning Research, pages 3662–3671.445 PMLR, 09–15 Jun 2019

[3] David Arthur and Sergei Vassilvitskii. K-means++: The advantages of careful seeding. In409 Proceedings of the 18th SODA, page 1027–1035, USA, 2007