All Classes Namespaces Files Functions Variables Typedefs Enumerations Friends Macros Pages
K-center

Problem definition.

In the k-center problem we are given a set of items \(I\), metric \(d\) and number \(k\). We want to select \(k\) clusters' centers (items), in such a way that the maximum distance of items to clusters' centers is minimized.

In geometric terms, our goal is to find the centers of \(k\) different balls of same radius that cover all points so that their radius is as small as possible

More formally: Given set of items \(I\) and metric \(d(v_i,v_j)\), find a subset \(S \subseteq I\) with \(|S|=k\) that minimizes \(max_{v\in I}min_{s\in S}d(v,s)\)

Solution

Here we implement the following algorithm:

  1. We select first item arbitrary, and make it a cluster by putting it into \(S\);
  2. We find next cluster center by finding an item that is as fa as possible from all other cluster centers and add it to \(S\).

We repeat step 2 until \(|S|=k\). When the algorithm finishes we return clusters in \(S\).

Pseudocode:

\begin{eqnarray*} & & \\ & & \textbf{Pick arbitrary: } {\color{red}i\in \color{blue}I}\\ & & \color{blue}S \leftarrow \{\color{red}i\}\\ & & \textbf{while } |\color{blue}S| < \color{red}k \textbf{ do}\\ & & \hspace{1cm}\color{red}j\leftarrow \mbox{argmax}_{\color{red}j \in \color{blue}I } d(\color{red}j,\color{blue}S)\\ & & \hspace{1cm}\color{blue}S\leftarrow \color{blue}S \cup \{\color{red}j\}\\ \end{eqnarray*}

Example

#include <iostream>
#include <vector>
int main() {
// sample data
const int parts = 2;
m(0, 1) = 3;
m(0, 2) = 4;
m(1, 2) = 5;
m(1, 0) = 3;
m(2, 0) = 4;
m(2, 1) = 5;
auto vertices = paal::irange(3);
std::vector<int> centers;
// solution
std::cout << paal::greedy::kCenter(m, parts, vertices.begin(),
vertices.end(), back_inserter(centers))
<< std::endl;
}

example file is k_center_example.cpp

Parameters

IN: const Metric& metric

IN: unsigned int numberOfClusters

OUT: OutputIterator result

IN: const ItemIterator iBegin

IN: const ItemIterator iEnd

The Items that are selected to be centers will be output to the output iterator result.

The iterator type must be a model of Output Iterator

Approximation Ratio

Approximation ratio of this algorithm is equal to 2.

Complexity

The time complexity of the algorithm is \(O(|I|*k)\), where \(I\) is the number of items and \(k\) is the number of clusters' centers.

The memory complexity of the algorithm is \(O(|I|)\), where \(I\) is the number of items

References

The algorithm analysis is described in [23]