K-center

In the k-center problem we are given a set of items \(I\), metric \(d\) and number \(k\). We want to select \(k\) clusters' centers (items), in such a way that the maximum distance of items to clusters' centers is minimized.

In geometric terms, our goal is to find the centers of \(k\) different balls of same radius that cover all points so that their radius is as small as possible

More formally: Given set of items \(I\) and metric \(d(v_i,v_j)\), find a subset \(S \subseteq I\) with \(|S|=k\) that minimizes \(max_{v\in I}min_{s\in S}d(v,s)\)

Here we implement the following algorithm:

- We select first item arbitrary, and make it a cluster by putting it into \(S\);
- We find next cluster center by finding an item that is as fa as possible from all other cluster centers and add it to \(S\).

We repeat step 2 until \(|S|=k\). When the algorithm finishes we return clusters in \(S\).

Pseudocode:

\begin{eqnarray*} & & \\ & & \textbf{Pick arbitrary: } {\color{red}i\in \color{blue}I}\\ & & \color{blue}S \leftarrow \{\color{red}i\}\\ & & \textbf{while } |\color{blue}S| < \color{red}k \textbf{ do}\\ & & \hspace{1cm}\color{red}j\leftarrow \mbox{argmax}_{\color{red}j \in \color{blue}I } d(\color{red}j,\color{blue}S)\\ & & \hspace{1cm}\color{blue}S\leftarrow \color{blue}S \cup \{\color{red}j\}\\ \end{eqnarray*}

#include "paal/greedy/k_center/k_center.hpp"

#include "paal/utils/irange.hpp"

#include <iostream>

#include <vector>

int main() {

// sample data

const int parts = 2;

m(0, 1) = 3;

m(0, 2) = 4;

m(1, 2) = 5;

m(1, 0) = 3;

m(2, 0) = 4;

m(2, 1) = 5;

auto vertices = paal::irange(3);

std::vector<int> centers;

// solution

std::cout << paal::greedy::kCenter(m, parts, vertices.begin(),

vertices.end(), back_inserter(centers))

<< std::endl;

}

example file is k_center_example.cpp

IN: *const Metric& * metric

IN: *unsigned int * numberOfClusters

OUT: *OutputIterator * result

IN: *const ItemIterator * iBegin

IN: *const ItemIterator * iEnd

The Items that are selected to be centers will be output to the output iterator result.

The iterator type must be a model of *Output Iterator*

Approximation ratio of this algorithm is equal to 2.

The time complexity of the algorithm is \(O(|I|*k)\), where \(I\) is the number of items and \(k\) is the number of clusters' centers.

The memory complexity of the algorithm is \(O(|I|)\), where \(I\) is the number of items

The algorithm analysis is described in [23]

Generated on Tue Jan 31 2017 00:30:51 by 1.8.5