Vertex k-center problem

From HandWiki


The vertex k-center problem is a classical NP-hard problem in computer science. It has application in facility location and clustering.[1][2] Basically, the vertex k-center problem models the following real problem: given a city with [math]\displaystyle{ n }[/math] facilities, find the best [math]\displaystyle{ k }[/math] facilities where to build fire stations. Since firemen must attend any emergency as quickly as possible, the distance from the farthest facility to its nearest fire station has to be as small as possible. In other words, the position of the fire stations must be such that every possible fire is attended as quickly as possible.

Formal definition

The vertex k-center problem is a classical NP-Hard problem in computer science. It was first proposed by Hakimi in 1964.[3] Formally, the vertex k-center problem consists in: given a complete undirected graph [math]\displaystyle{ G=(V,E) }[/math] in a metric space, and a positive integer [math]\displaystyle{ k }[/math], find a subset [math]\displaystyle{ C \subseteq V }[/math] such that [math]\displaystyle{ |C|\le k }[/math] and the objective function [math]\displaystyle{ r(C)=\max_{v \in V}\{d(v,C)\} }[/math] is minimized. The distance [math]\displaystyle{ d(v,C) }[/math] is defined as the distance from the vertex [math]\displaystyle{ v }[/math] to its nearest center in [math]\displaystyle{ C }[/math].

Approximation algorithms

If [math]\displaystyle{ P \neq NP }[/math], the vertex k-center problem can not be (optimally) solved in polynomial time. However, there are some polynomial time approximation algorithms that get near-optimal solutions. Specifically, 2-approximated solutions. Actually, if [math]\displaystyle{ P \neq NP }[/math] the best possible solution that can be achieved by a polynomial time algorithm is a 2-approximated one.[4][5][6][7] In the context of a minimization problem, such as the vertex k-center problem, a 2-approximated solution is any solution [math]\displaystyle{ C' }[/math] such that [math]\displaystyle{ r(C') \le 2 \times r(\text{OPT}) }[/math], where [math]\displaystyle{ r(\text{OPT}) }[/math] is the size of an optimal solution. An algorithm that guarantees to generate 2-approximated solutions is known as a 2-approximation algorithm. The main 2-approximated algorithms for the vertex k-center problem reported in the literature are the Sh algorithm,[8] the HS algorithm,[7] and the Gon algorithm.[5][6] Even though these algorithms are the (polynomial) best possible ones, their performance on most benchmark datasets is very deficient. Because of this, many heuristics and metaheuristics have been developed through the time. Contrary to common sense, one of the most practical (polynomial) heuristics for the vertex k-center problem is based on the CDS algorithm, which is a 3-approximation algorithm[9]

The Sh algorithm

Formally characterized by David Shmoys in 1995,[8] the Sh algorithm takes as input a complete undirected graph [math]\displaystyle{ G=(V,E) }[/math], a positive integer [math]\displaystyle{ k }[/math], and an assumption [math]\displaystyle{ r }[/math] on what the optimal solution size is. The Sh algorithm works as follows: selects the first center [math]\displaystyle{ c_1 }[/math] at random. So far, the solution consists of only one vertex, [math]\displaystyle{ C=\{c_1\} }[/math]. Next, selects center [math]\displaystyle{ c_2 }[/math] at random from the set containing all the vertices whose distance from [math]\displaystyle{ C }[/math] is greater than [math]\displaystyle{ 2 \times r }[/math]. At this point, [math]\displaystyle{ C=\{c_1,c_2\} }[/math]. Finally, selects the remaining [math]\displaystyle{ k-2 }[/math] centers the same way [math]\displaystyle{ c_2 }[/math] was selected. The complexity of the Sh algorithm is [math]\displaystyle{ O(kn) }[/math], where [math]\displaystyle{ n }[/math] is the number of vertices.

The HS algorithm

Proposed by Dorit Hochbaum and David Shmoys in 1985, the HS algorithm takes the Sh algorithm as basis.[7] By noticing that the value of [math]\displaystyle{ r(\text{OPT}) }[/math] must equals the cost of some edge in [math]\displaystyle{ E }[/math], and since there are [math]\displaystyle{ O(n^2) }[/math] edges in [math]\displaystyle{ E }[/math], the HS algorithm basically repeats the Sh algorithm with every edge cost. The complexity of the HS algorithm is [math]\displaystyle{ O(n^4) }[/math]. However, by running a binary search over the ordered set of edge costs, its complexity is reduced to [math]\displaystyle{ O(n^2 \log n) }[/math].

The Gon algorithm

Proposed independently by Teofilo Gonzalez,[5] and by Martin Dyer and Alan Frieze[6] in 1985, the Gon algorithm is basically a more powerful version of the Sh algorithm. While the Sh algorithm requires a guess [math]\displaystyle{ r }[/math] on [math]\displaystyle{ r(\text{OPT}) }[/math], the Gon algorithm prescinds from such guess by noticing that if any set of vertices at distance greater than [math]\displaystyle{ 2 \times r(\text{OPT}) }[/math] exists, then the farthest vertex must be inside such set. Therefore, instead of computing at each iteration the set of vertices at distance greater than [math]\displaystyle{ 2 \times r }[/math] and then selecting a random vertex, the Gon algorithm simply selects the farthest vertex from every partial solution [math]\displaystyle{ C' }[/math]. The complexity of the Gon algorithm is [math]\displaystyle{ O(kn) }[/math], where [math]\displaystyle{ n }[/math] is the number of vertices.

The CDS algorithm

Proposed by García Díaz et al. in 2017,[9] the CDS algorithm is a 3-approximation algorithm that takes ideas from the Gon algorithm (farthest point heuristic), the HS algorithm (parametric pruning), and the relationship between the vertex k-center problem and the Dominating Set problem. The CDS algorithm has a complexity of [math]\displaystyle{ O(n^4) }[/math]. However, by performing a binary search over the ordered set of edge costs, a more efficiente heuristic named CDSh is proposed. The CDSh algorithm complexity is [math]\displaystyle{ O(n^2 \log n) }[/math]. Despite the suboptimal performance of the CDS algorithm, and the heuristic performance of CDSh, both present a much better performance than the Sh, HS, and Gon algorithms.

Parameterized approximations

It can be shown that the k-Center problem is W[2]-hard to approximate within a factor of 2 − ε for any ε > 0, when using k as the parameter.[10] This is also true when parameterizing by the doubling dimension (in fact the dimension of a Manhattan metric), unless P=NP.[11] When considering the combined parameter given by k and the doubling dimension, k-Center is still W[1]-hard but it is possible to obtain a parameterized approximation scheme.[12] This is even possible for the variant with vertex capacities, which bound how many vertices can be assigned to an opened center of the solution.[13]

Experimental comparison

Some of the most widely used benchmark datasets for the vertex k-center problem are the pmed instances from OR-Lib.,[14] and some instances from TSP-Lib.[15] Table 1 shows the mean and standard deviation of the experimental approximation factors of the solutions generated by each algorithm over the 40 pmed instances from OR-Lib[9]

Table 1. Experimental approximation factor over pmed instances from OR-Lib
Algorithm [math]\displaystyle{ \mu }[/math] [math]\displaystyle{ \sigma }[/math] Complexity
HS 1.532 0.175 [math]\displaystyle{ O(n^2 \log n) }[/math]
Gon 1.503 0.122 [math]\displaystyle{ O(kn) }[/math]
CDSh 1.035 0.031 [math]\displaystyle{ O(n^2 \log n) }[/math]
CDS 1.020 0.027 [math]\displaystyle{ O(n^4) }[/math]
Table 2. Experimental approximation factor over instances from TSP-Lib
Algorithm [math]\displaystyle{ \mu }[/math] [math]\displaystyle{ \sigma }[/math] Algorithm
Gon 1.396 0.091 [math]\displaystyle{ O(kn) }[/math]
HS 1.318 0.108 [math]\displaystyle{ O(n^2 \log n) }[/math]
CDSh 1.124 0.065 [math]\displaystyle{ O(n^2 \log n) }[/math]
CDS 1.042 0.038 [math]\displaystyle{ O(n^4) }[/math]

Polynomial heuristics

Greedy pure algorithm

The greedy pure algorithm (or Gr) follows the core idea of greedy algorithms: to take optimal local decisions. In the case of the vertex k-center problem, the optimal local decision consists in selecting each center in such a way that the size of the solution (covering radius) is minimum at each iteration. In other words, the first center selected is the one that solves the 1-center problem. The second center selected is the one that, along with the previous center, generates a solution with minimum covering radius. The remaining centers are selected the same way. The complexity of the Gr algorithm is [math]\displaystyle{ O(kn^2) }[/math].[16] The empirical performance of the Gr algorithm is poor on most benchmark instances.

Scoring algorithm

The Scoring algorithm (or Scr) was introduced by Jurij Mihelič and Borut Robič in 2005.[17] This algorithm takes advantage of the reduction from the vertex k-center problem to the minimum dominating set problem. The problem is solved by pruning the input graph with every possible value of the optimal solution size and then solving the minimum dominating set problem heuristically. This heuristic follows the lazy principle, which takes every decision as slow as possible (opossed to the greedy strategy). The complexity of the Scr algorithm is [math]\displaystyle{ O(n^4) }[/math]. The empirical performance of the Scr algorithm is very good on most benchmark instances. However, its running time rapidly becomes impractical as the input grows. So, it seems to be a good algorithm only for small instances.

References

  1. Pacheco, Joaquín A.; Casado, Silvia (December 2005). "Solving two location models with few facilities by using a hybrid heuristic: a real health resources case". Computers & Operations Research 32 (12): 3075–3091. doi:10.1016/j.cor.2004.04.009. ISSN 0305-0548. 
  2. Kaveh, A.; Nasr, H. (August 2011). "Solving the conditional and unconditional -center problem with modified harmony search: A real case study". Scientia Iranica 18 (4): 867–877. doi:10.1016/j.scient.2011.07.010. ISSN 1026-3098. 
  3. Hakimi, S. L. (1964). "Optimum Locations of Switching Centers and the Absolute Centers and Medians of a Graph". Operations Research 12 (3): 450–459. doi:10.1287/opre.12.3.450. 
  4. Kariv, O.; Hakimi, S. L. (December 1979). "An Algorithmic Approach to Network Location Problems. I: The p-Centers". SIAM Journal on Applied Mathematics 37 (3): 513–538. doi:10.1137/0137040. ISSN 0036-1399. 
  5. 5.0 5.1 5.2 Gonzalez, Teofilo F. (1985). "Clustering to minimize the maximum intercluster distance". Theoretical Computer Science 38: 293–306. doi:10.1016/0304-3975(85)90224-5. ISSN 0304-3975. 
  6. 6.0 6.1 6.2 Dyer, M.E; Frieze, A.M (February 1985). "A simple heuristic for the p-centre problem". Operations Research Letters 3 (6): 285–288. doi:10.1016/0167-6377(85)90002-1. ISSN 0167-6377. 
  7. 7.0 7.1 7.2 "A Best Possible Heuristic for the k-Center Problem". Mathematics of Operations Research 10 (2): 180–184. May 1985. doi:10.1287/moor.10.2.180. ISSN 0364-765X. 
  8. 8.0 8.1 Shmoys, David B. (1995). "Computing near-optimal solutions to combinatorial optimization problems". Combinatorial Optimization. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. 20. pp. 355––397. doi:10.1090/dimacs/020/07. ISBN 9780821802397. 
  9. 9.0 9.1 9.2 Garcia-Diaz, Jesus; Sanchez-Hernandez, Jairo; Menchaca-Mendez, Ricardo; Menchaca-Mendez, Rolando (2017-07-01). "When a worse approximation factor gives better performance: a 3-approximation algorithm for the vertex k-center problem". Journal of Heuristics 23 (5): 349–366. doi:10.1007/s10732-017-9345-x. ISSN 1381-1231. 
  10. Feldmann, Andreas Emil (2019-03-01). "Fixed-Parameter Approximations for k-Center Problems in Low Highway Dimension Graphs" (in en). Algorithmica 81 (3): 1031–1052. doi:10.1007/s00453-018-0455-0. ISSN 1432-0541. https://doi.org/10.1007/s00453-018-0455-0. 
  11. Feder, Tomás; Greene, Daniel (1988-01-01). "Optimal algorithms for approximate clustering". Proceedings of the twentieth annual ACM symposium on Theory of computing - STOC '88. STOC '88. New York, NY, USA: Association for Computing Machinery. pp. 434–444. doi:10.1145/62212.62255. ISBN 978-0-89791-264-8. https://doi.org/10.1145/62212.62255. 
  12. Feldmann, Andreas Emil; Marx, Dániel (2020-07-01). "The Parameterized Hardness of the k-Center Problem in Transportation Networks" (in en). Algorithmica 82 (7): 1989–2005. doi:10.1007/s00453-020-00683-w. ISSN 1432-0541. https://doi.org/10.1007/s00453-020-00683-w. 
  13. Feldmann, Andreas Emil; Vu, Tung Anh (2022). "Generalized $$k$$-Center: Distinguishing Doubling and Highway Dimension". in Bekos, Michael A.; Kaufmann, Michael (in en). Graph-Theoretic Concepts in Computer Science. Lecture Notes in Computer Science. 13453. Cham: Springer International Publishing. pp. 215–229. doi:10.1007/978-3-031-15914-5_16. ISBN 978-3-031-15914-5. https://link.springer.com/chapter/10.1007/978-3-031-15914-5_16. 
  14. Beasley, J. E. (1990). "OR-Library: Distributing Test Problems by Electronic Mail". The Journal of the Operational Research Society 41 (11): 1069–1072. doi:10.2307/2582903. 
  15. Reinelt, Gerhard (November 1991). "TSPLIB—A Traveling Salesman Problem Library". ORSA Journal on Computing 3 (4): 376–384. doi:10.1287/ijoc.3.4.376. ISSN 0899-1499. 
  16. Rana, Rattan; Garg, Deepak (March 2009). "Heuristic Approaches for K-Center Problem". 2009 IEEE International Advance Computing Conference. IEEE. pp. 332–335. doi:10.1109/iadcc.2009.4809031. ISBN 9781424429271. 
  17. Mihelič, Jurij; Robič, Borut (2005). "Solving the k-center Problem Efficiently with a Dominating Set Algorithm". Journal of Computing and Information Technology 13 (3): 225. doi:10.2498/cit.2005.03.05. ISSN 1330-1136.